Local Fun with Stable Diffusion: Revisited

Give local installs a chance.

In my last post on Stable Diffusion, I was left a bit underwhelmed by the results of my art prompts. One thing I kept wondering was why there are two different checkpoint files to download on the HuggingFace repo. Did the smaller one just not have enough fidelity to produce great images?

Link to HuggingFace.com repo

CompVis/stable-diffusion-v-1-4-original · Hugging Face

I was wondering what sd-v1-4-full-ema.ckpt means, so I once again set up the Mac M1 repo and sorted out all the Python things with the help of the Replicate tutorial.
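One way to peek at what "full ema" means is to count the parameter names stored in each checkpoint. As far as I understand the model card, the smaller file keeps only the EMA (exponential moving average) weights, while the 'full ema' file keeps both the EMA and the non-EMA weights. The sketch below is only a rough check under a few assumptions: that the weights sit under a "state_dict" key, and that the EMA copies use a "model_ema." prefix. The function names are my own, not part of the repo.

```python
from collections import Counter


def summarize_keys(keys):
    """Count parameter names in two groups: EMA copies and everything else."""
    counts = Counter()
    for key in keys:
        # Assumption: EMA copies are stored under a "model_ema." prefix.
        group = "ema" if key.startswith("model_ema.") else "non_ema"
        counts[group] += 1
    return dict(counts)


def load_state_dict_keys(ckpt_path):
    """Return the parameter names stored in a .ckpt file (needs PyTorch)."""
    import torch  # imported lazily so summarize_keys works without PyTorch

    # Note: newer PyTorch releases may need weights_only=False for this
    # pickle-based checkpoint format.
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Assumption: the weights sit under "state_dict"; otherwise use the top level.
    state_dict = checkpoint.get("state_dict", checkpoint)
    return list(state_dict.keys())


if __name__ == "__main__":
    import sys

    # Usage: python inspect_ckpt.py sd-v1-4.ckpt sd-v1-4-full-ema.ckpt
    for path in sys.argv[1:]:
        print(path, summarize_keys(load_state_dict_keys(path)))
```

Loading a checkpoint this way pulls the whole file into memory, so expect it to take a while on the bigger file.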

Instead of downloading sd-v1-4.ckpt and renaming it model.ckpt, I grabbed the 'full ema' version, which is a whopping 7.72GB! The main checkpoint is 4.28GB, which may explain the results below.
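Rather than renaming a multi-gigabyte file every time I want to swap checkpoints, a symlink does the job. Here is a minimal sketch, assuming the scripts load weights from models/ldm/stable-diffusion-v1/model.ckpt inside the repo (that path is my assumption about the default, and link_checkpoint and size_gb are hypothetical helper names).

```python
from pathlib import Path


def link_checkpoint(downloaded, repo_root):
    """Symlink a downloaded checkpoint to the assumed default model path."""
    downloaded = Path(downloaded).resolve()
    # Assumption: this is where the txt2img script looks for weights by default.
    target = Path(repo_root) / "models" / "ldm" / "stable-diffusion-v1" / "model.ckpt"
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.is_symlink():
        target.unlink()  # swap out a previously linked checkpoint
    elif target.exists():
        # Never delete a real multi-gigabyte download by accident.
        raise FileExistsError(f"{target} is a regular file; move it first")
    target.symlink_to(downloaded)
    return target


def size_gb(path):
    """File size in decimal gigabytes (follows symlinks)."""
    return Path(path).stat().st_size / 1e9
```

Calling link_checkpoint("sd-v1-4-full-ema.ckpt", ".") from the repo root points model.ckpt at the big file, and calling it again with sd-v1-4.ckpt switches back without touching either download.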

The results were a LOT better as you can see below.

Most of them really are a better starting point for future iteration. The Porsche 911 on a beach image even has me second-guessing whether it is just a picture taken off the internet somewhere.

Red Space Apple

python scripts/txt2img.py \
  --prompt "a red juicy apple floating in outer space, like a planet" \
  --n_samples 1 --n_iter 1 --plms

Business Cat

python scripts/txt2img.py \
  --prompt "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing" \
  --n_samples 1 --n_iter 1 --plms

Bender

python scripts/txt2img.py \
  --prompt "bender from futurama fishing in the woods with a cigar" \
  --n_samples 1 --n_iter 1 --plms

Porsche Cayman in the rain

python scripts/txt2img.py \
  --prompt "porsche cayman electric concept car driving on a mountain road in the rain futuristic, ultra high detail, cinematic, unreal engine 5, octane render" \
  --n_samples 1 --n_iter 1 --plms

Porsche 911 on a beach

python scripts/txt2img.py \
  --prompt "porsche 911 parked on a beach in the style of a 70s science fiction novel cover" \
  --n_samples 1 --n_iter 1 --plms

Muppets King

python scripts/txt2img.py \
  --prompt "king of the muppets ruling the universe" \
  --n_samples 1 --n_iter 1 --plms

Fire Dragon

python scripts/txt2img.py \
  --prompt "fire dragon breathing fire, black volcano background, fire particles, wallpaper" \
  --n_samples 1 --n_iter 1 --plms

Deafmice Avatar

python scripts/txt2img.py \
  --prompt "A deaf mouse anthropomorphic engineer in the orbit, jet pack, ink+concept art+line art, manga cover art + dragon ball style, style of Doraemon, by Toriyama Akira" \
  --n_samples 1 --n_iter 1 --plms
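Since every command above differs only in its prompt, the whole set can be driven from one small script. This is just a sketch: it reuses the same flags as the commands above, the optional --seed flag is an assumption about what this copy of txt2img.py accepts, and build_command and run_all are hypothetical names. It only prints the commands unless you pass --run.

```python
import shlex
import subprocess
import sys

# A few of the prompts from this post; add the rest the same way.
PROMPTS = {
    "Red Space Apple": "a red juicy apple floating in outer space, like a planet",
    "Bender": "bender from futurama fishing in the woods with a cigar",
    "Muppets King": "king of the muppets ruling the universe",
}


def build_command(prompt, seed=None):
    """Build the txt2img command line used throughout this post."""
    cmd = [
        sys.executable, "scripts/txt2img.py",
        "--prompt", prompt,
        "--n_samples", "1", "--n_iter", "1", "--plms",
    ]
    if seed is not None:
        # Assumption: the script accepts --seed for repeatable results.
        cmd += ["--seed", str(seed)]
    return cmd


def run_all(prompts, dry_run=True):
    """Print each command, and run it unless dry_run is set."""
    for title, prompt in prompts.items():
        cmd = build_command(prompt)
        print(f"# {title}")
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(PROMPTS, dry_run="--run" not in sys.argv)
```

Passing the prompt as a list item instead of a shell string means quotes and commas in a prompt need no extra escaping.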

Final Thoughts

I am really happy I took the time to reload Stable Diffusion and use the bigger checkpoint file, as it gave me drastically better results. Now, to learn the art of art prompting.