From a sketch to digital art with Stable Diffusion

It started as a sketch in my notebook

I've seen folks (myself included) use a digital drawing from MS Paint as the base image for img2img digital art creation.

I wanted to see if there was a workflow for converting sketches from my physical drawing pad into digital art.

Here is the workflow I have in mind, in a nutshell:

Yes, you are getting a slide in this post

The raw drawing:

I took a picture of the drawing, then edited it down and added colors in GIMP.

Screenshot from GIMP

At first I was getting some weird runtime errors:

RuntimeError: Invalid buffer size: 16.00 GB

I took a wild guess that this was caused by exporting the image at 1024x1024 pixels, since the scripts fail on the Mac M1 every time I try to go any larger than 512x512. I resized the image down to 512x512, and it started working.
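A back-of-envelope sketch of why the image size matters so much. It assumes several things about Stable Diffusion v1 that are not stated in this post: the model works on a latent 1/8 the size of the image, the largest self-attention layer runs at that full latent resolution with 8 heads, the sampler batches a conditional and an unconditional pass together, and values are 32-bit floats. Under those assumptions, the attention score tensor alone grows with the square of the pixel count:

```python
# Rough estimate of one self-attention score tensor.
# Every default below is an assumption about Stable Diffusion v1,
# not a value measured on the M1.

def attention_scores_bytes(width, height, *, downsample=8, heads=8,
                           batch=2, bytes_per_value=4):
    """Bytes needed for a (batch * heads, tokens, tokens) score matrix."""
    tokens = (width // downsample) * (height // downsample)
    return batch * heads * tokens * tokens * bytes_per_value


GIB = 2 ** 30
for side in (512, 1024):
    size = attention_scores_bytes(side, side)
    print(f"{side}x{side}: {size / GIB:.2f} GiB")
```

With these assumed numbers, a 1024x1024 input works out to 16 GiB for a single tensor, versus 1 GiB at 512x512. That lines up with the 16.00 GB in the error message, which suggests (but does not prove) that this is the allocation that failed.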

One major thing I am learning is that I will probably need to upgrade the GPU in my outdated home desktop soon if I want to keep progressing. The M1 MacBook works great, but I can see that I will be hitting its limits soon.

Here we go:

python scripts/img2img.py \
  --prompt "a portrait of a small stone obelisk in a gold circle ring, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k" \
  --init-img inputs/rune_base_image-512x512.png \
  --outdir outputs/img2img-samples/ \
  --strength 0.8 --ddim_steps 30 --n_iter 1 --n_samples 1

Output:

Holy frack it worked!
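A note on the --strength and --ddim_steps flags in the command above. This sketch assumes the script is the CompVis reference img2img.py, where the number of denoising steps actually run is computed as int(strength * ddim_steps). The function name here is made up for illustration:

```python
def img2img_denoise_steps(strength, ddim_steps):
    """Steps actually run, assuming steps = int(strength * ddim_steps)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return int(strength * ddim_steps)


# Compare a few strengths at the 30 steps used in the command above.
for strength in (0.5, 0.8, 1.0):
    steps = img2img_denoise_steps(strength, 30)
    print(f"strength={strength}: {steps} of 30 steps")
```

Under that assumption, a strength of 0.8 with 30 steps noises the input most of the way and then runs 24 denoising steps, so the sketch mainly contributes rough composition and color. A lower strength would keep the output closer to the drawing.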

Now being the curious type, I wondered if the source image was guiding the model, or was it just doing whatever it thought I wanted to see?

So I ran the txt2img script with the same prompt as a test:

python scripts/txt2img.py \
  --prompt "a portrait of a small stone obelisk in a gold circle ring, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k" \
  --n_samples 1 --n_iter 1 --plms

Output:

Hmmm, OK, but it looks kinda sus.

I decided to iterate on the first output and see if we could get some finer details.

Copy the output to the inputs directory and update the command as follows:

python scripts/img2img.py \
  --prompt "a portrait of a small stone obelisk in a gold circle ring, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k" \
  --init-img inputs/grid-0003.png \
  --outdir outputs/img2img-samples/ \
  --strength 0.8 --ddim_steps 30 --n_iter 1 --n_samples 1

Output:

This popped out, and it has me thinking I should refine the prompt into something better.
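Since each pass only swaps the --init-img (and later the prompt), the repeated command can be sketched as a small Python helper. The helper name is hypothetical; the flags and paths are the ones used in the commands in this post. It only builds and prints the command, it does not run anything:

```python
import shlex


def build_img2img_command(prompt, init_img,
                          outdir="outputs/img2img-samples/",
                          strength=0.8, ddim_steps=30):
    """Return the img2img invocation as an argument list (not executed)."""
    return [
        "python", "scripts/img2img.py",
        "--prompt", prompt,
        "--init-img", init_img,
        "--outdir", outdir,
        "--strength", str(strength),
        "--ddim_steps", str(ddim_steps),
        "--n_iter", "1",
        "--n_samples", "1",
    ]


PROMPT = (
    "a portrait of a small stone obelisk in a gold circle ring, realistic, "
    "artstation trends, concept art, highly detailed, intricate, "
    "sharp focus, digital art, 8k"
)

# First pass uses the sketch, second pass feeds the previous output back in.
for init_img in ("inputs/rune_base_image-512x512.png", "inputs/grid-0003.png"):
    cmd = build_img2img_command(PROMPT, init_img)
    print(shlex.join(cmd))
```

To actually execute a pass, the list could be handed to subprocess.run(cmd, check=True) from the repository root.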

Let's change the prompt a bit by seeing what others have used in the past.
There is a great website to look up what others have used in art prompting:

https://lexica.art

This prompt seems OK:

https://lexica.art/prompt/f1416101-3da5-453c-b426-8c633efb2c0b

runestone, nature, focused, centered, very detailed, norse, oil painting

Let's change up the command and see what happens:

python scripts/img2img.py \
  --prompt "a portrait of a small stone runestone in a gold circle ring, norse, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k" \
  --init-img inputs/grid-0003.png \
  --outdir outputs/img2img-samples/ \
  --strength 0.8 --ddim_steps 30 --n_iter 1 --n_samples 1

Looks pretty good.
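When iterating on prompts like this, it can help to see exactly what changed between two versions. A minimal sketch, assuming the prompt is treated as a plain comma-separated list of tags (that is only a bookkeeping convention here, not how the model reads the text), with hypothetical helper names:

```python
def prompt_tags(prompt):
    """Split a comma-separated prompt into stripped, non-empty tags."""
    return [tag.strip() for tag in prompt.split(",") if tag.strip()]


def prompt_diff(old, new):
    """Return (added, removed) tags, keeping their original order."""
    old_tags, new_tags = prompt_tags(old), prompt_tags(new)
    added = [tag for tag in new_tags if tag not in old_tags]
    removed = [tag for tag in old_tags if tag not in new_tags]
    return added, removed


OLD = ("a portrait of a small stone obelisk in a gold circle ring, realistic, "
       "artstation trends, concept art, highly detailed, intricate, "
       "sharp focus, digital art, 8k")
NEW = ("a portrait of a small stone runestone in a gold circle ring, norse, "
       "realistic, artstation trends, concept art, highly detailed, "
       "intricate, sharp focus, digital art, 8k")

added, removed = prompt_diff(OLD, NEW)
print("added:  ", added)
print("removed:", removed)
```

For the two prompts used so far, the only differences are the subject phrase (obelisk swapped for runestone) and the new norse tag; all the style modifiers carry over unchanged.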

Now let's take my original drawing and run it with the new prompt to see what we get:

python scripts/img2img.py \
  --prompt "a portrait of a small stone runestone in a gold circle ring, norse, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k" \
  --init-img inputs/rune_base_image-512x512.png \
  --outdir outputs/img2img-samples/ \
  --strength 0.8 --ddim_steps 30 --n_iter 1 --n_samples 1

Final Thoughts

I think I will explore this topic more later on, as it is a fascinating way to work. It still looks like the text prompt is the way to go if you can describe what you want in words. You can direct the composition a bit by making some sketches, but I am not sure the workflow is worth the extra steps.

Time will tell.

UPDATED

I ran the prompts through MidJourney and here are the results:

Just the art prompt produced this:

a portrait of a small stone runestone in a gold circle ring, norse, realistic, artstation trends, concept art, highly detailed, intricate, sharp focus, digital art, 8k

WOW!

And with the image weighted at 5, it seems to need something...
