How NOT to use Stable Diffusion


Introduction

Stable Diffusion is one of the most popular AI models for image generation. It is open source and runs on consumer hardware. It is also one of the most customizable models, with many tools and add-ons available to creators. You can fine-tune it for your specific needs or use one of the numerous fine-tuned models published online. Despite all this, it can be one of the most frustrating tools to use: the images can be blurry, inaccurate, and inconsistent. In this blog, we will look at how NOT to use Stable Diffusion.

How NOT to use Stable Diffusion

These are some things NOT to do when using Stable Diffusion:

  1. Using too general or vague prompts

  2. Using prompts that are too complex or detailed

  3. Using prompts that are offensive or harmful

  4. Using prompts that have too many subjects

  5. Using output as a final product

1. Using too general or vague prompts


Stable Diffusion is a powerful tool, but it’s not magic. It cannot guess what you mean from a vague prompt, so you need to be specific. If you just say “a cat”, it might generate a generic image that looks like a cat but is not necessarily what you want. A more specific prompt like “a tabby cat sitting on a table” will give you better results.


“a cat”

“a tabby cat sitting on a table”

As you can see in the above example, the vague prompt “a cat” was not followed properly: it generated multiple cats even though I asked for one. When I was more specific, it generated a better image.
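
One way to avoid vague prompts is to assemble them from explicit parts. The sketch below is a hypothetical helper (the function name and its fields are my own illustration, not part of Stable Diffusion or any library) that joins a subject, an optional attribute, an action, and a setting into one prompt string.

```python
def build_prompt(subject, attribute="", action="", setting=""):
    """Join the non-empty parts into a single prompt string.

    Hypothetical helper for illustration only; Stable Diffusion just
    receives the final string.
    """
    described = f"{attribute} {subject}".strip()
    parts = [f"a {described}", action, setting]
    return " ".join(p for p in parts if p)


vague = build_prompt("cat")
specific = build_prompt("cat", attribute="tabby", action="sitting", setting="on a table")
print(vague)     # a cat
print(specific)  # a tabby cat sitting on a table
```

Filling in each field forces you to decide what the subject looks like, what it is doing, and where it is, which is the detail the vague prompt leaves out.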

2. Using prompts that are too complex or detailed

If your prompt is too complex, Stable Diffusion will struggle to generate the intended image. Stable Diffusion tends to give more weight to words at the beginning of the prompt and less weight to words at the end. If your prompt is “a picture of an adorable curled up tabby cat which is sleeping soundly on a couch in front of a bright fireplace”, then filler words like “of” and “on” get unnecessary importance. A simplified, compact prompt like “cat sleeping on a couch, bright fireplace in background, adorable, curled up” will generate a better image.


“a picture of an adorable curled up tabby cat which is sleeping soundly on a couch in front of a bright fireplace”

 
“cat sleeping on a couch, bright fireplace in background, adorable, curled up”

As you can see in the above example, Stable Diffusion did not follow the complex prompt properly: the cat is not sleeping, and the fireplace is neither clearly visible nor bright. The simplified prompt, on the other hand, gave a much better result.
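
The manual simplification above can be roughly approximated in code. This is a toy sketch: the `FILLER_WORDS` set and the `compact_prompt` function are my own assumptions, not something Stable Diffusion provides, and the function only drops filler words rather than reordering the prompt the way a person would.

```python
# Assumed list of filler words; adjust it to your own prompts.
FILLER_WORDS = {"a", "an", "the", "of", "on", "in", "is", "which", "picture"}


def compact_prompt(prompt):
    """Drop filler words and return the remaining words, in order."""
    words = prompt.lower().replace(",", " ").split()
    return " ".join(w for w in words if w not in FILLER_WORDS)


verbose = ("a picture of an adorable curled up tabby cat which is sleeping "
           "soundly on a couch in front of a bright fireplace")
print(compact_prompt(verbose))
# adorable curled up tabby cat sleeping soundly couch front bright fireplace
```

The result still needs a human pass to group the remaining words into short comma-separated phrases and to put the most important ones first.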

3. Using prompts that are offensive or harmful

A prompt that can be interpreted as offensive might generate undesirable output. This is why many online services don't allow certain words in the prompt. It's important to be aware of the potential consequences of using Stable Diffusion in this way; there can be legal consequences for publishing an offensive image. Many users prefer to run Stable Diffusion locally to avoid false positives in the detection of offensive content.
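
To illustrate how such false positives can arise, here is a minimal sketch of a naive word filter. The blocklist contents and both function names are hypothetical; online services generally do not publish their filters, and they may use very different detection methods.

```python
import re

BLOCKLIST = {"kill", "gore"}  # hypothetical blocked words


def flagged_by_substring(prompt):
    """Naive check: flag the prompt if a blocked word appears anywhere."""
    text = prompt.lower()
    return any(word in text for word in BLOCKLIST)


def flagged_by_word(prompt):
    """Stricter check: flag only when a blocked word appears as a whole word."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return bool(words & BLOCKLIST)


prompt = "a skilled painter at work"
print(flagged_by_substring(prompt))  # True, because "skilled" contains "kill"
print(flagged_by_word(prompt))       # False
```

A harmless prompt is rejected by the substring check simply because one of its words happens to contain a blocked word.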

4. Using prompts that have too many subjects

If there are too many things in the prompt, Stable Diffusion won't include everything in the output image. It is not consistent about which subject will be omitted, and some things may end up in unusual places. Let’s look at an example.


“cat, couch, fireplace, flower vase, window, shelf, book, sleeping”


“cat, couch, fireplace, flower vase, book shelf, sleeping”

As you can see in the above example, the prompt with too many objects was not properly followed. There is no fireplace, and a flower vase ended up in the bookshelf. On the other hand, the prompt with fewer objects was followed better: everything in the prompt ended up in the output.

However, as you might have noticed, even though the second image has everything we asked for, it’s not realistic. The couch is right next to the fireplace and facing away from it. The couch also blocks half of the fireplace and the bookshelf door. This is not something you would see in the real world. But the intention here was to get all these things in one picture, and that was achieved.


Using more words will make the image more consistent with the real world, but some things may be omitted. A compact prompt will give you all the things, but the arrangement might not be realistic.
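
A quick pre-check can flag prompts that list many items. The sketch below counts comma-separated items; the limit of 6 is an arbitrary assumption, picked so that the two example prompts above fall on either side of it, and is not a documented Stable Diffusion threshold. The function names are hypothetical as well.

```python
MAX_ITEMS = 6  # arbitrary assumption, not a documented limit


def split_items(prompt):
    """Split a comma-separated prompt into trimmed, non-empty items."""
    return [item.strip() for item in prompt.split(",") if item.strip()]


def has_too_many_items(prompt, limit=MAX_ITEMS):
    """Return True when the prompt lists more items than the limit."""
    return len(split_items(prompt)) > limit


long_prompt = "cat, couch, fireplace, flower vase, window, shelf, book, sleeping"
short_prompt = "cat, couch, fireplace, flower vase, book shelf, sleeping"
print(len(split_items(long_prompt)), has_too_many_items(long_prompt))    # 8 True
print(len(split_items(short_prompt)), has_too_many_items(short_prompt))  # 6 False
```

The right limit depends on the model and the scene, so treat the number as a starting point for your own experiments.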

5. Using output as a final product

Although you might want to use Stable Diffusion output as a final product, it is important to be aware of the limitations of the model. Stable Diffusion outputs often have inconsistencies and imperfections. When the images are used commercially, these imperfections can disappoint the audience. The outputs need to be post-processed to fix these problems before they are ready for practical use, so you need to know how to use tools such as Photoshop or GIMP to edit the final image.

In some cases you might only need a part of the image, in which case you can cut the object out and save it as a PNG image.

Conclusion

Generating a satisfying output from Stable Diffusion is a challenging task. Hopefully this blog helps you avoid some common prompting mistakes. AI doesn’t know what a beautiful image is; you need to describe it in a beautiful way. Remember that Stable Diffusion is just a tool, and you need to know the other tools required to make an image that appeals to your audience. And don’t be afraid to experiment with Stable Diffusion to get the creations best suited to your needs.

