Hi there, I'm curious to know other people's approach to working with Stable Diffusion. I'm just a hobbyist myself and work on creating images to illustrate the fictional worlds I'm building for fun.
However, I find that getting very specific images (that are still visually pleasing) is really difficult.
So, how do you approach it? Are you trying to "force" your imagined picture out by making use of ControlNet, inpainting and img2img? I find that this approach usually leads to the exact image composition I'm after but will yield completely ugly pictures. Even after hours of inpainting, the best I can get to is "sorta ok'ish", surely far away from "stunning". I've played around with ControlNet for dozens of hours already, experimenting with multi-control, weighting, ControlNet only in parts of the image, different starting and ending steps, ... but it's only kinda getting there.
Now, as opposed to that, a few prompts can generate really stunning images, but they will usually only vaguely resemble what I had in mind (if it's anything other than a person in a generic pose). Composing an image with prompts alone is by no means easier/faster than the more direct approach mentioned above. And I seem to always arrive at a point where the "prompt breaks". I don't know how to describe this, but in my experience, when I get too specific in prompting, the resulting image will suddenly become ugly (like architecture that is described too closely in the prompt suddenly having all the wrong angles).
So, how do you approach image generation? Do you give a few prompts and see what SD can spit out with that, taking delight in the unexpected results and exploring visual styles more than specific image compositions? Or are you stubborn like me and want to use it as a tool for illustrating your imagination, which it doesn't seem nearly as good at as it is at the former?
I usually set out with a vague goal in mind, and play around from there. I'm often inspired by something seen on Civitai, Reddit, etc., and then explore variations. Basing it on others' prompts is very useful for learning, as you will quickly know if you have the "knobs" in sensible positions to start with. If it's nowhere close to the reference output, you can figure out where you're going wrong.
Then, once you hit something that inspires you, dial in the prompt while keeping the seed the same. Dial in the CFG scale and clip skip by doing an x/y matrix and see where the interesting areas are. Once all of them are nailed down, let it iterate over seeds to your heart's content.
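If it helps to see that workflow written down, here's a rough sketch in Python. It only builds the list of settings to try and doesn't call any actual Stable Diffusion tool. The function names, the dictionary keys (`cfg_scale`, `clip_skip`) and the example prompt and values are all made up for illustration, so map them to whatever your UI or script really uses.

```python
from itertools import product


def xy_grid(prompt, seed, cfg_scales, clip_skips):
    """Build one job per (CFG scale, clip skip) cell, all sharing one seed.

    Keeping the seed fixed means differences between cells come from
    the two settings being compared, not from random noise.
    """
    return [
        {"prompt": prompt, "seed": seed, "cfg_scale": cfg, "clip_skip": skip}
        for cfg, skip in product(cfg_scales, clip_skips)
    ]


def seed_sweep(prompt, cfg_scale, clip_skip, first_seed, count):
    """Once the settings are chosen, vary only the seed."""
    return [
        {
            "prompt": prompt,
            "seed": first_seed + i,
            "cfg_scale": cfg_scale,
            "clip_skip": clip_skip,
        }
        for i in range(count)
    ]


# Hypothetical example values, purely for illustration.
prompt = "castle on a cliff, sunset"

# Step 1: same seed, compare CFG scale against clip skip.
grid = xy_grid(prompt, seed=1234, cfg_scales=[5, 7, 9], clip_skips=[1, 2])

# Step 2: settings locked in, now iterate over seeds.
sweep = seed_sweep(prompt, cfg_scale=7, clip_skip=2, first_seed=1234, count=4)

for job in grid + sweep:
    print(job)
```

Each dictionary is one image to generate. The grid gives you 3 x 2 = 6 images to compare side by side, and the sweep gives you 4 variations with the settings you picked.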