StableDiffusion


For discussions around Stable Diffusion, a text-to-image generative AI model. Share your generated pictures, discuss the various UI and extensions, share news about releases, bring tutorials and more!

founded 1 year ago
1
 
 

So I work at a home improvement store, and one of my co-workers does some contracting work on the side. He is trying to encourage one of his neighbors to put some simple small-park kind of stuff on a plot of land he owns so that he (my co-worker) can pick up some extra business installing it.

He's seen me messing around with Stable Diffusion on some web apps at work in my downtime, and he asked whether I could take a photo of the site and use AI to insert some of these elements into it, so that he could show it to this potential client and maybe sell him on the idea that way.

"Sure," I said, thinking to myself, 'I can just use inpainting to blend this stuff into the image pretty seamlessly. Easy-peasy.'

It took me almost a full day of on-again, off-again work to get a picnic table I could live with. But I CANNOT get any model, any prompt, anything to make a swing set that I can live with. I've been pecking away at this problem for several days now, and every single attempt at a swing set has resulted in something that is mangled, twisted, or some terrible hybrid of OTHER playground equipment that my co-worker definitely doesn't want in the scene.

At this point I'm just working on it for the challenge, but I admit I'm stumped. Short of training my own LoRA, does anyone have advice on how to make a coherent swing set to bring into this image? >_< Yes, this is a silly problem to have, I admit that, but I've also learned a great deal about how Stable Diffusion 'thinks' in the last few days, so I consider it a learning experience. Still, does anyone have any ideas?

2
 
 

cross-posted from: https://lemmy.dbzer0.com/post/1352467

These have all been created with the base SDXL release plus the refiner in Automatic1111's web UI, using the sd-webui-refiner extension (https://github.com/wcde/sd-webui-refiner).

1216x832 resolution with Hires. fix at 2x, using the Siax-200k upscaler and the DPM2 a Karras sampler.

Here's the dynamic prompt I used. You need the Dynamic Prompts extension installed (https://github.com/adieyal/sd-dynamic-prompts):


{cinematic shot | establishing shot | intimate scene | sweeping grandeur},(21:9 aspect ratio)

{(ultrawide panoramic cinematic aspect ratio)}

{Blender | Photoshop|Octane render|Unreal Engine 5|8K RAW Photo} ,  

{2-4$$lone twisted tree | winding river| mountain peak| crumbling ruins| abandoned cabin|wooden fence | dramatic cliffs | stormy sea | rolling thunder | howling wind | foggy moor | charred forest | broken-down cart| towering dunes | parched canyon | bone-strewn pit | petrified woods| wrecked galleon | beast's den| majestic waterfall | calm lake | moonlit trail  | moss-covered stones | misty vale |ravaged battlefield | derelict mill}   

{cirrus clouds |stormy sky|cumulus clouds|stratus clouds|nimbostratus clouds|cumulonimbus clouds}   

{clear | atmospheric fog | mist | haze | pollution| dust |smoke |atmospheric halo| sun dogs | moon dogs | sun pillars | circumzenithal arcs|circumhorizontal arcs},

{abstracted | concept art| Hyperrealistic| stylized| fantasy| impressionistic | photo| realistic }   

(16K, 32bit color, HDR, masterpiece, ultra quality)   

{brutalist | minimalist| whimsical| retro futurist}   

{muted tones | vibrant hues}  

{warm sunset tones |cool muted blues | colors}   

{natural | warm| dramatic }  

{god rays | sun beams | crepuscular rays| antisolar rays | volumetric light | light pillars | sun pillars | moon pillars},  

{dawn | sunset| night} {clear  | overcast | fog}   

{winter | spring | summer | autumn}   

{ volumetric shadows | volumetric ambiance | aerial perspective | depth fog},       

in the style of  
{1-2$$Dylan Furst | Ash Thorp | Simon Stålenhag | Bob Ross | Ralph McQuarrie | Syd Mead | Moebius | Daarken | Felix Yoon | Gustave Doré | Arnold Böcklin | William Blake | Frank Frazetta | John Constable | J.C. Dahl}
and
{1-2$$James Gurney | Craig Mullins| Android Jones |Justin Maller | John Berkey| Roger Dean| Rodney Matthews | Chris Foss| Nicolas Roeg | Geoffrey Hayes | John Harris| Dinotopia| Jon Foster| Brom| Brian Froud | Alan Lee},
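For anyone unfamiliar with the wildcard syntax above: `{a | b}` picks one option per generation, and `{2-4$$a | b | c}` picks between two and four. A minimal Python sketch of how such a template gets expanded (the real sd-dynamic-prompts extension also handles nesting depth limits, wildcard files, and weights; this is just the core idea):

```python
import random
import re

def expand(template, rng=None):
    """Expand sd-dynamic-prompts style variant blocks, innermost first.

    Supports {a|b|c} (pick one) and {m-n$$a|b|c} (pick m..n options,
    joined with ", "). A simplified sketch, not the extension itself.
    """
    rng = rng or random.Random()
    block = re.compile(r"\{([^{}]*)\}")
    while (m := block.search(template)):
        body = m.group(1)
        lo = hi = 1
        if "$$" in body:
            spec, body = body.split("$$", 1)
            lo, hi = map(int, spec.split("-")) if "-" in spec else (int(spec),) * 2
        options = [o.strip() for o in body.split("|")]
        picked = rng.sample(options, min(rng.randint(lo, hi), len(options)))
        template = template[:m.start()] + ", ".join(picked) + template[m.end():]
    return template
```

Each call with a fresh seed yields a different combination, which is how one template produces the whole batch of varied landscapes.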

3
 
 

Tried my hand at cars. I used the Juggernaut model. The prompt uses the Regional Prompter extension and looks like this:

at night city, summer, sweaty ADDCOMM
RAW photo, Nikon Z6 II Mirrorless Camera, hyper realism, extremely detailed, 8k uhd, dslr, soft lighting, high quality, film grain ADDBASE
ADDROW
ADDCOL
Aston Martin zeekars, hotrod, LED, <lora:zeekars:.7> ADDROW
Negative prompt: JuggernautNegative-neg, transition of shapes, blurry, (((numbers on door))), duplicates, close range angle, crop, illustration, drawing, painting, sketching, render, artwork, 3d, cgi, logo, text, letterbox, 3D, render, video game, anime, cartoon, sketch, caption, subtitle, signature, watermark, username, artist name

I use SD upscale and ControlNet Tile for upscaling with 4x_NMKD-Siax_200k. zeekars is a Lora.

I hope SDXL will improve all the small details of mechanical, geometrical forms. At first glance the image looks great, but when you look closely, the panel separation, headlights, and tyres look unrealistic... We'll see tomorrow!

4
 
 

Just wanted to put this out there for anyone else in the same position; I'd spent some time banging on this to find a functioning combination and would have appreciated having success reports myself.

Running Debian Trixie, current as of July 22, 2023.

I see 512x512 speeds of about 2.2 it/s, which is significantly slower than the lower-end Nvidia card I'd used before, and about 1/8th the speed that other people have reported getting out of the same 7900 XTX on Linux, so there is probably more work for me to do. But it's definitely running on the GPU and is much faster than running on the CPU, so I know that this combination (vanilla system Python, vanilla system drivers, torch nightly in a venv) does at least work, which was something I'd been unsure of up until now.

Running on the host, no Docker containers, using a venv: Automatic1111 web UI, in-repository drivers, torch 2.1.0.dev20230715+rocm5.5 installed via pip in the venv, standard system Python 3.11 (i.e. I did not need to set up Python 3.8, as I've seen some people do). It needs the non-free-firmware apt repo component enabled; I have firmware-amd-graphics-20230515-3. ROCm 5.6 is out from AMD as of this writing, but Debian Trixie presently only has 5.5 packaged and in the repos.

I did need to install libstdc++-13-dev: with only libstdc++-12-dev installed, Automatic1111 bailed out with an error about not being able to find the limits C++ header when building some C++ code at runtime; some users had run into a similar error and resolved it by installing libstdc++-12-dev, which was a bit confusing. I have both clang and g++ installed. I am not terribly familiar with the AMD ROCm stack, but my understanding is that part of it (libamdhip64?) performs some compilation at runtime; it apparently caches the binaries it has compiled, as removing libstdc++-13-dev after a successful run did not break anything.

The user running the Automatic1111 frontend needed to be added to the render and video groups to have access to the requisite device files.

I did not need to have HSA_OVERRIDE_GFX_VERSION set.

As for options passed in COMMANDLINE_ARGS: just --medvram and --api.
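Condensed, the recipe described above looks something like this. A sketch, not a script to run blindly: package versions are the ones from this post, and the index URL follows PyTorch's nightly ROCm naming, which may have moved on by the time you read this.

```shell
# Debian Trixie, in-repository drivers; needs the non-free-firmware
# apt component enabled for the GPU firmware package.
sudo apt install firmware-amd-graphics libstdc++-13-dev

# The user running the web UI needs access to the GPU device files.
sudo usermod -aG render,video "$USER"   # log out and back in afterwards

# Vanilla system Python 3.11 in a venv, torch nightly built for ROCm 5.5.
python3 -m venv venv
. venv/bin/activate
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm5.5

# In webui-user.sh: just these flags; add optimization flags only
# once generation demonstrably runs on the GPU without them.
export COMMANDLINE_ARGS="--medvram --api"
```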

--xformers does not work with AMD cards; Stable Diffusion (or Automatic1111, unsure about responsibility in the stack) apparently just ignores it there; passing it doesn't break anything.

Some --opt-sdp options, like --opt-sdp-attention, cause dramatic slowdown; I assume they push generation onto the CPU instead of the GPU. I'd suggest that anyone trying to get a similar environment running not include optimization flags until things work without them; this complicated matters for me.

I see 2.59 it/s, so something like 20% higher performance, without --medvram being passed to COMMANDLINE_ARGS.

I have not done extensive testing to see whether any issues show up elsewhere with Stable Diffusion.

6
 
 

Hi,

For a media project, I need to create dark-fantasy-themed backgrounds.
They will be heavily inpainted to meet each background's needs.

I'm looking for models, LoRAs, styles, examples, tutorials, etc.
Anything you can think of around dark fantasy is valuable; please feel free to suggest linked or related subjects or resources.

Thanks in advance for the help.

7
 
 

I am really enjoying the aZovyaPhotoreal_v2 model for creating sci-fi, space-like images.

Generating data

--

parameters

long shot scenic professional photograph of full shot, people, one woman, auburn hair, blue eyes , pilot officers, on the command bridge of a giant space ship, overlooking a nebula, robotech, crew, detailed skin texture, (blush:0.5), (goosebumps:0.5), subsurface scattering
Negative prompt: NSFW, Cleavage, Pubic Hair, Nudity, Naked, Au naturel, Watermark, Text, censored, deformed, bad anatomy, disfigured, poorly drawn face, mutated, extra limb, ugly, poorly drawn hands, missing limb, floating limbs, disconnected limbs, disconnected head, malformed hands, long neck, mutated hands and fingers, bad hands, missing fingers, cropped, worst quality, low quality, mutation, poorly drawn, huge calf, bad hands, fused hand, missing hand, disappearing arms, disappearing thigh, disappearing calf, disappearing legs, missing fingers, fused fingers, abnormal eye proportion, Abnormal hands, abnormal legs, abnormal feet, abnormal fingers
Steps: 25, Sampler: DPM2 a Karras, CFG scale: 7, Seed: 3402089243, Face restoration: CodeFormer, Size: 768x512, Model hash: 5594efef1c, Model: aZovyaPhotoreal_v2, Denoising strength: 0.4, Hires upscale: 1.75, Hires upscaler: None, Version: v1.4.0

postprocessing

Postprocess upscale by: 3, Postprocess upscaler: Nearest, Postprocess upscaler 2: R-ESRGAN 4x+, CodeFormer visibility: 0.324, CodeFormer weight: 0.296


8
 
 



In the spirit of full disclosure, the content of this post is heavily cribbed from this post on Reddit. However, as we've seen, the Internet is not forever. It is entirely possible that a wealth of knowledge could be lost at any time due to any number of reasons. Because I have found this particular post so helpful and find myself coming back to it over and over, I thought it would be appropriate to share this method of creating an inpainting model from any custom stable diffusion model.

Inpainting models, like the name suggests, are specialized models that excel at "filling in" or replacing sections of an image. They're especially good at decoding what a section of an image should look like based on the section of image that already exists. This is useful when you're generating images and only a small section needs to be corrected, or if you're trying to add something specific to an image that exists.

So how is this done? With a model merge. Automatic1111 has an excellent model merging tool that we'll use. Let's assume that you have a custom model called my-awesome-model.ckpt that is based on stable-diffusion-1.5.

In the A1111 Checkpoint Merger interface, follow these steps:

  1. Set "Primary model (A)" to stable-diffusion-1.5-inpainting.ckpt.

  2. Set "Secondary model (B)" to my-awesome-model.ckpt.

  3. Set "Tertiary model (C)" to stable-diffusion-1.5.ckpt.

  4. Set "Multiplier (M)" to 1.

  5. Set "Interpolation Method" to Add difference.

  6. Give your model a name in the "Custom Name" field, such as my-awesome-model-inpainting.ckpt.

    • Adding "-inpainting" to the name signals to A1111 that the model is an inpainting model. This is useful for extensions such as openOutpaint. Also, it's just a good idea to properly label your models, because we know you're a degenerate who has hundreds of custom waifu models downloaded from CivitAI.
  7. Click Merge.

And bazinga! You have your own custom inpainting Stable Diffusion model. Thanks again to /u/MindInTheDigits for sharing the process.
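For the curious, the "Add difference" interpolation computes, per weight tensor, `merged = A + (B − C) × M`. With M = 1, the delta that makes the inpainting model an inpainting model (A − C) is grafted onto your custom checkpoint. A toy sketch with scalar "weights" (real checkpoints are dicts of tensors, but the arithmetic is the same):

```python
def add_difference(a, b, c, m=1.0):
    """A1111 'Add difference' merge: result = A + (B - C) * M, per key.

    A = stable-diffusion-1.5-inpainting, B = my-awesome-model,
    C = stable-diffusion-1.5 (the base B was trained from).
    """
    return {k: a[k] + (b[k] - c[k]) * m for k in a}

# Toy example with one scalar weight per "model":
inpaint = {"w": 1.0}   # base inpainting model (A)
custom  = {"w": 3.0}   # your custom model (B)
base    = {"w": 2.0}   # the base model (C)
merged  = add_difference(inpaint, custom, base)   # {"w": 2.0}
```

This also makes clear why model C matters: it subtracts out everything A and B have in common, leaving only the inpainting-specific changes to add.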

9
 
 

This is a Mastodon thread I created featuring devices and computers that never were, but could have been. I think my favorite might be the HD Laserdisc player called the MOID.

https://mastodon.jordanwages.com/system/media_attachments/files/110/686/620/475/187/344/original/95ff43d76ba41dd6.jpg

11
About to arrive (media.kbin.social)
 
 

Learning to inpaint and messing around with a steampunk LyCORIS

12
 
 

The reference family of preprocessors in ControlNet allows you to quickly transfer a style from an image in txt2img or img2img mode. I will demonstrate how to use it in txt2img.

  • Download this image. I chose the 853x1280 size. We will use it as our ControlNet reference image.

  • Set up your ControlNet tab like this.
    ControlNet setup
    If you wonder what the difference between the reference preprocessors is, you can find more info here.

  • Add a prompt close to the style of your image. Very important: set the CFG to a low value (3-4). Otherwise, your result will look very dark and overcooked. These are my settings:
    text2img settings

Try to keep the aspect ratio of the original image but do not hesitate to experiment.
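The same setup can also be driven over the web UI's API (launched with --api). The payload below is a hypothetical sketch: the field names follow my understanding of the ControlNet extension's API and may differ between versions, so treat them as assumptions rather than a reference.

```python
import base64

def reference_txt2img_payload(prompt, ref_image_path, cfg=3.5):
    """Build a hypothetical /sdapi/v1/txt2img payload that uses the
    reference_only preprocessor. Field names are assumptions based on
    the ControlNet extension's API and may vary by version."""
    with open(ref_image_path, "rb") as f:
        ref_b64 = base64.b64encode(f.read()).decode()
    return {
        "prompt": prompt,
        "cfg_scale": cfg,              # keep CFG low (3-4), as noted above
        "width": 512,
        "height": 768,                 # roughly the 853x1280 aspect ratio
        "alwayson_scripts": {
            "controlnet": {
                "args": [{
                    "module": "reference_only",  # reference needs no model file
                    "image": ref_b64,
                    "weight": 1.0,
                }]
            }
        },
    }
```

POSTing this to a running instance's /sdapi/v1/txt2img endpoint should behave like the manual setup described in the steps above.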

15
 
 

I thought I would share my experience this evening with the group here, seeing that I'm still excited as hell about getting this hodgepodge to work at all.

I have been playing with the MachineML version of Stable Diffusion on Windows for a while now (we won't go into the reasons why, but I have the 6800 XT, which is not well suited to this use case).

Automatic1111 on MachineML is dog slow, but I got some interesting results. So today I cleared out an old SSD, wired it up, and installed a clean Ubuntu. Following this guide, I managed to get ROCm running, and the difference is like chalk and cheese. Or rather, impulse and warp drive. Totally crazy!

So for you AMD Radeon users out there: there is hope. It is on Linux, but it is there.

22
 
 

Midjourney v5.2 features camera-like zoom control over framing and more realism.

23
 
 

AI Art News & Culture by the people, for the people. Click to read Stable Digest, by Stable Foundation, a Substack publication with tens of thousands of readers.

24
Trio of Gamers (media.kbin.social)
 
 

Random illustration for a story idea; the story idea may not pan out, but I was proud of how the art came out!

Steps: 40, Sampler: Euler a, CFG scale: 10, Seed: 900795974, Size: 1536x1536, Model hash: 25ba966c5d, Model: aZovyaRPGArtistTools_v3, Denoising strength: 0.3, Clip skip: 2, Token merging ratio: 0.5, Ultimate SD upscale upscaler: 4x-AnimeSharp, Ultimate SD upscale tile_width: 512, Ultimate SD upscale tile_height: 512, Ultimate SD upscale mask_blur: 8, Ultimate SD upscale padding: 32, Version: v1.3.2

25
 
 

I hope it’s the open release of SDXL - the beta on the Stable Diffusion discord is getting pretty impressive.

In any case, I’d like a tea serving drone 😁.
