The images you generate will not be perfect, even if you do everything right. Below are guidelines for creating better prompts and getting better results.

For anime-style visuals specifically, the use of the Waifu Diffusion finetune is recommended. The TLDR is that it allows you to generate images from Danbooru tags; a generic anime girl can, for example, be generated with the following prompt:

1girl bangs blue_eyes blush long_hair looking_at_viewer open_mouth smile solo

Prompt 4 (anime girl), cherrypicked result (best of 9).

Shortstacks are frequently associated with goblins, but the training data contains a lot of examples for "goblin girl" that are just plain ugly. And because diffusion models learn to create images that resemble the data, any prompt that uses "goblin girl" will produce similarly ugly-looking goblin girls. The trick is to avoid the association with goblins and to instead go with "halfling", which has much more aesthetic-looking examples in the dataset. Prompt to generate non-hideous shortstacks:

Photorealistic render of a plump halfling maiden cooking in the kitchen, pixar, disney, elf ears, pointed ears, big eyes, large cleavage, busty, cute, adorable, artists portrait, fantasy, highly detailed, digital painting, concept art, sharp focus, depth of field blur, illustration

Prompt 3 (shortstack), cherrypicked result (best of 9). Hands are deformed; this is a very common problem with images generated by convolutional neural networks. To make the resulting shortstacks look more like goblins, the prompt can be modified to, for example, specify green skin.

As for the seed: keep it the same to generate the same (or almost the same) images multiple times. There are no seeds that are inherently better than others, but if you vary your input parameters only slightly, seeds that produced good results previously will likely still produce good results.
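The role of the seed can be sketched outside any particular UI: the seed fixes the initial Gaussian noise the sampler starts from, so identical seeds with identical parameters start from identical noise. Below is a minimal NumPy sketch under assumed conventions (the 4x64x64 latent shape mirrors a 512x512 image in Stable Diffusion's latent space; the function name is made up for illustration):

```python
import numpy as np

def initial_latents(seed: int, shape=(4, 64, 64)) -> np.ndarray:
    """Draw the starting Gaussian noise for one image from a fixed seed."""
    rng = np.random.default_rng(seed)  # the seed pins down the entire noise tensor
    return rng.standard_normal(shape)

# Same seed -> identical starting noise -> the same (or almost the same) image.
a = initial_latents(1234)
b = initial_latents(1234)
assert np.array_equal(a, b)

# A different seed starts from different noise, so the image changes.
c = initial_latents(1235)
assert not np.array_equal(a, c)
```

This is also why slightly varied parameters on a known-good seed tend to stay good: the starting noise is unchanged, only the denoising trajectory shifts a little.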
The names for the input parameters when generating images differ slightly from interface to interface, but they generally do the same thing. Below is a brief explanation of what the parameters do:

- Prompt: textual description of what you want to generate.
- Negative prompt: textual description of things that you don't want in the image.
- Sampling Steps: diffusion models work by making small steps from random Gaussian noise towards an image that fits the prompt. This is how many such steps should be done. More steps means smaller, more precise steps from noise to image. Increasing this directly increases the time needed to generate images.
- Sampling method: which sampler to use. Euler a (a short for ancestral) produces great variety with a low number of steps, but it's very difficult to make small adjustments. The non-ancestral samplers all produce mostly the same images as the number of steps increases; use LMS if you're unsure.
- ddim_eta: amount of randomness when using DDIM.
- Batch count/n_iter: how often to generate a set of images.
- Batch size: how many images to generate at the same time. Increasing this value can improve performance but you also need more VRAM. Total number of images is this multiplied with batch count.
- CFG Scale (classifier-free guidance scale): how strongly the images match your prompt. Increasing this value will result in images that resemble your prompt more closely (according to the model), but it also degrades image quality after a certain point. Can be somewhat counteracted with more sampling steps.
- Width: width of individual images in pixels. To increase this value you need more VRAM. Image coherence on large scales (the model was trained on 512x512) becomes worse as the resolution increases.
- Height: same as Width but for individual image height.

Be aware that the Stable Diffusion Akashic Records collection linked below is barely curated, so take anything linked there with a grain of salt.
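The CFG Scale entry above can be made concrete. Under classifier-free guidance, the sampler predicts the noise twice per step, once with the prompt and once unconditionally, then extrapolates from the unconditional prediction towards the conditional one by the scale factor. A sketch of that mixing step in NumPy (the tiny arrays stand in for real noise predictions; this is the standard formula, not code from any particular UI):

```python
import numpy as np

def cfg_mix(noise_uncond, noise_cond, cfg_scale):
    """Classifier-free guidance: push the prediction towards the prompt.

    cfg_scale = 1 reproduces the conditional prediction exactly;
    larger values extrapolate past it, following the prompt harder.
    """
    return noise_uncond + cfg_scale * (noise_cond - noise_uncond)

uncond = np.array([0.0, 0.0])  # prediction for the empty prompt
cond = np.array([1.0, -1.0])   # prediction for the actual prompt

# At scale 1 the result is exactly the conditional prediction.
assert np.allclose(cfg_mix(uncond, cond, 1.0), cond)

# Higher scales overshoot the conditional prediction, which is why
# image quality degrades once the scale gets too large.
assert np.allclose(cfg_mix(uncond, cond, 7.5), [7.5, -7.5])
```

The overshoot at high scales is the mechanism behind the quality degradation mentioned above, and extra sampling steps give the sampler more chances to correct it.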
- AUTOMATIC1111 fork: it has a lot of cutting-edge features like negative prompts or attention via parentheses. Installation instructions that attempt to be more noob-friendly than the GitHub page can be found here. A non-graphical frontend is in early development. Despite the name, the sd-webui a.k.a. hlky fork was actually forked from AUTOMATIC1111.
- Waifu Diffusion finetune: finetune of Stable Diffusion on Danbooru tags, significantly improved visuals for anime-style images. A recipe for prompt design can be found below.
- Stable Diffusion Akashic Records: large collection of links to useful SD-related resources.