Stable Diffusion 🎨

Patrick von Platen

Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from CompVis, Stability AI and LAION. It is trained on 512x512 images from a subset of the LAION-5B database. LAION-5B is the largest, freely accessible multi-modal dataset that currently exists.

In this post, we want to show how to use Stable Diffusion with the 🧨 Diffusers library, explain how the model works and finally dive a bit deeper into how diffusers allows one to customize the image generation pipeline.

Note: It is highly recommended to have a basic understanding of how diffusion models work. If diffusion models are completely new to you, we recommend reading one of the following blog posts.

Now, let's get started by generating some images 🎨.

Before using the model, you need to accept the model license in order to download and use the weights. The license is designed to mitigate the potential harmful effects of such a powerful machine learning system. We request users to read the license entirely and carefully. Here is a summary of the main points:

- You can't use the model to deliberately produce nor share illegal or harmful outputs or content,
- We claim no rights on the outputs you generate; you are free to use them and are accountable for their use, which should not go against the provisions set in the license, and
- You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users.

First, you should install diffusers 0.3.0 to run the following code snippets:

```bash
pip install diffusers==0.3.0 transformers scipy ftfy
```

In this post we'll use model version v1-4, so you'll need to visit its card, read the license and tick the checkbox if you agree. You have to be a registered user in 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to this section of the documentation. Once you have requested access, make sure to pass your user token as:

```python
YOUR_TOKEN = "/your/huggingface/hub/token"
```

After that one-time setup is out of the way, we can proceed with Stable Diffusion inference. The Stable Diffusion model can be run in inference with just a couple of lines using the StableDiffusionPipeline pipeline. The pipeline sets up everything you need to generate images from text with a simple from_pretrained function call:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=YOUR_TOKEN)
```

If a GPU is available, let's move it to one!

```python
pipe.to("cuda")
```

Note: If you are limited by GPU memory and have less than 10GB of GPU RAM available, please make sure to load the StableDiffusionPipeline in float16 precision instead of the default float32 precision. You can do so by loading the weights from the fp16 branch and by telling diffusers to expect the weights to be in float16 precision:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=YOUR_TOKEN)
```

To run the pipeline, simply define the prompt and call pipe. The use of autocast("cuda") is necessary when using half (float16) precision, and recommended for full precision as it will make inference faster:

```python
from torch import autocast

prompt = "a photograph of an astronaut riding a horse"
with autocast("cuda"):
    image = pipe(prompt).images[0]

# you can save the image with
# image.save(f"astronaut_rides_horse.png")
```

The previous code will give you a different image every time you run it. If at some point you get a black image, it may be because the content filter built inside the model detected an NSFW result. If you believe this shouldn't be the case, try tweaking your prompt or using a different seed. In fact, the model predictions include information about whether NSFW was detected for a particular result. Let's see what they look like:

```python
result = pipe(prompt)
```

If you want deterministic output you can set a random seed and pass a generator to the pipeline. Every time you use a generator with the same seed you'll get the same image output:

```python
generator = torch.Generator("cuda").manual_seed(1024)
image = pipe(prompt, guidance_scale=7.5, generator=generator).images[0]
```

You can change the number of inference steps using the num_inference_steps argument. In general, results are better the more steps you use; however, the more steps, the longer the generation takes.
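The determinism claim above can be checked without downloading the model: a torch.Generator seeded with the same value always produces the same random stream, which is exactly why passing a seeded generator to the pipeline reproduces the same image. A minimal sketch using plain PyTorch on CPU (the seed value 1024 simply mirrors the example above):

```python
import torch

# Two generators seeded identically produce identical random streams,
# so the diffusion process they drive starts from identical latents.
g1 = torch.Generator().manual_seed(1024)
g2 = torch.Generator().manual_seed(1024)

noise_a = torch.randn(4, generator=g1)
noise_b = torch.randn(4, generator=g2)

print(torch.equal(noise_a, noise_b))  # identical noise -> identical images
```

Re-seeding a single generator with manual_seed(1024) before each call has the same effect as creating a fresh one.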

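The memory saving behind the float16 advice is easy to quantify without any model weights: each float16 element takes 2 bytes instead of float32's 4, roughly halving the memory footprint of the weights. A small illustrative sketch (the tensor size here is arbitrary, not the model's parameter count):

```python
import torch

# float32 stores 4 bytes per element, float16 only 2.
full = torch.zeros(1_000_000, dtype=torch.float32)
half = full.to(torch.float16)

bytes_full = full.element_size() * full.nelement()  # 4 bytes/element
bytes_half = half.element_size() * half.nelement()  # 2 bytes/element
print(bytes_full // bytes_half)  # 2
```

The same factor of two applies to the checkpoint loaded from the fp16 branch, which is what brings the pipeline under tighter GPU RAM budgets.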

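Since a filtered NSFW result comes back as a solid black image, a tiny PIL helper can flag such outputs so you can retry with a different prompt or seed. The function name is_black is hypothetical (not part of diffusers), and this assumes the filter returns an all-zero image:

```python
from PIL import Image

def is_black(img: Image.Image) -> bool:
    """True if every channel is uniformly zero, i.e. the image is solid black.
    Hypothetical helper, not part of the diffusers API."""
    extrema = img.convert("RGB").getextrema()  # ((minR, maxR), (minG, maxG), (minB, maxB))
    return all(lo == 0 and hi == 0 for lo, hi in extrema)

# Stand-ins for pipeline outputs, since running the real model needs a GPU:
black = Image.new("RGB", (64, 64))           # all zeros, like a filtered result
normal = Image.new("RGB", (64, 64), "red")
print(is_black(black), is_black(normal))     # True False
```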