How To Use Stable Diffusion to Generate Stunning AI Images
Discover the minimum requirements and recommended specs for running Stable Diffusion. Learn how to set up and troubleshoot this powerful AI tool.

Generative artificial intelligence (AI) is revolutionizing numerous sectors, enabling individuals to produce many kinds of content, including code, images, videos, text, and synthetic data, from simple prompts.
AI-powered tools like DALL-E 3, Midjourney, and Stable Diffusion are trained on vast amounts of data, allowing them to process and interpret human language and produce meaningful and relevant content.
Stable Diffusion is a form of generative AI customized and fine-tuned specifically for image generation. From clear text descriptions, Stable Diffusion can produce high-quality images in a variety of styles. The model is built on a latent diffusion process: it encodes the text prompt and then iteratively denoises a latent representation of an image until the result matches the description.
In this step-by-step tutorial, we’ll uncover how to use Stable Diffusion for AI art creation and photorealistic image generation.
What is Stable Diffusion?
Developed by Stability AI, Stable Diffusion is a deep learning model capable of generating new data based on its training data. Individuals can use Stable Diffusion to create images from text prompts or other images.
Stable Diffusion is open-source, meaning users can access and modify the code and use it for commercial or noncommercial purposes.
There are multiple ways of using Stable Diffusion, the easiest being its web platform. Another approach is to clone the Stable Diffusion web UI repository from GitHub and install it locally on your computer, which is our focus in this article.
How to set up Stable Diffusion
Installing Stable Diffusion on a personal computer gives you more control over the image generation process. Your data stays on your machine, increasing your privacy and your control over the hardware and software. You also aren’t dependent on a third party maintaining uptime on its own systems. Keep in mind that Stable Diffusion’s performance depends heavily on your hardware setup.
When installing Stable Diffusion, it’s important to meet specific hardware requirements, which are listed below. If your equipment doesn’t meet them, consider upgrading before you get started.
Stable Diffusion features an intuitive and user-friendly interface, especially for beginners to AI image generation. However, you’ll need some computing resources for Stable Diffusion to work well. Here are the minimum requirements and recommended specs to install and run Stable Diffusion effectively on your PC.
- CPU. A modern multi-core processor to handle the demands of AI tools like Stable Diffusion.
- Storage. At least 10GB of free space on your hard disk.
- Graphics card. A dedicated graphics card from NVIDIA or AMD.
- GPU memory. At least 4GB of GPU memory (VRAM); a quick way to check this is sketched after the list.
- SSD. An SSD is recommended for faster data processing and retrieval.
If you’re using a Mac or a Linux system, you may need to check compatibility with your CPU and GPU, especially when using Intel processors or an RTX 3060 GPU.
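If you’re unsure how much VRAM your graphics card has, and you have (or later install) a CUDA-enabled build of PyTorch, a quick check like the following can tell you. This is a convenience sketch rather than part of the Stable Diffusion setup itself.

```python
# vram_check.py - report the GPU and VRAM that PyTorch can see
# (requires a CUDA-enabled build of PyTorch)
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
    if vram_gb < 4:
        print("Warning: Stable Diffusion needs at least 4GB of VRAM to run well.")
else:
    print("No CUDA-capable GPU detected by PyTorch.")
```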
Getting started
To set up Stable Diffusion on a personal computer, you must install Python, the model’s user interface, and finally the actual Stable Diffusion model.
Note that Stable Diffusion may be easier to set up on Windows PCs than on other operating systems because GPU technologies the model relies on, such as CUDA and DirectX, are supported out of the box.
Let’s start the installation process.
1. Navigate to the official Python website to download Python 3.10.6, the version the Stable Diffusion web UI is developed against. While downloading, ensure the installer matches your operating system.
2. Once you’ve downloaded the executable file, navigate to where it’s stored and install it. Ensure you add Python to PATH by checking the Add python.exe to PATH checkbox, as shown below.
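To confirm that the interpreter on your PATH is the one you just installed, you can open a new terminal, launch python, and run a quick check like the one below. This is an optional sanity check, not part of the official setup.

```python
# check_python.py - confirm which Python interpreter is on your PATH
import sys

print("Python executable:", sys.executable)
print("Python version:", sys.version.split()[0])

# The Stable Diffusion web UI is developed against Python 3.10.x,
# so warn if a different minor version is on the PATH.
if sys.version_info[:2] != (3, 10):
    print("Warning: Python 3.10.x is recommended for the Stable Diffusion web UI.")
```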
3. With Python installed, we need Git, which we’ll use to download the Stable Diffusion web UI from GitHub. Download and install Git for your operating system.
4. Once Git is installed, we can proceed and download the Stable Diffusion web UI. Note this is not the actual Stable Diffusion model. It merely provides a layer of abstraction between the user and the underlying model—which we will install later.
Navigate to this GitHub repository to download the web UI. Click Code and then Download ZIP, as shown below.
Alternatively, you can open a terminal (or Git Bash) in your project folder and run the “git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git” command to download the stable-diffusion-webui repository.
5. If you downloaded the repository directly from your browser, you should see the stable-diffusion-webui-master.zip file in your downloads folder. Cut and paste this file into your working folder to avoid any confusion. In our case, we moved the file into the imagegenerator directory, which serves as our project folder.
6. Unzip or extract the files in your working folder.
7. With the Stable Diffusion web UI in place, we can now install the Stable Diffusion model itself. You can find multiple versions of the Stable Diffusion model on Hugging Face. It's generally best to download a more recent version, since newer models are trained on more data and can generate higher-quality images.
8. In this tutorial, we will use the stable-diffusion-v-1-4-original version of Stable Diffusion. Scroll down the model web page and click sd-v1-4.ckpt to download the model. Note this model is about 4GB in size, meaning the download process may take some time depending on your internet connection.
9. Once the model has been downloaded, navigate to its storage location and copy the file. Next, open your project folder and locate the models directory. In the models directory, look for the Stable-diffusion folder and paste the model in it.
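Before launching, you can optionally confirm that the checkpoint landed where the web UI expects it. The folder names below mirror the imagegenerator project folder used in this tutorial; adjust the path if your layout differs.

```python
# verify_model.py - confirm the checkpoint sits where the web UI expects it
from pathlib import Path

# Adjust this to wherever you extracted the web UI
# (e.g., stable-diffusion-webui or stable-diffusion-webui-master).
project_dir = Path("imagegenerator/stable-diffusion-webui")
checkpoint = project_dir / "models" / "Stable-diffusion" / "sd-v1-4.ckpt"

if checkpoint.exists():
    size_gb = checkpoint.stat().st_size / 1e9
    print(f"Found {checkpoint.name} ({size_gb:.1f} GB)")
else:
    print(f"Checkpoint not found at {checkpoint} - check the folder and file names.")
```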
10. We can now run the project by clicking on the webui-user.bat file in the root project directory, as shown below.
11. When you run webui-user.bat, a command terminal should pop up showing various dependencies being installed. Once again, this installation process may take some time depending on your computer’s resources.
12. After the dependencies are installed, a local link (by default, http://127.0.0.1:7860) will be displayed in the terminal, as shown below. Click the link to open the Stable Diffusion UI in your browser.
13. You should be directed to a browser window showing the following page.
Generating images with Stable Diffusion
Now that we’ve installed Stable Diffusion on a personal computer, let’s go ahead and generate some new images.
We need to write clear and concise text prompts to allow Stable Diffusion to produce images that match what we want.
To add prompts, simply start typing in the prompt field shown below. Once you’ve added a prompt, you can initiate the new image generation process by pressing the adjacent Generate button.
You can also add an optional negative prompt to specify which elements should not be added to the generated images.
In Stable Diffusion, a checkpoint is a saved set of model weights, such as the sd-v1-4.ckpt file we downloaded earlier. The checkpoint dropdown at the top of the UI lets you switch between models, which can noticeably change the style and quality of the images you generate.
To illustrate, we used the prompt “a beautiful sunset over the lake” to produce the following images in Stable Diffusion.
We also used the “an oasis in a desert” prompt to produce the following images.
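This tutorial focuses on the web UI, but if you prefer a scripted workflow, the same model can be driven from Python with Hugging Face’s diffusers library. The snippet below is a minimal sketch, assuming diffusers, transformers, and a CUDA-enabled PyTorch build are installed; the output file name and negative prompt are illustrative.

```python
# generate.py - minimal text-to-image sketch using the diffusers library
import torch
from diffusers import StableDiffusionPipeline

# Load Stable Diffusion v1.4 (the same weights used in this tutorial) from Hugging Face.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # half precision keeps VRAM use closer to the 4GB minimum
).to("cuda")

image = pipe(
    prompt="a beautiful sunset over the lake",
    negative_prompt="blurry, low quality",  # optional: elements to keep out of the image
    num_inference_steps=20,                 # sampling steps
).images[0]

image.save("sunset_over_lake.png")
```

Swapping the prompt for “an oasis in a desert” reproduces the second example above.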
Stable Diffusion offers several settings that we can adjust to influence how images are generated. Here are some of the more important settings, followed by a short scripted sketch showing how several of them map to code.
1. Sampling method. These are different types of algorithms that control how images are generated in Stable Diffusion. Some sampling methods include DDIM, DPM2, Euler, and DPM++ 2M SDE Exponential. Sampling methods can affect the quality of images but also tend to consume computational resources differently.
2. Sampling steps. This refers to the number of steps the Stable Diffusion model runs to generate an image from random noise. Typically, the higher the sampling steps, the better the quality of the images. Our sampling steps are currently 20. We can increase the steps to about 50 to allow the model to process complex text prompts more effectively.
3. Hires.fix. This setting relates to how the Stable Diffusion model produces high-resolution images. When you enable Hires.fix, the model will generate low-resolution images first and then slowly upscale them to high definition. This technique assists in minimizing distortions that may occur when the model attempts to produce HD images directly.
4. Refiner. As the name suggests, this setting allows Stable Diffusion to enhance the quality of generated images. For instance, the Stable Diffusion model can reduce the amount of noise in a rendered image and improve its clarity.
5. Width. This setting allows you to adjust the width of generated images. The width varies from 64 to 2048.
6. Height. Using this setting, you can adjust the image height between 64 to 2048.
7. Batch count. This is the number of batches of images Stable Diffusion generates in sequence for a single prompt. For instance, you can set the batch count to 10 to produce 10 batches, one after the other.
8. Batch size. This is the number of image variants generated in parallel within each batch. For instance, you can set the batch size to four to produce four versions of your text prompt at once, then review the variations and select the best one for your use case. However, the higher the batch size, the more VRAM the model consumes, so consider keeping it at a low value like one when using a computer with limited processing power.
9. CFG scale. The Classifier-Free Guidance scale, denoted by CFG, controls the level at which a generated image aligns with a text prompt. Typically, a higher CFG Scale results in images that adhere strictly to the prompt. On the other hand, a low CFG scale may give the model more freedom to try out different designs.
10. Seed. This is the number used to initialize the random noise an image is generated from. Reusing the same seed with the same settings reproduces the same image, while a value of -1 in the web UI picks a new random seed each time.
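To make these settings more concrete, here is how several of them might map to parameters when scripting with the diffusers library, continuing the earlier sketch; the prompt, seed, and chosen values are illustrative.

```python
# settings_demo.py - how several web UI settings map to diffusers parameters (a sketch)
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Sampling method: swap the scheduler (roughly analogous to picking a DPM++ sampler in the UI).
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Seed: a fixed generator makes runs reproducible.
generator = torch.Generator("cuda").manual_seed(42)

images = pipe(
    prompt="a beautiful sunset over the lake",
    num_inference_steps=50,   # sampling steps
    guidance_scale=7.5,       # CFG scale
    width=512,                # image width
    height=512,               # image height
    num_images_per_prompt=4,  # batch size: four variants per run
    generator=generator,
).images

for i, img in enumerate(images):
    img.save(f"sunset_{i}.png")
```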
To test Stable Diffusion's capabilities in depth, we adjusted our settings as follows.
- Batch size: 2
- Height: 720
- Width: 1024
- CFG Scale: 16
With these settings, we produced the following photorealistic image of classic cars on New York streets.
Using the same settings, we also generated the following high-quality images of anime characters in space suits exploring the galaxy.
Troubleshooting and common issues
Installing and using Stable Diffusion can feel overwhelming, especially if you’re a beginner, and errors can occur during the installation process. Here are common issues you may face while working with Stable Diffusion:
- Failure to add Python to PATH. Adding Python to PATH allows it to be executed from the command terminal. So, if you fail to perform this step during Python installation, Stable Diffusion may fail to launch.
- Incompatible versions. The Stable Diffusion web UI is developed and tested against Python 3.10. As a result, it may not run well on higher or lower Python versions.
- Antivirus software. The antivirus running on your computer can also affect the Stable Diffusion installation process. For instance, the antivirus may treat crucial files as malicious and quarantine them, causing Stable Diffusion to fail. Antivirus software can also prevent specific Stable Diffusion executable files from running unless you manually adjust its settings.
- Insufficient hardware. Running Stable Diffusion is resource-intensive. A computer needs a dedicated GPU with at least 4GB of VRAM to work well. Trying to install Stable Diffusion on unsupported hardware, such as an older CPU or a system without an SSD, may lead to poor performance.
- Out of memory. You may see out-of-memory error messages when running Stable Diffusion. When this happens, try reducing the batch count and batch size. This will reduce the strain on limited computer resources; additional memory-saving options are sketched after this list.
- Distortions. Stable Diffusion can sometimes produce images with distortions, which blur the content and affect overall quality. Enabling Hires.fix in the settings tab can help reduce the amount of distortion: the model generates a low-resolution image first and then upscales it to a higher resolution.
- Slow generation. Stable Diffusion can take time to produce images, especially when running on resource-constrained devices. Once again, consider lowering the batch size and batch count to speed up the process. Experimenting with different sampling methods can also help you find the right balance of speed and quality for your project. Alternatively, you can run Stable Diffusion in an online cloud environment like Google Colab, though you may need a paid plan for reliable GPU access.
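In the web UI, lowering the batch count and batch size is the quickest fix for memory errors. If you're scripting with diffusers instead, as in the earlier sketches, the library also offers documented memory savers such as half precision and attention slicing; a brief sketch follows.

```python
# low_memory.py - memory-saving options when scripting with diffusers (a sketch)
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # half precision roughly halves VRAM use
).to("cuda")

# Compute attention in slices instead of all at once, trading a little speed for less VRAM.
pipe.enable_attention_slicing()

image = pipe(
    "an oasis in a desert",
    num_inference_steps=20,
    num_images_per_prompt=1,    # keep the batch small on low-VRAM GPUs
).images[0]

image.save("oasis.png")
```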
Stable Diffusion has a huge community of users who can help you troubleshoot and find solutions to various issues. Consider becoming part of these communities on Discord and GitHub to learn how to use Stable Diffusion effectively.
Find Stable Diffusion artists on Upwork
Stable Diffusion is an incredibly powerful text-to-image model with diverse use cases. It can help you produce images in different styles, including anime, cartoon, and photorealism, from simple text prompts.
Stable Diffusion offers a rewarding opportunity to engage with cutting-edge technology. The installation process involves a series of steps and some possible troubleshooting, but the effort pays off: once it's running, the web UI is accessible and easy to navigate, even for beginners, making it a smooth and enjoyable way to create with the model.
Due to its advanced features, Stable Diffusion has the potential to revolutionize domains ranging from creative arts and publishing to product design and market research. Consider working with independent Stable Diffusion artists on Upwork to help you harness the power of this generative AI platform and other AI models in your workflow.
And if you’re a machine learning expert looking for work, start your search on Upwork. With different Stable Diffusion jobs and projects being posted on the talent marketplace, you can find a job that aligns with your skills and earn extra income. Get started today.
Disclaimer: Upwork is not affiliated with and does not sponsor or endorse any of the tools or services discussed in this article. These tools and services are provided only as potential options, and each reader and company should take the time needed to adequately analyze and determine the tools or services that would best fit their specific needs and situation.