
Dec 12, 2024

Display AI-Generated Images in a Jupyter Notebook

Teri Eyenike

AI tools such as OpenAI’s image models can turn a written idea into a generated image in seconds. With the right setup, you can generate images from text and store them in the cloud using Cloudinary, a digital media management tool.

OpenAI’s Images API is what makes the generation step possible. The API provides ways to generate original images from scratch, edit an existing image based on a text prompt and create variations of an image. The model, DALL-E, is a neural network trained to create images from text descriptions. (Fun fact: The name DALL-E combines the names of the artist Salvador Dalí and the robot WALL-E from the Pixar movie of the same name.)

From content creation to marketing, advertising and design, there are lots of commercial and personal use cases for working with generated images. By using the OpenAI API, developers can create helpful text-to-image applications for users with the image generation endpoint.

In this guide, I’ll walk through building an image-generation app that takes a prompt from the user and displays the resulting image in a Jupyter Notebook.

What Is Jupyter Notebook?

Jupyter Notebook is a top choice for Python users working in fields like machine learning, data science and data visualization. It’s a web tool where you can create and share files with live Python code, equations, visuals and text. These files, called notebooks, mix Python code with rich text elements like paragraphs, pictures and tables.

What You’ll Need:

Before you start, complete the following setup:

Install Python on your machine
Sign up for a free Cloudinary account
Register for an OpenAI account and create an API key
Install Jupyter using the Python package manager pip

Setting Up the Project

For this project, create a folder called openai_proj and install these libraries.

pip3 install openai python-dotenv cloudinary ipython jupyter requests

Next, store your secret keys in an environment variable file.

Setting Up Environment Variables

Create a new file in your project directory called .env and add your OpenAI API key and Cloudinary secrets as follows:

.env
OPENAI_API_KEY=your_openai_api_key
CLOUDINARY_CLOUD_NAME=your_cloudinary_cloud_name
CLOUDINARY_API_KEY=your_cloudinary_api_key
CLOUDINARY_API_SECRET=your_cloudinary_api_secret

To access your credential values, go to your OpenAI and Cloudinary dashboards.
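To make the file format concrete, here is a simplified sketch of how KEY=value lines can be read into a dictionary. The parse_env_text helper is hypothetical and only illustrates the idea; the python-dotenv library used in this guide handles quoting, comments and other edge cases far more thoroughly.

```python
def parse_env_text(text):
    """Parse simple KEY=value lines (illustration only, not python-dotenv's parser)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything without an equals sign
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
# secrets for the notebook
OPENAI_API_KEY=your_openai_api_key
CLOUDINARY_CLOUD_NAME=your_cloudinary_cloud_name
"""
parsed = parse_env_text(sample)
print(sorted(parsed))
```

Each line holds one variable, with no spaces needed around the equals sign.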

Creating the App

In a terminal opened in your project directory, run the command jupyter notebook to start the development environment at http://localhost:8888.

Once in the environment, create a new notebook called dalle by clicking the New menu dropdown button.

OpenAI API Initialization

This script will securely load the API key from the .env file.

import os
from dotenv import load_dotenv
load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")

The os.getenv function reads the value of the OPENAI_API_KEY secret so that the application can use it.
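Because os.getenv returns None when a variable is not set, it can help to check for missing values before calling any API. The following is a minimal sketch; the missing_env_vars helper and the REQUIRED_VARS list are illustrative names, not part of any library.

```python
import os

# The four variables this guide stores in the .env file
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "CLOUDINARY_CLOUD_NAME",
    "CLOUDINARY_API_KEY",
    "CLOUDINARY_API_SECRET",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Example with a stand-in mapping instead of the real environment
example_env = {"OPENAI_API_KEY": "placeholder", "CLOUDINARY_CLOUD_NAME": ""}
print(missing_env_vars(REQUIRED_VARS, example_env))
```

In the notebook, calling missing_env_vars(REQUIRED_VARS) after load_dotenv() should return an empty list if the .env file is complete.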

Next, let’s initialize an instance of the OpenAI client by importing the OpenAI class from the openai module.

from openai import OpenAI
# Pass the key loaded from .env explicitly
client = OpenAI(api_key=api_key)

The OpenAI API is not free. Check the pricing page to estimate the cost if you intend to use it to build your product. OpenAI has offered new users free trial credits that expire after a limited period, so check your account’s billing page to see whether any apply to you.

Cloudinary Configuration

Cloudinary is a cloud-based tool that provides an image and video API for storing, transforming, optimizing and delivering all your media assets with easy-to-use APIs, widgets or a user interface.

Let’s import the Cloudinary libraries.

import cloudinary
from cloudinary import CloudinaryImage
import cloudinary.uploader
import cloudinary.api

Set the Configuration Parameters

The configuration values are read from the Cloudinary secrets in your .env file.

config = cloudinary.config(
    cloud_name = os.getenv('CLOUDINARY_CLOUD_NAME'),
    api_key = os.getenv('CLOUDINARY_API_KEY'),
    api_secret = os.getenv('CLOUDINARY_API_SECRET'),
    secure=True
)


Generating Original Images Using DALL-E 3

To generate an image, we let the user enter a prompt through Python’s input function. If the user presses Enter without typing anything, the function falls back to a default prompt and generates an image from that instead.

from IPython.display import display, Image
import requests

def generate_image(prompt=None):
    # Fall back to a default prompt when the user enters nothing
    if not prompt:
        prompt = 'A magical forest with glowing trees.'
    try:
        # Ask DALL-E 3 for a single 1024x1024 image
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            size="1024x1024",
            quality="standard",
            n=1,
        )
        image_url = response.data[0].url

        # Upload the generated image to Cloudinary straight from its URL
        upload_response = cloudinary.uploader.upload(image_url, unique_filename=False, overwrite=True)

        # Build a Cloudinary delivery URL for the uploaded image
        srcURL = CloudinaryImage(upload_response['public_id']).build_url(width=500, height=500)

        # Download the image bytes and render them inline in the notebook
        image_response = requests.get(srcURL)
        image_response.raise_for_status()
        display(Image(image_response.content))

    except Exception as e:
        print(f"An error occurred: {e}")


user_prompt = input('Enter a prompt to create an image (press Enter to use the default): ')
generate_image(user_prompt)


The IPython.display imports render the image itself in the notebook, using the URL of the AI-generated image stored in Cloudinary, instead of printing only that URL. The requests library makes the HTTP request that downloads the image.

The generate_image function accepts an optional prompt and falls back to a default when the prompt is empty. It calls the image generation endpoint to create an original image from the text prompt and stores the result in the variable response.
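One detail worth noting: a prompt made only of spaces is not empty as far as Python is concerned, so it would be sent to the API as is. A minimal sketch of a stricter fallback, using a hypothetical resolve_prompt helper that is not part of the original code, might look like this:

```python
DEFAULT_PROMPT = "A magical forest with glowing trees."


def resolve_prompt(user_input, default=DEFAULT_PROMPT):
    """Return the trimmed user input, or the default when nothing usable was typed."""
    cleaned = (user_input or "").strip()
    return cleaned if cleaned else default


print(resolve_prompt("   "))
print(resolve_prompt("  A red bicycle  "))
```

The result of resolve_prompt could then be passed to generate_image in place of the raw input.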

The parameter n=1 instructs the model to generate one image per request.
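At the time of writing, the OpenAI API reference states that dall-e-3 supports only n=1, so getting several images means making several requests. The sketch below shows that loop in isolation; generate_many and fake_generate_one are hypothetical names, and the stand-in function returns a placeholder URL instead of calling the API.

```python
def generate_many(generate_one, prompt, count):
    """Call generate_one(prompt) count times and collect the results."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [generate_one(prompt) for _ in range(count)]


# Stand-in for a function that would call client.images.generate(..., n=1)
# and return the URL of the generated image
def fake_generate_one(prompt):
    return "https://example.invalid/generated.png"


urls = generate_many(fake_generate_one, "A magical forest", 3)
print(len(urls))
```

Keep in mind that each request is billed separately.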

The cloudinary.uploader.upload function takes the image_url of the image DALL-E generated, along with two other parameters, unique_filename and overwrite. Learn more about them in Cloudinary’s upload documentation.
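The upload function also accepts a public_id parameter that names the stored asset. One possible approach, sketched below, is to derive that name from the prompt so that repeated uploads for the same prompt overwrite one asset rather than creating new ones. The prompt_to_public_id helper and its 60-character limit are assumptions made for illustration.

```python
import re


def prompt_to_public_id(prompt, max_length=60):
    """Turn a prompt into a lowercase, hyphen-separated identifier."""
    # Replace every run of non-alphanumeric characters with a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Truncate, tidy a trailing hyphen, and fall back to a fixed name if empty
    return slug[:max_length].rstrip("-") or "generated-image"


print(prompt_to_public_id("A magical forest with glowing trees."))
```

In the notebook, the result could be passed as cloudinary.uploader.upload(image_url, public_id=prompt_to_public_id(prompt), overwrite=True).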

Finally, the srcURL variable holds the Cloudinary image URL, built with a requested width and height of 500 pixels.

For the complete source code for the project, use this gist or this notebook in Google Colab.

Conclusion

Feeling inspired already? The OpenAI API has many built-in features that allow you to expand this project.

There are many use cases, and this tutorial showcased one way to generate a custom, personalized image from words. Cloudinary adds the finishing touch by storing the image in a secure location in the cloud, so you can come back to what you created later.

About The Author: Teri Eyenike

Teri Eyenike is a software engineer and a member of the Andela talent network, a private global marketplace for digital talent. With more than five years of experience focused on creating usable web interfaces and other applications using modern technologies, Teri recently built a LinkTree replica app with Django that allows you to create, save and display your user profile with all your favorite links. In addition to software development, Teri is a technical writer with extensive frontend and backend development knowledge, and acts as a community manager on Discord.