My First Day With OpenAI's DALL·E 2

This weekend I was lucky enough to get access to OpenAI's new DALL·E 2 system, which is able to generate realistic images from natural text inputs.

Rather than just share a bunch of random images, in this blog post, I'm going to follow my first day of having access, and show what I asked the AI to generate throughout the day. It was a busy day and I had a lot to do, so it was the perfect time to wait for natural inspiration, and the mobile web app works really well, so it was easy to punch in prompts at will. Also, keep in mind that this is a pretty simplistic look at the system - each image (or set of images) is generated by a single prompt, and I didn't get into any inpainting, image editing, or composing multiple prompts in an image. That's an investigation for another weekend.

First thing in the morning, while I was feeding our dog Wrangler, I started with something simple: a piece of digital art showing something silly and unique enough that it would put the AI to the test.

"A german shepherd holding a stop sign in a high-visibility vest acting as a crossing guard for a school in a big city, digital art"

Three digital drawings of a German shepherd dog, sitting in a crosswalk, holding a stop sign in various poses

This was one of the longest prompts I have attempted. I kept tuning my language to get exactly what I wanted. My initial prompt, "A german shepherd acting as a crossing guard", didn't provide anything as interesting as the images above. The rest of the prompts in this post are pretty short, but the system can take in whole paragraphs, which would suit a use case like auto-illustrating an entire book.

Next, I experimented a bit with different styles. It's super easy to control the style of the output, and you can use natural language to request art genres and styles, or even specific artists. I wanted to choose a simpler prompt, so off the top of my head, I went with "angry fruit".

"Angry Fruit" (left); "Angry Fruit, Pencil Drawing" (middle); "Angry Fruit, Cubism" (right)
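
The pattern in the three prompts above is just a subject followed by an optional style, separated by a comma. Here is a minimal sketch of that pattern in Python. It only builds the text you would type into the web app and does not call DALL·E in any way. The `build_prompt` helper is a made-up name for illustration, not part of any OpenAI tool.

```python
def build_prompt(subject, style=None):
    """Join a subject and an optional style modifier with a comma.

    Hypothetical helper: mirrors the "Angry Fruit, Cubism" pattern from
    the prompts above. It produces prompt text only.
    """
    subject = subject.strip()
    if style is None or not style.strip():
        # No style requested: the prompt is just the subject.
        return subject
    return f"{subject}, {style.strip()}"


subject = "Angry Fruit"
styles = [None, "Pencil Drawing", "Cubism"]

# One prompt per style, in the same order as the images above.
prompts = [build_prompt(subject, style) for style in styles]

for prompt in prompts:
    print(prompt)
```

Swapping the entries in `styles` for other genres or artist names gives the same kind of side-by-side comparison with any subject.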

After I played around a bit, we left the house to grab some beer from a local brewery and shop at a pop-up outdoor market. I took inspiration from some of the art at the market and wanted to see if DALL·E could give a close approximation of what we saw at some of the booths. We still made some purchases from local vendors, but it was a fun experiment to see if I could steer the platform into giving me something similar to what I saw.

"A pug dressed as an astronaut on a skateboard, digital art"

Three images of a pug in an astronaut suit and helmet, riding on a skateboard, in various scenes with stars and planets in the background

I enjoy the small misunderstanding in that the third output has the skateboard upside-down, which I suppose in space would be totally acceptable.

Next up was brunch at an Asian-fusion cafe, which included a lot of sushi, among other things. That got me thinking, and I came up with a pretty on-theme prompt.

"Sushi rolls that are scared to be eaten, digital art"

These cracked me up. I love how DALL·E really nailed the expressions while keeping it obvious that these are sushi rolls.

Next up on the itinerary was doing some shopping, and while there I started playing with some ideas for unique scenarios to generate images from. Initially, I came up with the idea of someone shopping for clothes amidst a forest fire because that would obviously be pretty ridiculous, but I didn't get the prompt quite right the first time. Instead, I found myself with a lovely fall day in the forest.

"Shopping for clothes in a fiery forest"

I tuned the prompt a little, and got closer to what I was aiming for:

"Shopping for clothes in a fire; digital art"

Three photos of racks of clothes surrounded by flames

You'll note that I frequently append "digital art" to my prompts. This is a good way to get the system to generate an image that's very close to the prompt, rather than trying to compose a photorealistic output (which it can totally do - more on that later). For certain prompts, especially the more outlandish ones, this provides improved results.

Finally home, I wanted to play around more with different genres of art, and also start getting into the works of specific artists and seeing how they could be replicated. I think this is what blew me away the most, given how perfect some of the results were. I was particularly impressed by the results for Salvador Dalí and the Ukiyo-e style. For these next comparisons, all three images in a set use the same subject prompt, followed by either a genre of art from somewhere in the world or a particular artist.

San Francisco Golden Gate Bridge - Ukiyo-e (left); Claude Monet (middle); Salvador Dalí (right)

Three images of the Golden Gate Bridge from varying views, in their respective art styles

Space Needle - Futurism (left); Vincent van Gogh (middle); Rembrandt (right)

Three photos of the Seattle Space Needle from varying perspectives, all in different art styles

Lisbon Tram - Impressionism (left); Leonardo da Vinci (middle); Banksy (right)

Three photos of the iconic yellow trams from the streets of Lisbon, in varying art styles

Toying with these different styles and artists was a lot of fun, and gives a window into what's really possible here. It's also fun to try to infer what DALL·E has been trained on. I was particularly pleased to see that the style of Leonardo da Vinci seems to have been drawn more from his engineering schematics than from, say, the Mona Lisa.

3D Renderings

I didn't play a lot with 3D renderings, but DALL·E is pretty good at these too! Here are 3 fun ones I came up with. See if you can guess the prompts, and I'll include them below.

From left to right: a rendering of a large yellow car with a snake driving it wearing a top hat, a rendering of an airplane flying with nonsensical math symbols in the sky behind it, and a clownfish with stars in the background looking down at the surface of a planet

From left to right: "A 3D rendering of a friendly snake driving Jay Gatsby's car", "A 3D rendering of an airplane skywriting math equations", "A 3D rendering of a clown fish in space"

Photorealism

I also wanted to try out some photorealistic renderings. These weren't as fun to play with because I really like the results from digital art, but they were impressive nonetheless.

From left to right: a small aircraft-shaped pile of vegetables, composed of broccoli, cauliflower, and tomatoes; a photo of 3 people and a dog walking on the surface of the moon with the earth in the background; and a photo of a small octopus near the wall of a swimming pool

From left to right: "A photo of an airplane made of vegetables on a runway at an airport", "A photo of a family walking their dog on the surface of the moon with earth in the background", "A photo of a pool with an octopus swimming in it"

My Favorite Results

My favorite results so far were generated just before I published this, so I had to include them here at the end. I tried requesting images "in the style of an architectural blueprint" and they are striking.

From left to right: the Empire State Building with markings and lines resembling a blueprint, the Golden Gate Bridge with similar blueprint-like markings, and a large twin-engine aircraft with nonsensical engineering schematics around it

From left to right: the Empire State Building, the Golden Gate Bridge, and a Boeing 737

That wraps up my first day playing around with DALL·E 2! I'm going to have a lot more fun with this as time goes on, and you'll probably see more and more images generated using the system show up here on my blog. I've also been sharing some of my favorites over on my Twitter, so head over there to see some more examples, or send me suggestions for anything you want to see!