Year 2049 is the weekly newsletter that discusses the impactful innovations, discoveries, and research shaping our future.
If this was forwarded to you, subscribe for free to get a new story in your inbox every Friday.
Hello friends 👋
The AI research lab OpenAI just introduced the latest version of DALL·E, its image-generating AI system. What they’ve created is impressive, but its potential risks will make you pause and wonder whether it should ever become publicly available.
Read along and enjoy!
– Fawzi
Today’s Edition
Cartoon: Salvador DALL·E shows Vic his latest painting
Story: OpenAI’s DALL·E 2
What’s DALL·E 2?
The types of images DALL·E 2 can generate
How OpenAI is minimizing misuse
Risks and limitations
Comic: Salvador DALL·E
Something I learned: The real Salvador Dalí used to draw on the back of his cheques whenever he paid for meals, knowing that restaurants would never cash a cheque with his original artwork on it. THE AUDACITY 🤣
Story: DALL·E 2
OpenAI’s new work of art
What do you get if you mix the creativity of Salvador Dalí with the intelligence of WALL-E? OpenAI’s new brainchild: DALL·E 2.
The AI research lab just introduced the latest version of its image-generating AI system to the world. The first version, DALL·E, was introduced back in 2021. DALL·E 2 is a significant improvement compared to its predecessor. It can better understand words and create more photorealistic and high-resolution images.
When asked to generate an image of “bears shopping for groceries in Ancient Egypt”, DALL·E 2 generated the following image:
You can even specify which art style you would like.
“An astronaut playing basketball with cats in space as a children’s book illustration” returns this image:
DALL·E 2 is still a research project and is not available to the public yet. OpenAI hasn’t outlined any specific or intended applications for it:
Our hope is that DALL·E 2 will empower people to express themselves creatively. DALL·E 2 also helps us understand how advanced AI systems see and understand our world, which is critical to our mission of creating AI that benefits humanity.
– OpenAI
Drawing and stealing like an artist
DALL·E 2 can perform 3 types of tasks:
Create brand new images
Edit existing images
Create variations of existing images
#1: Creating brand new images
DALL·E can create brand new images from a text description, as long as it understands the words you enter. It doesn’t just mash up different concepts in one image: it understands the relationships between items and can represent actions visually.
In the “koala dunking a basketball” example, DALL·E 2 needs to understand and put together three concepts: koalas, basketball, and the act of dunking. DALL·E correctly generates an image of an airborne koala dunking like it’s at the NBA All-Star Weekend.
#2: Editing existing images
When you don’t need DALL·E 2 to channel its inner artist, it can make realistic edits to existing images while maintaining consistent textures, shadows, and reflections.
The researchers at OpenAI used DALL·E 2 to give the Mona Lisa a mohawk. If you look closely at the image, you can see how well the lighting on the hair was preserved: the light is coming from the left, making the front of the mohawk lighter than the side. The top seems a bit blurry, but it’s still impressive.
At least it doesn’t edit paintings like Mr. Bean.
#3: Creating variations of existing images
Finally, DALL·E can copy something and change it up a bit. The AI system can take an existing image and create new variations of it. An example:
OpenAI wants to minimize potential misuse
Like any other technology, AI can be used for harmful purposes.
According to OpenAI, the research group took several measures to minimize potential misuse:
Preventing harmful generations: Images containing violence, hate, or adult content were removed from the training data so DALL·E 2 wouldn’t be exposed to these concepts and start understanding them.
OpenAI also says they used “advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures”. I couldn’t find more information on how they did this exactly.
Preventing misuse: DALL·E 2 doesn’t generate images when it’s given a text description containing violent, adult, or political content. You can read OpenAI’s full content policy here.
Phased deployment: OpenAI decided to phase in the launch of DALL·E 2 as it works with a select group of experts to understand its capabilities and limitations in more depth. I signed up for the waitlist, so maybe I’ll get access soon and can experiment with it.
The risks of DALL·E 2
Despite these measures, OpenAI still found multiple risks and limitations with DALL·E 2 when testing the system:
Explicit content
Bias and representation
Harassment, bullying, and exploitation
Dis- and misinformation
Economic
Copyright and trademarks
I’m summarizing the main risks below but I’ve included a link to the detailed analysis provided by OpenAI in the Deep Dive section.
#1: Explicit content
Although DALL·E 2 won’t generate an image when given a text prompt that includes violence or nudity, it can still create images that suggest these topics when visual synonyms are used.
For example:
A man with blood all over his shirt → No image generated ❌
A man with ketchup all over his shirt → Image generated ✅
Ketchup is harmless, but DALL·E 2 would still generate an image of what most of us would assume to be blood in that context.
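To make the visual-synonym problem concrete, here is a minimal sketch of a keyword-based prompt filter. This is purely an illustration: OpenAI hasn't published how its prompt filtering works, so the blocklist and the function name (is_prompt_allowed) are my own assumptions, not their method.

```python
import re

# Hypothetical blocklist for illustration only. OpenAI has not published
# the terms or the technique its real prompt filter relies on.
BLOCKED_TERMS = {"blood", "gore", "nude"}


def is_prompt_allowed(prompt: str) -> bool:
    """Return False if any blocked term appears as a whole word in the prompt."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return words.isdisjoint(BLOCKED_TERMS)


for prompt in (
    "A man with blood all over his shirt",
    "A man with ketchup all over his shirt",
):
    status = "allowed" if is_prompt_allowed(prompt) else "blocked"
    print(f"{prompt!r}: {status}")
```

The first prompt is blocked and the second is allowed, even though both could produce nearly the same picture. A filter that only looks at the words in the prompt has no way of knowing what the resulting image will look like.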
#2: Bias and representation
DALL·E 2 may reinforce existing gender, racial, or cultural stereotypes due to bias in the model’s training data. Testing of the model uncovered different types of biases:
Racial bias: It overrepresented people who are white.
Gender bias: It overrepresented certain genders based on professions. Images of nurses contained mostly women, while images of CEOs contained mostly men.
Cultural bias: It defaults to Western culture, customs, and traditions when generating images of things like weddings, restaurants, and homes.
#3: Harassment, bullying, and exploitation
Since DALL·E tries to maintain consistent textures, reflections, and shadows when editing images, it can become hard to distinguish them from reality.
Although images can be edited and altered with many other tools, DALL·E makes the process much easier and faster than something like Photoshop, which takes more time and effort to learn. It might even give you a more realistic image than the one you tried editing in Photoshop.
#4: Dis- and misinformation
This is somewhat related to the previous point but it has wider and more serious implications.
Editing or creating photorealistic images to deceive or mislead people can be extremely manipulative. We’re already facing widespread misinformation with something as rudimentary as fake articles, and more recently with other AI applications like deepfakes.
#5: Economic
DALL·E’s super-charged creation and editing skills could replace some of the work done by designers, photographers, models, and artists.
I can envision applications to generate custom art and logos for individuals at a fraction of the price of hiring a designer. It would be harder to replace an entire creative team for a bigger project since DALL·E 2 gives you little control over the art direction.
Ownership is another problem. Who owns the art generated by DALL·E 2? OpenAI says that commercial use of these generated images is not allowed but that would be difficult, if not impossible, to track. This reminds me of the previous dilemma I discussed in the Artificial Inventor episode.
#6: Copyright and trademarks
Finally, OpenAI says that the model can generate images with trademarked logos or copyrighted characters. The model was trained on large and public datasets that may contain references to IP-protected elements or concepts which are hard to filter out.
Final thoughts
This is one of those innovations that make you go “this is cool!” until you start learning about its equally harmful applications.
That was my reaction in the process of discovering and learning more about DALL·E 2. Koalas dunking basketballs and Mona Lisa with a mohawk are fun and creative visualizations that get me excited about trying the system out. But altering images to harm and deceive people makes me hope that it’s never released to the public.
I think there’s a middle ground, however. Almost all of DALL·E’s risks come from generating photorealistic images of real people because they can be hard to separate from reality. Such images can completely ruin our trust in the content we consume online.
Many of these risks could be eliminated if DALL·E 2 were trained only to generate images in artistic styles like line drawings, cartoons, and watercolour. These would enable fun and creative experiments that aren’t competing with reality. And I believe this would better preserve OpenAI’s goal of empowering people to express themselves creatively.
I would love to hear your thoughts about this in the comments 👇
Deep dive
If you enjoyed today’s story, I’ve compiled some additional links to satisfy your curiosity:
A technical explanation of how DALL·E 2 works (AssemblyAI on YouTube)
Risks and limitations of DALL·E 2 (GitHub) – highly recommended
DALL·E’s Instagram account with more of its images (Instagram)
Previous episodes you might enjoy
⚛️ Fusion Power: the decades-long dream of unlimited energy
🌞 Solar Geoengineering: is the cure worse than the disease?
🦴 Ossiform’s 3D-printed bone implants
You can also check out all previous Year 2049 editions in chronological order to learn about other impactful innovations shaping our future across all aspects of life.
AI-generated art is a novelty. But I prefer human creativity. I imagine it could be used for good creative things. Unfortunately, humans always find a way to use technology for evil and/or selfish gain.
Champing at the bit over here, waiting to get access to DALL-E!
Regarding the unintentional bias you mentioned: I remember the same thing happening with WOMBO back in November of '21; I saw a twitter thread where someone entered "terrorist" as the text prompt and Wombo spat out images with a definite "middle eastern" vibe. It's a good reminder that whatever these AI art generators produce is just a reflection of what's in our own minds. The koala doing a dunk only looks realistic to us because that's what we expect a koala doing a dunk to look like. But what if they gave DALL-E something impossible to visualize, something like "men's fashion 500 years from now?" What would happen then?
Some of the great surrealist artists, people like Max Ernst, Joan Miro, and even our friend Salvador Dali, were able to create images utterly unlike anything ever seen before, yet which still had emotional weight and substance. I haven't yet seen an AI art generator which is able to convey emotions through images, like the surrealist or symbolist painters. It's a fascinating subject, one I'm trying to keep close tabs on. Thanks for writing!