Since OpenAI released new image generation capabilities in its 4o model, the internet has been flooded with Ghibli-style images. I’ve tried it out myself and the results were surprisingly good (and sometimes mind-blowing, if I’m being honest). I had the most fun putting myself into the Severance poster.
Apple, please don’t sue me?
If you’ve been subscribed to this newsletter or following any of my accounts, you know I spend countless hours hand-drawing my visual AI explainers. Seeing the new image generation capabilities got me wondering: can anyone easily do what I do now?
When I started writing 4 years ago, drawing fun comics and visuals was my “differentiator” for talking about AI and emerging tech. I told myself: “Everyone writes articles about AI, but nobody spends hours drawing comics about it to explain it simply!”
Even when the first iterations of AI image generators came out, I wasn’t concerned. Clearly the images were inferior. They didn’t always look great. The text always looked wonky. The composition was off. It was hard to create consistent characters. It looked soulless.
But is it different now?
An experiment: Can AI replace my creative work?
I decided to challenge AI to recreate one of my explainers.
To get a realistic sense of its capabilities, I gave it as little guidance and prompting as possible to see what it would come up with on its own. This was intentional, and you’ll understand why at the end.
How does a 100% human-generated video explainer compare to a 100% AI-generated one? Let’s find out.
Recreating my illustrated explainer video with AI
I picked one of my more challenging explainers to recreate: How AI generates images (partly because I liked the idea of making AI generate images that explain how it generates images).
Using ChatGPT-4o, I started with the following prompt:
Starting Prompt:
I want you to create an illustrated 60-second video explainer on how AI generates images. The audience is non-technical and you should make it easy for anyone to understand it. Start by generating the script then come up with your best idea for illustrated visuals to show throughout the video for each part of the script. Ignore everything you know about my past videos or ideas I've shared.
As instructed, ChatGPT came up with a script and ideas for images. I didn’t prompt it again or make any adjustments. I generated the images based on the prompts it provided, and did a voiceover of the script it wrote with no changes.
This was the result:
The AI-generated version of the video took me less than 1 hour in total to make:
Research: I didn’t supply any research or facts, and relied on the model’s knowledge to explain how image generation works
Writing: The model wrote the script and ideas for visuals on its own with no input or iteration from me
Image Generation: The model generated 7 images based on the prompts it wrote and in the style it chose
If you’re interested, you can read the full conversation between me and ChatGPT.
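Side note for the curious: I ran the whole experiment by hand in the ChatGPT interface, but if you wanted to reproduce the same two-step workflow (script and visual ideas first, then the images) programmatically, a minimal sketch with the OpenAI Python SDK might look like the code below. The model names, parameters, and placeholder prompts are assumptions for illustration, not a record of what I actually ran.

```python
# A rough sketch of the same workflow via the OpenAI API instead of the chat UI.
# Model names, parameters, and prompts here are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

# Step 1: ask for a 60-second script plus one illustration idea per section.
plan = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            "Create an illustrated 60-second video explainer on how AI generates "
            "images for a non-technical audience. Write the script, then suggest "
            "one illustration prompt for each part of the script."
        ),
    }],
).choices[0].message.content
print(plan)

# Step 2: generate an image for each illustration prompt pulled from the plan.
# (These prompts are placeholders; in the real experiment I copied the ones ChatGPT wrote.)
image_prompts = [
    "A friendly robot sculptor chiseling a clear picture out of a block of static",
]
for prompt in image_prompts:
    result = client.images.generate(model="dall-e-3", prompt=prompt, size="1024x1024")
    print(result.data[0].url)
```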
Comparison
To compare, here’s my original video:
The original video I made took about 8 hours total:
Research (2 hours): I dug deep into the technical explanations of diffusion models so I could create an accurate representation and explanation of how they worked
Writing (1.5 hours): I worked through multiple versions of my script and got stuck thinking of the right metaphors to use before I landed on the sculptor one
Storyboarding (1 hour): I storyboarded and sketched out my entire video using post-its. This helps me create a coherent story that illustrates what I wrote before jumping straight into my final visuals.
Drawing (2.5 hours): I turned each post-it I drew into a final illustration using my iPad.
Editing + Recording (1 hour): I recorded my introduction and voiceover (multiple times), then stitched together my illustrations and voice to make the final video.
Reflections
I’m not going to make big claims or extrapolate AI’s impact on creative work and jobs based on one tiny experiment. There are already way too many conflicting opinions about this and I’d rather not be another source of noise, stress, and anxiety.
Instead, I’d like to share how I see it impacting my own work and creative process. Maybe it will help clarify what it means for you.
#1: The battle of productivity and creativity
The simplistic conclusion would be to say “AI took 1 hour and you took 8 hours to make the same thing, so AI is obviously better”.
But… is it the same thing? Is it better?
The time it took doesn’t matter. What matters is the final product. My goal is to make a video that people are going to watch, enjoy, remember, and learn from.
When you watch a video or movie, do you ever think about the hours it took to make it? No.
You just remember how it made you feel. It made you laugh, cry, relax, scream, cheer, or clap.
The AI narrative has centered on making all of us more productive and able to do everything faster. Coming from an Industrial Engineering background, I believe in and support the idea of optimization where it makes sense: factories, hospitals, supply chains, and public transit, to name a few.
Thankfully, I’m as right-brained as I am left-brained. When it comes to my creative work, my best ideas have come from slowness and boredom, which are easy to skip over when we have AI at our fingertips and can always get an instant answer.
The controversial advice I’m giving myself is not to be “AI-first”, but “human-first and AI-augmented” (I need to come up with a sexier way to say this so I can post it on LinkedIn and be recognized as a Top Voice!).
#2: Prompt engineering and human expertise
I’m sure many prompt engineering pros rolled their eyes at the prompt I gave ChatGPT.
“But you didn’t give it all the info it needs to give you the most optimal output!”
They’re right. I barely gave it any information on what to do exactly. I could’ve provided:
The artistic style
The facts about how AI image generation works
The story outline
The script
The ideas for each visual
Each one of these variables requires deep expertise in skills like art, writing, and visual storytelling. Someone with none of these abilities doesn’t have the language to communicate what a good output should look like. I know I could’ve improved the AI outputs, only because I’ve practiced and made hundreds of videos. I’ve drawn and redrawn hundreds of illustrations. I’ve also read countless books from Scott McCloud and Will Eisner on comics and visual storytelling. Good prompt engineering is just good ol’ human expertise.
If you give an artist and a non-artist access to an AI image generator, who do you think will create the “better” art?
Another aspect of this is the facts included in the script. I already understood how AI image generation worked, so I could easily catch any hallucinations or other incorrect statements. If it had gotten anything wrong, a non-expert would’ve missed it, and an expert would’ve had to spend extra time fixing it.
This is why I strongly believe in the increasing value of human expertise. When anyone can make anything with AI, the experts will stand out and be even more valuable.
AI is best used as a complement to human expertise, not a replacement for it. When we all have access to the same models, we become the differentiators.
#3: AI raises the floor, not the ceiling
To be completely honest, I was shocked by how good the character consistency and text rendering were within the images. It’s not perfect, but it’s leagues above what I’ve seen before.
Making good-enough images has become a lot easier and practically free, so we’ll have much higher-quality images everywhere around us. I hope this makes corporate presentations and trainings a lot less dry than the ones filled with stock images. I hope it helps overworked teachers add life to their lessons to help kids pay more attention and remember what they learn. I hope people use it for their vision boards to imagine their dream lives, for their friends and pets on special occasions, and for memes that make us laugh till we cry.
But for artists?
The ceiling for originality and creativity is still sky-high. My creative work isn’t just about making illustrations. It’s about the ideas and stories I come up with and how I bring them to life.
Crafting my own ideas and stories will still make me stand out in an ocean of AI-generated content. This is why my drawing style is extremely cartoonish and imperfect: I care more about the stories and metaphors I share than making “nice” images.
Just think of this: with all the amazing image generation capabilities, 99% of people just uploaded an image of themselves to turn into a Ghibli portrait instead of bringing more original ideas and stories to life.
#4: Learn the new tools
As a creative, it pains me to know that AI image generators are trained on billions of pieces of copyrighted and stolen content. It’s a problem I’ve been thinking about for over two years, and I even wrote about a solution I had for it.
I don’t plan on incorporating AI-generated images or videos in my own work and will continue leaning into creating my own, mostly because I love drawing and making my illustrations. I’d rather spend less time answering emails or on other things that drain my energy.
But, I still plan on experimenting and familiarizing myself with these tools to stay aware of their capabilities, weaknesses, and opportunities for the future. This will even help me gain clarity on how AI impacts my full-time job as a UX Designer where I spend many hours designing mobile apps and websites.
The reminder I always give myself: I cannot defend myself against what I don’t understand.