Why you shouldn't trust AI detectors 🧐
+ OpenAI announces SearchGPT, Llama 3.1, Chelsea FC + AI, and The AI Summer
Hey friends 👋
I’ve been heads down over the past few weeks planning and working on some AI courses I want to launch over the next 6 months. I’m working on three courses:
What you need to know about AI (aiming for a September release)
Building a professional no-code website with AI
Creating custom AI assistants and automations
You can now sign up for the waitlist for early access.
Today’s topic is part of the “What you need to know about AI” course where I plan on explaining and debunking the most important AI terms, concepts, and myths.
– Fawzi
Today’s Edition
Main Story: Why you shouldn’t trust AI writing
Weekend reads: SearchGPT, Llama 3.1, Chelsea FC + AI, The AI Summer
Updates: Office Hours!
🧐 Why you can’t detect AI writing
One of the most entertaining parts of being an AI content creator is receiving emails from startups trying to build the next great AI tool.
A few months ago, I received two emails on the same day that were hilariously contradictory.
The first one claimed to have built a detection tool that could identify AI-generated text.
The second one claimed to have built a “text humanizer” that could bypass AI detection tools.
In fact, both were selling me an empty promise.
In January 2023, OpenAI announced that they had built a classifier to identify AI-generated text. They retired the tool six months later due to its low accuracy.
The classifier was doomed from the start: in their announcement blog post, OpenAI reported that it correctly identified AI-written text only 26% of the time. It also incorrectly labeled human-written text as “AI-generated” 9% of the time.
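To see just how damning those two numbers are together, here's a quick back-of-envelope calculation (my own arithmetic, not from OpenAI's post). The 20% base rate is an assumption I chose for illustration:

```python
# How trustworthy is a positive "AI-generated" flag, given OpenAI's
# reported 26% true-positive rate and 9% false-positive rate?

tpr = 0.26  # correctly flags AI-written text (OpenAI's reported figure)
fpr = 0.09  # wrongly flags human-written text (OpenAI's reported figure)

# Assumed base rate: suppose 20% of submitted essays are AI-written.
base_rate = 0.20

true_flags = tpr * base_rate          # AI essays correctly flagged
false_flags = fpr * (1 - base_rate)   # human essays wrongly flagged
precision = true_flags / (true_flags + false_flags)

print(f"P(actually AI | flagged) = {precision:.2f}")  # ≈ 0.42
```

Under that assumption, when the classifier flags an essay, it's wrong more often than it's right. And that's before anyone even tries to evade it.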
If you scour through OpenAI’s FAQs, you’ll find the following statement:
Do AI detectors work?
In short, no, not in our experience. Our research into detectors didn't show them to be reliable enough given that educators could be making judgments about students with potentially lasting consequences. While other developers have released detection tools, we cannot comment on their utility.
Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
To elaborate on our research into the shortcomings of detectors, one of our key findings was that these tools sometimes suggest that human-written content was generated by AI.
When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
There were also indications that it could disproportionately impact students who had learned or were learning English as a second language and students whose writing was particularly formulaic or concise.
Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
Source: OpenAI FAQ
You’ll also find many angry users of these “AI detection tools” that classify their original writing as AI-generated.
The biggest problem is that teachers use these tools to uphold the academic standards and integrity of their students’ work. But with accuracy rates this low, you can’t rely on them to tell you whether a student wrote their essay on their own or just asked ChatGPT to write it for them.
If these tools aren’t consistently accurate and reliable, then what are they good for?
As part of debunking this myth, I want to show you how easy it is to bypass an AI detector.
I generated this short paragraph about the history of the Olympics using ChatGPT:
I pasted it directly into an AI detector, which correctly identified it as AI-generated.
Then, I told ChatGPT to rewrite it in a more “casual and natural” way and I changed the first sentence:
From: “The Olympics have a fascinating history that dates all the way back to ancient Greece.”
To: “The Olympics date all the way back to ancient Greece.”
And now, my text is classified as human-written, albeit with a 39% probability of being AI-generated.
AI writing detection remains a difficult (and maybe impossible) problem to solve because it’s easy to modify the voice and tone of your output to sound more human with basic prompting.
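Part of why light rewriting works so well is that many detectors lean on surface statistics of the text. Here's a toy sketch of one such statistic, "burstiness" (humans tend to vary sentence length more than models do). This is purely my own illustration, not any real product's algorithm, and it shows how a single stylistic tweak flips the verdict:

```python
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths (in words) -- a crude proxy."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

def looks_ai_generated(text: str, threshold: float = 2.0) -> bool:
    # Uniform sentence lengths (low variance) => flag as AI. Purely illustrative.
    return burstiness(text) < threshold

uniform = "The Olympics began in Greece. They honored the god Zeus. Athletes came from many cities."
varied = "The Olympics began in ancient Greece. Zeus. Athletes traveled from city-states all over the Greek world to compete."

print(looks_ai_generated(uniform))  # True  -- flagged as AI
print(looks_ai_generated(varied))   # False -- one stylistic tweak evades it
```

Real detectors use fancier statistics than this, but the failure mode is the same: anything measured from the text's surface can be changed by rewriting the text's surface.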
For now, the best way to detect AI writing is to use your own judgement and your knowledge of someone’s speaking and writing style.
If you enjoyed today’s story, I plan on covering these topics in more detail in my free “What you need to know about AI” course. My main goal is to help you build your AI fluency and have a strong understanding of what AI can or can’t do.
📖 Weekend reads
🔎 OpenAI announces its search engine: SearchGPT
OpenAI has teased its next great endeavour: its very own conversational search engine.
It’s only being tested with a small group of users for now, but you can sign up for their waitlist here.
Based on the screenshots, it looks similar to Perplexity: you search via a conversational interface and get natural-language answers with engaging visuals and references to the original sources.
The other benefit is that you can ask follow-up questions while maintaining the prior context, as opposed to making multiple and disjointed searches.
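The usual way chat-style systems pull this off is by appending every turn to one running message list, so a vague follow-up can be resolved against earlier turns. Here's a minimal sketch of that pattern; the function names and message format are illustrative, not SearchGPT's actual API:

```python
# Minimal sketch: a conversational session keeps one growing message
# history, so "When were they first held?" makes sense even though it
# never mentions the Olympics. A disjoint search engine gets no such context.

history = []

def record_turn(question: str, answer: str) -> None:
    """Append one exchange; a real system would send `history` to the model."""
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": answer})

record_turn("Tell me about the ancient Olympics.",
            "They were held in Olympia, Greece, in honor of Zeus.")
record_turn("When were they first held?",
            "Traditionally dated to 776 BC.")

print(len(history))  # 4 messages accumulated across two turns
```

The second question only works because the first turn is still in the history — that accumulated context is the whole difference from firing off independent searches.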
One interesting quote from OpenAI’s post:
In addition to launching the SearchGPT prototype, we are also launching a way for publishers to manage how they appear in SearchGPT, so publishers have more choices. Importantly, SearchGPT is about search and is separate from training OpenAI’s generative AI foundation models. Sites can be surfaced in search results even if they opt out of generative AI training.
🦙 Meta launches Llama 3.1 models
Meta released Llama 3.1, their most advanced open-source AI models yet, in three sizes, the largest being the flagship 405-billion-parameter Llama 3.1 405B.
Other features include a 128K-token context length and multilingual support.
Llama 3.1 rivals top closed-source models, like GPT-4 and Claude 3.5 Sonnet, in performance and capabilities.
You can run these models for free on your laptop (just a heads up that they’re huge), and you can download them from the Meta website or by using Ollama.
🌞 The AI Summer by Benedict Evans
I’m a big fan of Benedict’s writing. In this article, he talks about AI’s much-needed reality check: AI demos and apps have sparked a lot of interest and hype, but that hasn’t translated into any transformational impact yet. This isn’t to say the hype is unjustified, but that it takes time for these technologies to pan out, become genuinely useful, and be adopted for more than casual, just-for-fun use. Reading this reinforces my belief that AI innovation and impact will come from the UX world, because we have to start from a problem-first and human-first approach, rather than an AI-first approach. It’s the main reason why I want to help non-technical people build their AI fluency and knowledge.
⚽️ Chelsea FC’s personalized AI video for their fans
As a Manchester United fan, I’m finding it hard to praise Chelsea in my own newsletter. But I have to acknowledge that they did something cool for their fans. My friend Amin shared his Chelsea home kit preview video with me, and it’s honestly a delightful use of AI for customer delight. The video features Chelsea legend Eden Hazard giving you a tour of the team’s training ground while also saying Amin’s name. It seems they’re using an AI clone of Hazard’s voice (and some good editing) to make it all look natural.
🤖 More from Year 2049
⭐️ I’m opening up weekly “office hour” slots for you to book and chat about AI, UX, or content. Book your slot here
🔮 The future is too exciting to keep to yourself
Share this post in your group chats with friends, family, and coworkers.
If a friend sent this to you, subscribe for free to receive practical insights, case studies, and resources to help you understand and embrace AI in your life and work.