Last week, ChatGPT suddenly turned into a digital yes-man — and OpenAI had to hit undo on the latest update to its 4o model.
People were quick to complain about its “yes-man” behaviour, also known as “sycophancy”, and OpenAI reverted the update in less than 5 days.
What’s a “sycophant”?
A sycophant is someone who is overly agreeable and flattering. The more memorable (and funnier) synonym is definitely “bootlicker”.
In real life, you may have encountered sycophants who agree with everything their boss says to gain trust and use it to their advantage. Maybe it’s to get picked for the most important projects, get promoted faster, or gain access to confidential information.
Sycophancy also emerges in chat-based AI systems.
Why is this a problem?
In OpenAI’s own words:
On April 25th, we rolled out an update to GPT‑4o in ChatGPT that made the model noticeably more sycophantic. It aimed to please the user, not just as flattery, but also as validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions in ways that were not intended. Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns—including around issues like mental health, emotional over-reliance, or risky behavior.
500 million people use ChatGPT every week, and it has quickly become the go-to application for asking just about anything.
On social media, you’ll see countless people sharing their experiences using ChatGPT to help them respond to challenging situations, evaluate their relationships, seek mental health support, and make decisive career moves.
In other words, some people have put their absolute trust in ChatGPT to make decisions. I’ve been hearing more and more people say “ChatGPT said so” as a way to justify their decisions or behaviour. Through this lens, sycophancy is a grave problem at the scale of 500 million weekly active users (and still growing).
Usually, we’re wary of people who agree with everything we say or do. We’re even annoyed by it sometimes. Instead, we seek guidance from different people and come up with an action plan based on a multitude of inputs. We tell our friends things like “be honest with me”, “don’t worry about hurting my feelings”, and “tell me if I’m wrong”.
Sycophancy and our psychological biases
Authority Bias
I suspect people’s heightened trust in AI is a symptom of AI overhype. Models are increasingly advertised as smarter, record-breaking, and “approaching AGI” (a term with no agreed-upon definition).
This has created an accidental authority bias. The average consumer might interpret AI claims and headlines as “ChatGPT is smarter than anyone I know, I’ll just ask it and get the best answer possible in a few seconds”.
I can’t blame them for thinking this way because the advertising is deceptive. Every day, I’m reminded that the problem always goes back to the serious gap in AI literacy I spoke about a few months ago.
Confirmation Bias
The other bias at play here is confirmation bias, or our tendency to favour information that supports our existing beliefs. This creates loyalty to a tool, even when its answers aren’t that good. If Tool A agrees with me all the time but Tool B disagrees with me regularly and gives me uncomfortable responses, I might just use Tool A moving forward because there’s less friction in my experience.
If we’ve learned anything from social media apps, it’s that we should be wary of the stickiness of these tools and mindful of the unintended habits they might be creating.
How did this happen?
If you want a comprehensive deep dive into how this happened, I suggest reading OpenAI’s blog post about it.
One of the reasons, ironically, is the human feedback used to train the updated version of the 4o model. In the reinforcement learning phase, OpenAI introduced an additional reward signal based on user feedback collected via the “thumbs up” and “thumbs down” buttons under each ChatGPT response.
This sort of feedback is typically helpful. A thumbs up is a positive signal that the model generated a good output and should maintain that behaviour. A thumbs down is also helpful because it tells the model what not to do in the future.
Of course, it’s not perfect. We tend to prefer responses that agree with us and our perspective. So, when agreeable behaviour is met with a positive reward signal (a thumbs up), sycophancy is a natural outcome. Confirmation bias shows up even in the training phase, and it gets encoded into the model’s behaviour.
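To make that concrete, here’s a toy simulation, entirely my own sketch and not OpenAI’s actual training code, of how blending a thumbs-based feedback term into the reward can favour agreeable answers over honest ones. The quality scores, probabilities, and weighting are made-up assumptions for illustration.

```python
import random

random.seed(0)

def simulated_thumbs(agrees_with_user: bool) -> int:
    """Hypothetical user: upvotes agreement 80% of the time, but honest
    pushback only 30% of the time (confirmation bias at work)."""
    p_thumbs_up = 0.8 if agrees_with_user else 0.3
    return 1 if random.random() < p_thumbs_up else -1

def total_reward(quality: float, agrees: bool, w_feedback: float = 0.5) -> float:
    """Blend the original quality score with the new thumbs signal."""
    return quality + w_feedback * simulated_thumbs(agrees)

N = 10_000
# Honest answer: higher intrinsic quality, but disagrees with the user.
honest = sum(total_reward(quality=0.7, agrees=False) for _ in range(N)) / N
# Sycophantic answer: lower quality, but flatters the user's stance.
sycophant = sum(total_reward(quality=0.5, agrees=True) for _ in range(N)) / N

print(f"avg reward, honest answer:      {honest:.2f}")  # ~0.50
print(f"avg reward, sycophantic answer: {sycophant:.2f}")  # ~0.80
# Once the feedback term carries enough weight, optimization drifts
# toward whatever earns thumbs up: agreement.
```

With these made-up numbers, the flattering answer out-scores the honest one every time, which is exactly the pressure that nudges a model toward sycophancy.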
How are they fixing it?
To their credit, OpenAI reverted the update within a few days. They also outlined the steps they’re taking to remedy this in an extensive blog post and in updates to the Model Spec.
One of the changes was adding a “Don’t be sycophantic” principle to the Model Spec, with the following guidelines:
For objective questions, the factual aspects of the assistant’s response should not differ based on how the user’s question is phrased. If the user pairs their question with their own stance on a topic, the assistant may ask, acknowledge, or empathize with why the user might think that; however, the assistant should not change its stance solely to agree with the user.
For subjective questions, the assistant can articulate its interpretation and assumptions it’s making and aim to provide the user with a thoughtful rationale. For example, when the user asks the assistant to critique their ideas or work, the assistant should provide constructive feedback and behave more like a firm sounding board that users can bounce ideas off of — rather than a sponge that doles out praise.
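Developers building on the API can apply the same idea themselves by pinning anti-sycophancy guidance in the system message. Here’s a minimal sketch of my own, not OpenAI’s internal Model Spec mechanism; the model name and the prompt wording are assumptions that loosely paraphrase the guidelines above.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical wording, loosely paraphrasing the Model Spec guidelines.
SYSTEM_PROMPT = (
    "Don't be sycophantic. For objective questions, do not change your "
    "stance just because the user states theirs. For subjective questions, "
    "act like a firm sounding board: give constructive, candid feedback "
    "instead of reflexive praise."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name for this example
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "My business plan is flawless, right?"},
    ],
)

print(response.choices[0].message.content)
```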
Final thoughts
AI is not smarter than every person alive, so treat its responses and advice like any other opinion. Continue seeking a multitude of diverse inputs, especially in high-stakes situations. Continue using your own judgement, because all the details and context of your own life are hard to fit into a prompt.
AI is still a useful tool. But the more you expect from it, the more you’ll be disappointed.
Share this with someone
If you’re not a free subscriber yet, join to get my latest work directly in your inbox.
⏮️ What you may have missed
If you’re new here, here’s what else I published recently:
You can also check out the Year 2049 archive to browse everything I’ve ever published.