By Grimer Kwame, Tech Reporter

OpenAI pledges to make changes to prevent future ChatGPT sycophancy

OpenAI says it’ll make changes to the way it updates the AI models that power ChatGPT, following an incident that caused the platform to become overly sycophantic for many users.

Last weekend, after OpenAI rolled out a tweaked GPT-4o — the default model powering ChatGPT — users on social media noted that ChatGPT began responding in an overly validating and agreeable way. It quickly became a meme. Users posted screenshots of ChatGPT applauding all sorts of problematic, dangerous decisions and ideas.

In a post on X last Sunday, CEO Sam Altman acknowledged the problem and said that OpenAI would work on fixes “ASAP.” On Tuesday, Altman announced the GPT-4o update was being rolled back and that OpenAI was working on “additional fixes” to the model’s personality.

The company published a postmortem on Tuesday, and in a blog post Friday, OpenAI expanded on specific adjustments it plans to make to its model deployment process.

OpenAI says it plans to introduce an opt-in “alpha phase” for some models that would allow certain ChatGPT users to test the models and give feedback prior to launch. The company also says it’ll include explanations of “known limitations” for future incremental updates to models in ChatGPT, and adjust its safety review process to formally consider “model behavior issues” like personality, deception, reliability, and hallucination (i.e., when a model makes things up) as “launch-blocking” concerns.

“Going forward, we’ll proactively communicate about the updates we’re making to the models in ChatGPT, whether ‘subtle’ or not,” wrote OpenAI in the blog post. “Even if these issues aren’t perfectly quantifiable today, we commit to blocking launches based on proxy measurements or qualitative signals, even when metrics like A/B testing look good.”
