Charlie Meyer's Blog

The GPT-5 Launch Was Concerning

Bs in Blueberry

There were screenshots of a classic LLM issue floating around Bluesky after the GPT-5 launch yesterday, and I asked GPT-5 myself to confirm.

Sam Altman touted GPT-5 as a “PhD level expert in your pocket”, but this PhD doubled down on a wrong answer to the oldest trick question in the book for LLMs.

When GPT-4 launched, I (and many others) believed that GPT-5's launch would be the “AGI moment”. Cherry-picking “bs in blueberry” as a failure of the model and declaring that AGI is never coming is stupid. But it does raise doubts. And there was something else in the keynote that was quite a bit more disturbing.

Custom Colors for ChatGPT Threads

“We’re now allowing you to customize the colors of your chats, with a couple of options exclusive to our paid subscribers”

They admitted that they were, and I am not lying about this, paywalling chat colors. If you had truly just launched a “PhD level expert in your pocket”, the color of user chat bubbles is not what you would focus on. You would dedicate zero minutes of engineering effort to this task, and you would dedicate zero minutes of your most important keynote in years to this feature.

This is a feature that a company adds when it is out of ideas, and when it needs to find any way possible to squeeze paid subscribers out of its (money-losing) free user base.

Beyond this, the industry should be concerned about the rest of the presentation as well.

Cursor’s Big Problem

Cursor’s CEO Michael Truell was invited on stage to talk about GPT-5’s coding capabilities and demonstrated how GPT-5 could complete some tasks within Cursor’s IDE (nobody tested its PR to see if it was any good, by the way). Cursor is by far the most successful AI-powered company outside of big tech, OpenAI, and Anthropic. According to TechCrunch, Cursor has hit $500MM in ARR. This is impressive.

However, OpenAI dedicated the 27 minutes of the keynote immediately preceding his appearance to demonstrating coding tools available within ChatGPT. This is textbook “sherlocking”, where a platform takes features built by popular developers on that platform and folds them into its own main offering. Coding tools inside of ChatGPT represent a massive risk to Cursor, and they represent a massive risk to OpenAI’s API business, which relies on enterprise-level deals with startups like Cursor.

Conclusion

OpenAI did not blow me away with their presentation yesterday. If anything, they confirmed my skepticism about the AI industry at large. If there were sparks of AGI in the presentation, I would have been ready to get back on the hype train. Rather than show sparks of AGI, the presentation showed sparks of a company that’s starting to wander aimlessly as model progress slows. Maybe we just need to wait for GPT-6.