
🔆 GPT-5: Hype, Rollout, & Backlash

OpenAI's GPT-5, Team Lumiera featured on popular podcast, and the FT report that made the rounds this week.

🗞️ Issue 83 // ⏱️ Read Time: 5 min

In this week's newsletter

What we’re talking about: OpenAI’s highly publicized launch of GPT-5, promised as smarter, safer, and “PhD-level”, generated excitement, but its debut came with unexpected controversy and important lessons for technical leaders.

How it’s relevant: The rollout inadvertently disrupted user routines, exposed a deep emotional bond between people and their AI tools, and highlighted the risks of prioritizing technical upgrades over continuity and user experience.

Why it matters: Abrupt technology upgrades, no matter how advanced, can disrupt workflows, erode user trust, and create unintended emotional fallout when real-world adoption and human experience are overlooked.

Hello 👋

This week, we’re unpacking the story behind GPT-5. Despite impressive technical achievements - 94.6% accuracy on advanced mathematics problems and 74.9% on coding benchmarks - the launch became a case study in how disconnected metrics can be from user experience when human factors are overlooked. While the model promised "PhD-level" capabilities, its rollout revealed critical gaps in how we manage AI transitions when users have developed trust and dependency on existing systems. For leaders and teams, it’s a lesson in how tech rollouts can go wrong when we ignore real-world usage and the human factor.

Latest Not Greatest

On paper, the August GPT-5 launch from OpenAI was positioned as a leap ahead: improved reasoning, fewer hallucinations, and advanced coding skills. But OpenAI’s decision to replace GPT-4o without warning triggered a user revolt that forced the company to restore the previous model within 24 hours.

GPT-5 ranks high on the Long Context Reasoning Benchmark, showing that which model you use matters, but how long it is allowed to think is equally important.

GPT-5 implemented a unified system architecture that consolidates multiple models behind a single interface. Rather than letting users choose between discrete models, the system employs an automatic router (autoswitcher) that selects computational approaches based on query analysis. While this approach offers potential efficiency gains, directing simple queries to faster models while reserving expensive reasoning for complex tasks, it fundamentally changed how users interact with the system.
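
To make the routing idea concrete, here is a minimal sketch of how an autoswitcher of this kind could work. It is purely illustrative: the model names, the heuristic, and the threshold are our assumptions, not OpenAI's published implementation.

```python
# Illustrative sketch of query-based model routing (an "autoswitcher").
# The model names and the complexity heuristic below are hypothetical;
# OpenAI has not published how GPT-5's router actually decides.

FAST_MODEL = "fast-chat-model"            # cheap, low-latency model (hypothetical name)
REASONING_MODEL = "deep-reasoning-model"  # expensive, slower reasoning model (hypothetical name)

REASONING_HINTS = ("prove", "step by step", "debug", "analyse", "analyze", "plan")

def route(query: str) -> str:
    """Pick a model based on a crude estimate of query complexity."""
    long_query = len(query.split()) > 150
    needs_reasoning = any(hint in query.lower() for hint in REASONING_HINTS)
    return REASONING_MODEL if (long_query or needs_reasoning) else FAST_MODEL

print(route("What's the capital of France?"))            # -> fast-chat-model
print(route("Debug this failing integration test ..."))  # -> deep-reasoning-model
```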

The GPT-5 reaction stemmed from a misreading of user psychology rather than a technical failure, and the response translated immediately into business impact. Reddit threads documenting user complaints received 4,600+ upvotes and 1,700+ comments.

The level of sycophancy, meaning how much an LLM agrees with the user, was reduced in GPT-5 in order to make the model less manipulative and more truthful.

Research reveals why these attachments formed: OpenAI had previously reduced "sycophantic behavior" to make models more truthful, but users specifically requested the return of more supportive responses, having grown used to ChatGPT being their very own Yes Man. As Altman revealed, some users had "never received genuine encouragement from anyone in their lives before", making ChatGPT's supportive tone a crucial source of emotional validation.

This dependency creates a complex ethical challenge, something we previously touched upon when we wrote about AI companions and human behaviour. While supportive AI can provide genuine mental health benefits, it also creates vulnerability when systems change without warning. The psychological impact demonstrates that responsible AI development must consider emotional well-being as seriously as technical performance.

This revealed that emotional attachment to AI models creates genuine business risk for the companies behind them. Users canceled subscriptions after losing what they described as a “trusted friend,” which highlights a shift in how organizations must evaluate AI deployments.

The Lack of Predictability and Its Business Impact

Imagine you had built your latest AI agent or enterprise product on OpenAI’s older models, leveraging 4o for creativity and o3-Pro for research. Overnight, you would have faced unanticipated workflow disruptions and had to scramble for alternatives or wait for OpenAI to revert its decision.
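
One way integrators hedge against this kind of surprise is to pin model identifiers explicitly and decide the fallback order in advance, rather than inheriting whatever the provider routes to. Below is a minimal sketch using the OpenAI Python SDK; the model names and fallback order are illustrative assumptions, not a recommendation for your stack.

```python
# Hedged sketch: pin the model your product depends on, and choose the fallback
# order yourself. Model names and ordering here are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PINNED_MODELS = ["gpt-4o", "gpt-5"]  # preferred model first, fallback second

def ask(prompt: str) -> str:
    last_error = None
    for model in PINNED_MODELS:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception as error:  # e.g. the model was deprecated or withdrawn
            last_error = error
    raise RuntimeError(f"No pinned model is available: {last_error}")
```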

Security testing also revealed deeper problems beyond personality changes. Two prominent AI security firms, SPLX and NeuralTrust, found GPT-5 “nearly unusable for enterprise out of the box.” Their results showed that GPT-4o remains the most robust model under red teaming.

The technical issues compounded user frustration. GPT-5's broken "router" system randomly switched between model capabilities mid-conversation, making business workflows unpredictable. Users reported that the system sometimes delivered cutting-edge AI responses and sometimes inferior results within the same conversation thread, undermining reliability for professional use cases.

The Lumiera Question of the Week

If you are building AI products or implementing AI in your organization, what can you as a leader do better after learning about this launch?

Higher User Agency = Better Transitions

Perhaps most concerning from a responsible AI perspective was the lack of user agency in the transition. OpenAI's automatic routing system made critical decisions about which model to use without user awareness or control, creating an opaque experience where users couldn't understand or predict their AI interactions:

  • Users couldn't predict which computational engine would handle their queries

  • No mechanism existed to understand why responses varied in quality or style

  • The system learned from user behavior without explicit consent or explanation
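
Contrast that with a design where routing is explicit and auditable. The sketch below is our own illustration, not any vendor's API: the user can force a specific model, and every answer carries a record of which model produced it and why.

```python
# Hypothetical sketch of routing that preserves user agency and transparency:
# an explicit override plus a surfaced explanation of each routing decision.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RoutedAnswer:
    text: str
    model: str
    reason: str  # shown to the user instead of hidden

def answer(query: str, forced_model: Optional[str] = None) -> RoutedAnswer:
    if forced_model:  # user agency: manual override always wins
        return RoutedAnswer(text="...", model=forced_model, reason="user override")
    model = "deep-reasoning-model" if len(query.split()) > 150 else "fast-chat-model"
    return RoutedAnswer(text="...", model=model, reason="length-based heuristic")

result = answer("Summarise this contract clause ...", forced_model="fast-chat-model")
print(result.model, "|", result.reason)  # -> fast-chat-model | user override
```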

OpenAI’s Strategic Bet

So, why did OpenAI do this? Why not keep old models even when new ones are “better”?

  • Shift to Product: OpenAI implemented an automatic system that decides which AI capability to use for each query. This is classic product thinking: Hide the technical complexity behind a seamless user experience.

  • Cost and Infrastructure Pressure: Running multiple models simultaneously is expensive - each model requires dedicated server capacity, and OpenAI's API traffic doubled in the 24 hours after GPT-5's launch, creating massive infrastructure strain.

  • The "iPhone Strategy": When Apple releases a new iPhone, they don't keep manufacturing the old one at the same price point. The idea was to force users onto the "better" model to drive adoption and justify the development investment.

  • Technical Complexity of Multi-Model Support: Maintaining multiple models creates complex engineering challenges around routing, version control, and user experience.

Course Correction within 24 hours

The crisis immediately benefited competitors. Prediction markets showed users expecting Google to lead by the month's end. Claude and other alternatives gained attention. In AI's winner-take-all dynamics, deployment failures can rapidly shift market position.

So naturally, OpenAI moved fast:

  • Reinstated legacy models (GPT-4o, o3, o4-mini, etc.) in the model picker for Plus/Pro users.

  • Said they are adjusting how the router’s decision boundary works, which should help users get the right model more often.

  • Claimed they will make it clearer which model is answering a given query.

  • Will change the UI to make it easier to manually trigger thinking.

  • Altman vowed to preserve “warmth” as well as technical rigor, with user customization on the horizon.

Current enterprise AI evaluation methods focus heavily on technical benchmarks, cost analysis, and risk assessment while largely ignoring user emotional response and attachment factors.

Action points: Moving forward

Ok, so enough with the analysis. Let’s move on and turn this into something real: Action. Execution. Call it whatever you want, but what we’re focusing on here is what could have been done differently, and what company leaders like yourself can learn from this, in five simple steps.

  1. Preserve User Agency: Technical sophistication should enhance rather than replace user choice. Automated systems must provide transparency about their decisions and offer manual overrides when users prefer different approaches.

  2. Manage Emotional Dependencies Ethically: Organizations must acknowledge that users form meaningful relationships with AI systems and design transitions that protect emotional well-being. This includes gradual changes, clear communication about personality modifications, and resources for users who experience distress.

  3. High-Quality Communication: AI system updates require the same careful communication and transition planning as any significant organizational change. Users deserve advance notice, clear explanations of impacts, and opportunities to provide feedback before implementation.

  4. Holistic Success Metrics: Evaluation frameworks must extend beyond technical benchmarks to include user satisfaction, emotional impact, workflow integration, and relationship continuity. Technical superiority without user acceptance represents deployment failure, not success.

  5. Respect Vulnerability: When AI systems serve users dealing with mental health challenges, social isolation, or other vulnerabilities, changes must be implemented with extra care and support resources.

OpenAI's GPT-5 launch demonstrates how even groundbreaking technical achievements can fail when deployment prioritizes operational efficiency over user autonomy, transparency, and established relationships. The lesson isn't "don't upgrade AI systems." It's that AI product management requires fundamentally different approaches. Success depends on managing relationships, not just optimizing benchmarks.

Big tech news of the week…

🍎 Our very own Allegra Guinan, CTO & Co-Founder of Lumiera, was on the Practical AI Podcast discussing confident and strategic AI leadership. If you're not familiar with the pod, they've been running for 7 years and are ranked among the top 0.5% of all podcasts globally. Subscribe to the Lumiera Podcast Playlist here!

🌍 The Financial Times reports a marked decline in conscientiousness, especially among people in their 20s and 30s. This is linked to rising distractibility, reduced follow-through and tenacity, alongside worsening trends in neuroticism and a broad drop in extroversion that accelerated during and after the pandemic. Because conscientiousness strongly predicts career success, relationship stability and even longevity, the piece argues this shift has serious societal and workplace implications, while noting traits are malleable and can be strengthened through environment and habits.

⚖️ An audit of a major AI training dataset called DataComp CommonPool showed that it contains LOTS of personally identifiable information (e.g. passports, credit cards, home addresses) - and that was found in a sample of just 0.1% of the image–text dataset. The dataset has been widely used to train models, something that amplifies downstream privacy risks.

Until next time.
On behalf of Team Lumiera

Emma - Business Strategist
Allegra - Data Specialist

Lumiera has gathered the brightest people from the technology and policy sectors to give you top-quality advice so you can navigate the new AI Era.

Follow the carefully curated Lumiera podcast playlist to stay informed and challenged on all things AI.

What did you think of today's newsletter?


Disclaimer: Lumiera is not a registered investment, legal, or tax advisor, or a broker/dealer. All investment/financial opinions expressed by Lumiera and its authors are for informational purposes only, and do not constitute or imply an endorsement of any third party's products or services. Information was obtained from third-party sources, which we believe to be reliable but not guaranteed for accuracy or completeness.