Having barely recovered from the shock of ChatGPT 4o, whose massive upgrades to ‘free AI’ drastically changed how companies should think about AI, we were hit with another major change this week.
Anthropic, a leading AI research company, unveiled Claude 3.5 Sonnet, the newest version of its ChatGPT competitor, which has been steadily climbing the ranks.
Shockingly, early benchmarks have this new model ahead of ChatGPT 4o in several areas.
As you harness the power of AI in your organization, Anthropic's latest developments deserve your attention, so let’s dive in.
What is Claude?
We’ve been hearing so much about ChatGPT that it’s easy to forget that others exist, but they do.
Google Gemini (#2 in our Top 100), Claude (#6), and Microsoft Copilot (#13) all have significant user bases.
Claude stands out, especially as the only player besides ChatGPT that is not the AI arm of an existing platform with a massive audience from which to send traffic (Google and Microsoft).
Now, no one, not even Gemini, comes close to ChatGPT, which represents over 80% of these platforms' traffic and 50% of the traffic across all Top 100 AI tools. But if Altavista taught us anything, it's that you can never be sure one platform will always dominate.
For a two-sentence history lesson, Claude was created by Anthropic, a San Francisco-based AI startup founded in 2021 by former OpenAI researchers. The company attracted $6 billion in funding from Amazon and Google and launched its first Claude model in March 2023.
It’s focused on safer AI, for example, by trying to solve the ‘black box problem.’ The company is even structured to align with societal benefits rather than just profits, a dilemma that tortured OpenAI and may have led to the firing of CEO Sam Altman.
Why Do We Care About Claude 3.5 Sonnet?
While some loved Claude, especially for its more creative writing style, the world did not care as much until last week, when Claude 3.5 Sonnet was released.
This is not just another incremental update in the AI world: “Claude 3.5 Sonnet is now the most capable, smartest, and cheapest model available on the market today,” according to Daniela Amodei, co-founder of Anthropic, who spoke to VentureBeat about the release.
Now, of course, a founder would say that.
Still, Daniela’s bold claim was backed by impressive benchmarks: Out of seven standard intelligence and capability metrics, Claude 3.5 Sonnet outperforms its competitors on six and comes close on the seventh. In vision-related tasks, it leads in four out of five standard metrics.
You can see how, over time, LLMs as a category have done better on these benchmarks, with Claude 3.5 Sonnet currently being the one to rule them all:
While benchmarks are always a point of much discussion and contention, and even Anthropic says they don’t mean much to companies, Claude 3.5 Sonnet promises to deliver several new benefits powered by its more capable model:
1. State-of-the-art vision recognition.
The AI Advantage demos this nicely with a very confusing traffic sign, which Claude interprets correctly, although it fails (supposedly due to safety training) to find Waldo:
2. Better interface and “Artifacts.”
Claude boasts a significantly nicer user interface than ChatGPT, especially with the new Artifacts feature, which interactively brings code or content to life. In this demo, Anthropic shows a prompt turning into a working game:
(You’ve got to love the “Good evening, Sam” here. Shots fired!)
Being able to ‘code’ in completely human language will reinforce some people’s belief that everyone will become a coder, especially because there’s no longer any back-and-forth between the interface and the result.
3. Team Mode and Enterprise focus.
While competitors like OpenAI have primarily targeted consumers, Anthropic has tailored its offering to the specific requirements of businesses.
The model excels in areas crucial for enterprise applications, such as graduate-level reasoning, code generation, and multilingual math.
But most impactfully, Claude also just launched “Projects,” which allows teams to collaborate on conversations with an AI informed by team data.
For example, a team of marketers can now co-develop projects with Claude within a shared team space that knows their marketing strategy.
4. Continued Superior Writing.
Claude continues to write better and more creatively. It usually doesn’t sound like the typical AI-style writing we know from ChatGPT and others. However, in my test, where I let it draft this newsletter, it did throw in the mother of all cliches:
Still, overall, Claude wrote a longer first draft that sounded nicer and was better structured than ChatGPT 4o. For me, Claude continues to dominate in writing and editing.
(See the full side-by-side of Claude 3.5 Sonnet vs. ChatGPT 4o in drafting this article here. It’ll also give you a nice insight into how much AI helps me in creating content!)
5. Less Prompt Engineering.
Being good at prompting can still make a 10x difference, but Claude may show where we’re heading. As AI professor Ethan Mollick states, Claude responds very well to a simple “Make it Better.”
The AI Advantage points out that this is one way the LLMs will write the prompts for you, making us less reliant on learning how to prompt.
Claude versus ChatGPT
So with all these impressive features, especially the outperformance on benchmarks, is it time to leave ChatGPT for Claude in your company?
Well, if you’ve trained people on ChatGPT, then the gains of Claude 3.5 Sonnet will not be worth the switch. Additionally, I expect ChatGPT to be upgraded.
Like the question of Copilot vs ChatGPT, much of this is about what your teams are used to and what you’ve invested in so far.
Additionally, ChatGPT still beats even the newest Claude model in several areas:
- Additional features: ChatGPT 4o offers image generation, Voice Mode, GPTs, Memory/Custom Instructions, and conversation sharing, which Claude 3.5 Sonnet currently lacks. While “Projects” offers something similar to GPTs, I don’t see the two competing head-on just yet, for example, because Projects cannot be shared publicly.
- Language support: ChatGPT supports 95+ languages, while Claude is limited to English, Japanese, Spanish, and French.
- Voice mode: ChatGPT has announced (but not yet released) a significantly better voice mode, which can be one of the biggest unlocks of AI benefits, as we often discuss in the Lead with AI community.
The Bottom Line: Impressive, But No Game-Changer
I won’t dismiss Anthropic's release of Claude 3.5 Sonnet: it is a significant milestone in the AI landscape for companies.
But while it outperforms competitors like ChatGPT 4o on several benchmarks, the decision to switch platforms isn't straightforward:
- Performance Leap: Claude 3.5 Sonnet shows impressive capabilities, particularly in vision recognition, creative writing, and enterprise-focused tasks, but the practical differences versus ChatGPT may be too subtle to warrant a switch.
- Enterprise Focus: Anthropic's approach of designing for businesses, including the new "Projects" feature for team collaboration, is powerful but could be replicated by ChatGPT and Copilot with relative ease.
- Ease of Use: The improved UI and Claude’s responsiveness to simple prompts like "Make it Better" may reduce the learning curve for AI adoption in your company, but I believe that at-work preferences will continue to mirror what people use at home, which is predominantly ChatGPT.
In short, Claude has delivered many great upgrades that may convince more people to try it alongside ChatGPT.
It remains to be seen whether it’s enough to deliver a true competitive advantage, but as someone studying AI closely, you should try it out for yourself.
Until next week,
– Daan
Future Work
A weekly column and podcast on the remote, hybrid, and AI-driven future of work. By FlexOS founder Daan van Rossum.