Claude Mythos: Anthropic's Leak That's Too Dangerous to Release

Ejaaz:
Three weeks ago, rumors broke that a major AI lab had built a model more powerful,

Ejaaz:
more dangerous, and more expensive than any AI model that we had seen before.

Ejaaz:
We didn't know which model lab it would be. We didn't know what the model was called.

Ejaaz:
And then just a few days ago, Anthropic leaked a model called Claude Mythos,

Ejaaz:
which is supposedly more powerful than any model that they've ever built before,

Ejaaz:
a tier above Opus 4.6, which is what we see today.

Ejaaz:
This model is actually so good that it is considered a cyber security threat

Ejaaz:
and can't be rolled out to the public just yet.

Ejaaz:
But it's not just Anthropic that's building a model that is close to AGI like this.

Ejaaz:
OpenAI has a model codenamed Spud, Google has a model codenamed Agent Smith,

Ejaaz:
and there's many more to come this year.

Josh:
But the Anthropic leak wasn't intentional. This was discovered by accident last

Josh:
Thursday, March 26th, by a Fortune reporter who discovered that Anthropic's

Josh:
content management system had a configuration error.

Josh:
And for those who aren't familiar, the content management system,

Josh:
it's how the web server serves files.

Josh:
And within that, there is a config error that leaked nearly 3,000 unpublished

Josh:
assets sitting in this publicly searchable database.

Josh:
Anyone could find them. So two independent security researchers,

Josh:
they went through, they confirmed.

Josh:
And among these files were two blog posts about a model named Claude Mythos

Josh:
and a new tier named Capybara.

Josh:
Anthropic immediately removed access to all of

Josh:
this as soon as it came out. But then later on, an Anthropic spokesperson

Josh:
confirmed that it represents a step change in

Josh:
AI performance and is the most capable model they've ever built. So they confirmed

Josh:
what we're seeing here is real. Now, the problem is, like this image suggests on

Josh:
screen, we're missing a lot of information. This is a leak that something like

Josh:
this exists, but we're not sure exactly what. What we do know is that

Josh:
there is the new model tier, Ejaaz, like you mentioned, named Capybara.

Josh:
It is the new tier that sits above Opus. So now the lineup will kind of look

Josh:
like Haiku, Sonnet, Opus, and then Capybara at the top.

Josh:
It doesn't really sound quite right. Maybe that's an experimental name.

Josh:
They might find something better.

Josh:
And then Mythos is the specific model name within that tier.

Josh:
So you can think of Capybara as the weight class and Mythos as like the fighter.

Josh:
It's the specific model.

Josh:
Now, according to the leaked documents, this dramatically outperforms Claude

Josh:
Opus 4.6 on basically everything, but particularly coding, academic reasoning,

Josh:
and the cybersecurity benchmarks.

Josh:
And I think the cybersecurity one is one of the more interesting points here,

Josh:
because it's so powerful at cybersecurity that one of the main reasons why they

Josh:
can't release it is to actually prevent people from using it maliciously. Is that right?

Ejaaz:
Yeah. So actually, if we rewind to about a month and a half ago,

Ejaaz:
Anthropic's head of AI security, who's actually a legend in the industry,

Ejaaz:
gave a talk about Claude Opus 4.6, when it had just released.

Ejaaz:
And his talk described how the model was pointed at five to 10 very popular

Ejaaz:
open source code bases with no instructions given.

Ejaaz:
And what the model did was very, very interesting.

Ejaaz:
It scanned all those code bases and discovered 500 major security flaws that

Ejaaz:
expert human security researchers couldn't discover in the decades that they'd

Ejaaz:
been staring at and using these exact code bases.

Ejaaz:
So Claude did in a couple of hours what many security researchers couldn't do.

Ejaaz:
We're talking about like millions of compute hours and time spent staring at

Ejaaz:
these code bases, testing it.

Ejaaz:
Claude Opus 4.6 managed to figure this out. Now, this created a lot of excitement,

Ejaaz:
but also a lot of concern.

Ejaaz:
Now, because these AI security researchers had a good heart,

Ejaaz:
they weren't using this maliciously.

Ejaaz:
But you could imagine that if this model had been handed to,

Ejaaz:
say, a malicious actor, they could have exploited these for many different reasons.

Ejaaz:
And so these exploits were surfaced and they were fixed. But the question now

Ejaaz:
becomes, what if a more powerful model was made more readily available to anyone

Ejaaz:
or an attacker, for example, a foreign adversary that could discover and exploit any future bugs?

Ejaaz:
That's the concern that I personally have around

Ejaaz:
Claude Mythos, or Capybara. This model is supposedly

Ejaaz:
meant to be a tier above anything that we've ever seen before. Apparently, it

Ejaaz:
is amazing at discovering and exploiting vulnerabilities.

Ejaaz:
So if it is, let's say, two orders of magnitude, let's be conservative, two orders

Ejaaz:
of magnitude better than Opus 4.6, we could have a real problem on our hands.

Ejaaz:
And so what Anthropic has done now is they've started to slowly release this secret

Ejaaz:
model, Mythos and Capybara, to cybersecurity experts first.

Ejaaz:
Why? Because they want them to figure out how they can harden their own defense

Ejaaz:
systems before they publicly release this model.

Ejaaz:
And someone, maybe a nefarious attacker, might use it for malicious gain.

Josh:
I think it's ironic that the company building what it describes as an AI with

Josh:
unprecedented cybersecurity capabilities leaked it because someone misconfigured their blog.

Josh:
Like the irony there is too strong. And you have to wonder, you have to really

Josh:
ask yourself the question, well, what if this model is so smart that it's leaking

Josh:
itself, if it's poking holes to let people secretly find it? I don't know.

Josh:
The one thing for sure is that, one, this model is going to be incredibly expensive

Josh:
to run currently, at least.

Josh:
That's part of the reason why we're not seeing it now. But the second is it's

Josh:
going to be unbelievably powerful.

Josh:
And the progress that we've had in the last year is going to probably look like

Josh:
nothing compared to what we're going to get for the next three quarters.

Josh:
The market also very much felt the effects of this because, oh my God,

Josh:
these stock charts look absolutely horrendous.

Ejaaz:
Yeah, CrowdStrike, which is like the major cybersecurity firm,

Ejaaz:
was down a couple billion on the news.

Ejaaz:
And Palo Alto Networks, which is another similar company that competes in this

Ejaaz:
space, also suffered from this.

Ejaaz:
Now, these two charts that I'm looking at right now for these specific companies,

Ejaaz:
Josh, give me a little PTSD or deja vu.

Ejaaz:
Because we were talking about this, I think, four weeks ago when Anthropic released

Ejaaz:
their Claude security review feature.

Ejaaz:
Which, you know, didn't have anything to do with Mythos, but basically helped review

Ejaaz:
the vibe code that you produced using Claude. And so cybersecurity stocks dumped again.

Ejaaz:
This is happening seemingly on a monthly basis at this point.

Josh:
Even though these charts are down quite a bit, I'm not sure how concerned the

Josh:
market needs to be immediately because it appears as if this new model that's coming,

Josh:
this new cybersecurity specialist is really compute intensive,

Josh:
so much so that it's almost going to be impossible for them to run across all

Josh:
the accounts currently without some serious compression and iteration and figuring

Josh:
out how to run this more optimally.

Josh:
And it seems like we're starting to see those growing pains, right? It's like,

Josh:
as they're training models like this, as they're running them on their own servers,

Josh:
it's starting to affect the average user.

Josh:
I know sometimes I'll wake up and I'll feel like my Opus is running a little

Josh:
bit dumber than it was the day before. And we actually have data that backs this up.

Ejaaz:
Yeah, so over the weekend, Claude servers basically went down or were majorly impaired.

Ejaaz:
There were a bunch of different outages. People were reporting very,

Ejaaz:
very reduced quality in their interactions with Claude.

Ejaaz:
And this has been kind of like a repeating trend over the last couple of weeks.

Ejaaz:
And now we might have the answer why.

Ejaaz:
Typically, major AI labs, the last public bit of information that we had was

Ejaaz:
from OpenAI's 2025 run of a major model.

Ejaaz:
They dedicated 30% of their available compute to a training run.

Ejaaz:
Now, the rumors state that for Claude Mythos, they've dedicated even more and

Ejaaz:
that's like the major architectural breakthrough that they've made.

Ejaaz:
If they've done that, that might be the reason why we aren't able to use

Ejaaz:
the best version of Claude as consumers because they're too busy using the compute

Ejaaz:
to train the next step or tier of model.

Ejaaz:
I don't know if this is a good or bad thing, but one thing it definitely like

Ejaaz:
screams at me is like, we need a ton more compute.

Josh:
Big time. And it's amazing to think about how far we've come just in the last

Josh:
three months leading up to this moment here.

Josh:
I mean, when you think about it, over the winter break is when people really started

Josh:
to take vibe coding seriously.

Josh:
And since then, companies have gone from a very small percentage of code to almost 100% of code.

Josh:
I mean, this is saying 80% plus of all code deployed is written by Claude Code, just for Anthropic.

Josh:
It's unbelievable. We started with Opus 4.5, which

Josh:
was released in November, and then Opus 4.6 came

Josh:
in February, which took us from a 200,000-token context

Josh:
window to a million. And now whatever this new thing is is going to really drive

Josh:
up the coding capabilities in a really big way. And I think it's probably worth

Josh:
checking in on which model is going to be the strongest model, which company

Josh:
has the best model through the end of June. And thanks to Polymarket, we have

Josh:
some interesting stats on this.

Josh:
So the people are betting that Anthropic has a 66% chance of having the best

Josh:
AI model in June, which is huge.

Josh:
And that number has increased very significantly recently. If you look just

Josh:
back in February, it was Google who was the heavy favorite with an almost 80%

Josh:
chance or 70% chance of having the best model.

Josh:
That has changed recently in a big way, perhaps because of this leak.

Josh:
But I'm not sure if this is fully up to date, and

Josh:
it may be missing some information, because we have some news on OpenAI and

Josh:
Google, who are planning to release something really important too. And thank

Josh:
you to Polymarket for sponsoring that part of the show. But let's talk about

Josh:
OpenAI. There's a new model, codenamed Spud, that's coming, and this is probably going

Josh:
to be the Mythos competitor. So what is this looking like?

Ejaaz:
Yeah, that's the issue. We don't really know. For all of these models, we don't have

Ejaaz:
the specs. We need the specs to talk about them.

Ejaaz:
There's a few trends or patterns that are happening amongst the hottest,

Ejaaz:
or should I say, top two or three AI labs.

Ejaaz:
We've got Anthropic releasing Mythos, which is their AGI or pre-AGI model,

Ejaaz:
a massive, massive leap ahead.

Ejaaz:
OpenAI is working on the same thing. They've been secretly working on a larger model.

Ejaaz:
This has gone through a few different names. If you remember,

Ejaaz:
Josh, at the end of last year, I think it was referred to as codename Sprout.

Ejaaz:
And now it's referred to as Spud. So I don't know if that implies that it's

Ejaaz:
grown massively since then.

Ejaaz:
It's growing. But these models are supposedly meant to be anywhere between 10

Ejaaz:
and 20 trillion parameters.

Ejaaz:
Now, for context, the largest models that we currently look at right now are

Ejaaz:
between one and two trillion.

Ejaaz:
So this is a model a full order of magnitude larger.

Ejaaz:
They're going to be compute intensive. They're going to be very expensive to serve.

Ejaaz:
So we need to figure out how to scale AI infrastructure and a bunch of other things.

Ejaaz:
But OpenAI's model is codenamed Spud, and it's meant to be the competitor to

Ejaaz:
Mythos. People are anticipating that it might be something like GPT 5.5 or rather GPT 6.

Ejaaz:
So again, a tier above what we see today. It's going to be advanced in coding,

Ejaaz:
reasoning, and a lot of the things Anthropic's model is as well.

Ejaaz:
When I look at this, Josh, personally to me, this seems to be,

Ejaaz:
one, a massive bid to try and leapfrog each other.

Ejaaz:
And number two, maybe try and juice their numbers ahead of a potential IPO.

Ejaaz:
I don't know whether your reaction to this is the same, but that's like my gut

Ejaaz:
reaction when I read news like this.

Josh:
Yeah, it's probably both. They want to juice up things before the IPO,

Josh:
but they also just want to win.

Josh:
And I have some pretty strong speculations just based on vibes of what this is going to look like.

Josh:
I think we've been seeing this recent convergence around OpenAI,

Josh:
particularly on focus and on really dialing in what they're focused on.

Josh:
And we saw a big move last week when they removed Sora. They totally destroyed Sora.

Josh:
They moved a lot of the teams together. They made their chief of product,

Josh:
the chief of, like, AGI release. And it appears as if they're building a mega

Josh:
app, based on the rumors.

Josh:
Part of the reason why I have a difficult time using OpenAI's products is they're

Josh:
kind of spread out everywhere.

Josh:
There's like the Sora app was one, there's Codex, then there's their browser,

Josh:
then there's ChatGPT, and there's a lot of different software.

Josh:
And the same is true with their models, or it was at least, where there was

Josh:
GPT 5.3 Codex, and there was 5.3 High, Mid, Low.

Josh:
There's all these different models that really complicate things and confuse things.

Josh:
With 5.4, they made a singular model. Now 5.4 does your coding and it does the reasoning all in one.

Josh:
And what I suspect is that this new model, codenamed Spud, is going to be the kind

Josh:
of pinnacle of this focus, where I'm hoping they release this with their new

Josh:
application, with a singular model.

Josh:
So there's one model that is all-knowing. There's one application,

Josh:
similar to what Anthropic does with the Claude desktop app, that has all of the

Josh:
functionality under one roof.

Josh:
And I think they're going to probably use this as a point to really

Josh:
lean into that focus instead of distributing this across a lot of different areas.

Josh:
And I'm hopeful that that will meaningfully change OpenAI more so than it'll

Josh:
change Anthropic because it actually changes the way that users interface with

Josh:
the product and it becomes a much better product.

Ejaaz:
Yeah, I think for the majority of last year, I was pretty upset with the way

Ejaaz:
that Sam and OpenAI were focusing on so many different things.

Ejaaz:
I was just like, just focus on creating a really good model.

Ejaaz:
You're being left behind in coding, Anthropic's eating your lunch, like, figure this out.

Ejaaz:
And then since their code red of, like, what was it, November last year,

Ejaaz:
they've been like reallocating compute, money, data, and all their resources

Ejaaz:
to focus on building the best general model and the best coding model.

Ejaaz:
So we're starting to see the fruits of that labor.

Ejaaz:
I have a lot of faith now in OpenAI that they're going to produce a really good

Ejaaz:
product that will compete with the likes of Anthropic, which has been eating their lunch.

Ejaaz:
When I look at like the last week, it seems like it's pretty negative for OpenAI.

Ejaaz:
You mentioned that they killed Sora.

Ejaaz:
They also killed the $1 billion deal that they had signed with Disney.

Ejaaz:
And they also shut down ChatGPT adult mode and a bunch of like consumer shopping

Ejaaz:
apps and their like app marketplace as well.

Ejaaz:
They're just focused on these few things right now.

Ejaaz:
But then the other thing is Sam is also kind of defaulting on a few of the major

Ejaaz:
GPU and data center deals, right?

Ejaaz:
So we had the OpenAI and Oracle Abilene deal fall through where they couldn't

Ejaaz:
finance it for a variety of different reasons.

Ejaaz:
Then the other thing is they're defaulting on purchasing up to 40% of the world's

Ejaaz:
memory supply because they haven't figured out their finances right now.

Ejaaz:
So I think that OpenAI is going through kind of like a puberty period where

Ejaaz:
they're figuring their stuff out and where to reallocate resources.

Ejaaz:
But I think they're going to pull through.

Josh:
And it also seems like this is indeed a serious breakthrough.

Josh:
I mean, Sam, in an internal memo to employees that got leaked out,

Josh:
he said things are moving faster than many of us expected.

Josh:
And he called it a very strong model that can really accelerate the economy.

Josh:
Those seem like pretty large claims to make internally with employees who are

Josh:
also kind of in the know and aware of what's going on.

Josh:
And I just think that a lot of us who are sitting outside these labs are not

Josh:
entirely wrapping our head around how much progress is actually about to hit

Josh:
us over the next couple of months with these new model releases.

Josh:
It seems like they're step function improvements.

Josh:
And one of the employees from OpenAI actually hinted that Spud contains a capability

Josh:
that is very different from what we've seen before. So while there aren't specifics,

Josh:
there are clearly a lot of these huge novel breakthroughs incoming,

Josh:
which is worth looking out for.

Josh:
There's one final model release, model leak that we have from Google,

Josh:
who has been doing well, kind of chugging along slowly in the background.

Josh:
And this is called Agent Smith.

Josh:
It's a secret AI tool. Do you have any information on this one, Ejaaz?

Ejaaz:
Yeah, so there was like a leaked report from an insider at Google.

Ejaaz:
Apparently, Google employees are using a new internal tool called Agent Smith

Ejaaz:
that can automate tasks such as coding, according to three people that were familiar with it.

Ejaaz:
The way that this product is supposed to work is within their vibe coding platform

Ejaaz:
called Antigravity, which exists today but hasn't really had a major upgrade

Ejaaz:
for, let's say, a couple months now, which is like an eternity in the AI world.

Ejaaz:
So they're releasing a new AI model called Agent Smith that is supposed to take

Ejaaz:
a multi-agent approach and use an upgraded version of Gemini 3.1.

Ejaaz:
So it's probably not going to be 3.1. It might be 3.5 or maybe even 4.

Ejaaz:
Again, another order of magnitude leap up. So what we're seeing here is Google

Ejaaz:
working on an AI coding model competitor to try and catch up to Anthropic and

Ejaaz:
the likes of OpenAI's Codex.

Ejaaz:
You've got OpenAI trying to reallocate resources and focus on building the best

Ejaaz:
general model and catch up with Anthropic, which they have at coding.

Ejaaz:
Then you have Anthropic trying to keep these two at bay and make the next order

Ejaaz:
of magnitude leap, spending all their compute, but that's coming at the expense of serving

Ejaaz:
their existing users, which they're adding like a million a day, who are reporting, you know,

Ejaaz:
Claude servers being down and reduced quality of usage. So this is a very,

Ejaaz:
I can like feel the tension in the air between these three companies right now.

Ejaaz:
I don't know what Meta is doing.

Ejaaz:
I don't know where Grok is. I'm rooting for them. I hope they catch up.

Ejaaz:
But it seems to be these three major competitors right now that are in the running for winning this race.

Josh:
They're firing. I mean, in the last 90 days, since we started this year to now,

Josh:
we went from 200,000-token context windows to a million.

Josh:
We went from these coding assistants to compiler-writing

Josh:
agents who are completely capable, and from AI writing a very small amount of code to

Josh:
now over a quarter of Google's production software and 80% plus

Josh:
of Anthropic's software. With everything we learned this week, the frontier is

Josh:
going to keep moving faster and faster. So we're in

Josh:
for a crazy Q2, Q3, Q4, just a

Josh:
crazy 2026. And as all these things happen, as these

Josh:
IPOs start to happen and they get even more fundraising to deploy

Josh:
these AI data centers at scale, things are really going

Josh:
to get weird in a hurry. But we will be here to cover it, as always. If you

Josh:
enjoyed this episode, please don't forget to share it with your friends, like

Josh:
it on YouTube, and don't forget to subscribe. If you listen on a podcast player like

Josh:
Spotify or RSS, you could rate us five stars there. It's always really appreciated.

Josh:
Ejaaz, any final notes before we sign off for the day?

Ejaaz:
We've been absolutely killing it on our side. Loads of new subscribers, loads of

Ejaaz:
new listeners. Thank you guys so much for joining us. And yeah, I have a

Ejaaz:
request, because we always like to give out homework at the end of the episode.

Ejaaz:
If you're listening to this and you are an insider at Anthropic, OpenAI, or

Ejaaz:
Google, and you are willing to give an anonymous tip to our accounts,

Ejaaz:
please spin up an anon account on X/Twitter and DM us.

Ejaaz:
I would love to hear from you.

Josh:
That'd be great. Well, yeah, thank you guys for watching. We'll see you in the next one.
