OpenRouter: The Only AI Tool You'll Ever Need | Founder Alex Atallah

Ejaaz:
What if I told you there was a single website you could go to where you can chat with any major AI model from one single interface? It's kind of like ChatGPT, but instead every prompt gets routed to the exact AI model that will do the best job for whatever your prompt might be. Well, on today's episode, we're joined by Alex Atallah, the founder and CEO of OpenRouter. It's the fastest-growing AI model marketplace, with access to over 400 LLMs, making it the only place that really knows how people use AI models, and more importantly, how they might use them in the future. It sits at the intersection of every single prompt that anyone writes and every model that they might ever use. Alex Atallah, welcome to the show. How are you, man?

Alex:
Thanks, guys. Great. Thanks so much for having me on.

Ejaaz:
So it is a Monday. How does the founder of OpenRouter spend his weekend?

Ejaaz:
Presumably you know out and about chilling relaxing not at all focused on the company oh

Alex:
I love weekends with no meetings planned. I just go to a coffee shop and have tons of hours stacked in a row to do things that require a lot of momentum build-up. So I did that at coffee shops on Saturday and Sunday, and then I watched Blade Runner again.

Ejaaz:
Again? Okay. Well, when we were preparing for this episode, Alex, I couldn't help but think that you've had a pretty insane decade of startup foundership. OpenRouter is kind of like the second major thing that you've done, but prior to that you were the co-founder and CTO of OpenSea, the biggest NFT marketplace out there. And now you're focused on one of the biggest AI companies out there. So it sounds like you're at the pivot point of two of the most important technology sectors of the last decade. Can you give us a bit of background as to how you ended up here? And more importantly, where you started. Walk us through the journey of OpenSea and how you ended up at OpenRouter.

Alex:
Yeah, so I co-founded OpenSea with Devin Finzer at the very beginning of 2018, very end of 2017. It was the first NFT marketplace. And it was not dissimilar to OpenRouter, in that there was a really fragmented ecosystem of NFT metadata and media that gets attached to these tokens. It was the first example of something in crypto that could be non-fungible, meaning it's a single thing that can be traded from person to person. Most things in the world are non-fungible. A chair is non-fungible; a currency is fungible. Back in 2018, no one was really thinking about crypto in terms of non-fungible goods. And the problem with non-fungible goods was that there weren't any real standards set up. There were a lot of heterogeneous implementations for how to get a non-fungible item represented and tradable in a decentralized way.

Alex:
So OpenSea organized this very heterogeneous inventory and put it together in one place. We came up with a metadata standard. We did a lot of work to really make the experience super good for each collection. And you see a lot of similarities with how AI works today, too, where there's also a very heterogeneous ecosystem of different APIs and different features supported by language model providers. And OpenRouter similarly does a lot of work to organize it all.

Alex:
I was at OpenSea until 2022, when I was kind of feeling the itch to do something new. I left at the very end, in August, and then ChatGPT came out a few months later. And my biggest question around that time was whether it was going to be a winner-take-all market, because OpenAI was very far ahead of everybody else. You know, we had Cohere Command, we had a couple of open-source models, but OpenAI was the only really usable one. I was doing little projects to experiment with the GPT-3 API. And then Llama came out in January: really exciting, about a tenth the size, won on a couple of benchmarks, but it wasn't really chattable yet.

Alex:
And it wasn't until a few months later that a team at Stanford distilled it into a new model called Alpaca. Distillation means you take the model and customize it, or fine-tune it, on a set of synthetic data, which they made using ChatGPT as a research project. It was the first successful major distillation that I'm aware of, and it was an actually usable model. I was on the airplane talking to it, and I was like, wow, if it only took six hundred dollars to make something like this, then you don't need ten million dollars to make a model. There might be tens of thousands, hundreds of thousands of models in the future. And suddenly this started to look like a new economic primitive, a new building block, that kind of deserved its own place on the internet. And there wasn't one. There wasn't a place where you could discover new language models and see who uses them and why. And that's how OpenRouter got started.

Josh:
That's amazing. So one of the things that we're obsessed with on this channel in particular is exploring frontiers: how to properly see these frontiers, analyze them, and understand when they're going to happen. And when I was going through your history, you've had this talent consistently over time. Even as far back as early on, I read you were hacking Wi-Fi routers in a hackathon. You were very early to that. You were early to NFTs. You were early to understanding AI and the impact it would have. And what I'd love for you to explain is the thought process and the indicators you look for when exploring these new frontiers, because clearly there's some sort of pattern matching going on. Clearly you have some sort of awareness of what will be important and why it will be important, and then you insert yourself into that narrative. So are there patterns? Are there certain things that you look for when searching for these new opportunities that led you to make the decisions that you have?

Alex:
I think there's a lot to be said for finding enthusiast communities and seeing if you're going to join them. Like, can you be an enthusiast with them? Whenever something new comes out that has some kind of ecosystem potential, there are going to be enthusiast communities that pop up. And the internet has made it so simple: you can just join the communities.

Alex:
Discord, I think, is an incredible and super underrated platform, because the communities feel kind of private. You don't feel like you're, you know, seeing somebody trying to advertise something for SEO juice. There's no SEO juice in Discord. It's just people talking about what they're passionate about, and it gets really niche. And when you find an interest group in Discord that has to do with some new piece of technology that's just being developed right now, and doesn't really work very well at all, you get people who are just trying to figure out what to do with it and how to make it better. I think that's the first core piece of magic that jumps to mind.

Alex:
There's got to be a willingness to be weird, because if you jump into any of these communities, at face value it's stupid. Like, "oh, this is just a game," or "it's a really weird game, and I'm not really into collectible games, so I'm going to leave right now." Not only do you have to be aware, but you have to be creative. Like, okay, these are just cats on the blockchain, and people are just trading cats back and forth. You can't look at the community as simply that. Think about what you could do with it. Like, what is the unlock that wasn't achievable before?

Alex:
And I think there are people who are good at this, who will join the communities and brainstorm live, and you can see everybody brainstorming in real time. Another incredible example of this was the Midjourney Discord. You know, it became the biggest server in Discord by far. And why did that happen? Well, it started with something weird, silly, maybe not super useful, but you could see all the enthusiasts remixing and brainstorming live how to turn it into something beautiful and how to make it useful. And then, you know, it just exploded. I think it's the most incredible niche community that Discord has ever seen, because of how useless it started and how insanely exciting it became.

Alex:
So, I mean, I was playing around with this model called Big Sleep in 2021 that let you generate images that look kind of like DeviantArt. They were all generated images, and none of them really made sense, but you could get some really cool stuff, like potentially something you'd want to make your desktop wallpaper. And if you're really deep in some DeviantArt communities, you know, you kind of appreciate it. And that was like, oh, there's a kernel of something here. And it took like another year or two before Midjourney started to pick up, but that was like...

Ejaaz:
Where were you seeing all of this, Alex? Were you scouring just random forums, or just wherever your nose told you to go?

Alex:
Basically, there's this Twitter account, I'm trying to remember what it's called, that posts AI research papers and kind of tries to show what you can do with them. And I discovered this Twitter account in like 2021. It wasn't at all related to crypto, but, you know, Big Sleep was the first thing I saw that used AI to generate things that could potentially be NFTs. So I started experimenting with how much you could direct it to make an NFT collection that would make any sense. It was very, very difficult. But that was, like, the first generative...

Ejaaz:
This was before you were even thinking about starting OpenRouter, right?

Alex:
Yeah, yeah, this was back when I was full-time at OpenSea. Oh, that's it: it's AK, this Twitter account. I really recommend it. They basically post papers and explain and explore how each paper could be useful. They post animations. They make AI research kind of fun to engage with. And that was my first experience.

Ejaaz:
Okay, so I mean, that's a massive win for X, or Twitter as it was formerly known back then, as a platform, right? It gave birth to two of the biggest technology communities: crypto, also known as Crypto Twitter, and now apparently all the AI research stuff, which kind of put you on the path that led you to OpenRouter.

Ejaaz:
So if I've got this right, Alex, you were full-time at OpenSea with a

Ejaaz:
multi-billion dollar company loads of important stuff to do there,

Ejaaz:
but you still found the time to kind of scour this fringe technology because

Ejaaz:
that's what AI was at the time.

Ejaaz:
Prior to kind of GPT-2 or GPT-3, no one really knew about this.

Ejaaz:
And you were playing around with these gen AI models, these generative AI models

Ejaaz:
that would create this magical little substance and maybe it came in the form

Ejaaz:
of a pitcher or a weird little cat.

Ejaaz:
And you kind of like jumped into these niche forums of enthusiasts,

Ejaaz:
as you say, and kind of explored that further.

Ejaaz:
And it sounds like you kind of like honed that even beyond your journey from OpenSea when you left.

Ejaaz:
I remember actually meeting you in this kind of like this abbess between you

Ejaaz:
leaving OpenSea and starting OpenRouter where you were kind of brainstorming

Ejaaz:
a bunch of these ideas. And I remember a snippet from our conversation

Ejaaz:
In like one of the WeWorks here, where you just kind of like had whiteboarded a bunch of AI stuff.

Ejaaz:
And one of those things was kind of like the whole topic of inference.

Ejaaz:
And if I'm being honest with you, I had no idea what that word even meant back then.

Ejaaz:
I was extremely focused on all the NFT stuff and all the crypto stuff,

Ejaaz:
my background's in all of that.

Ejaaz:
But I just found that fascinating that you always had your nose in some of the

Ejaaz:
early communities. And I think that's a really important lesson there.

Ejaaz:
I want to pick up on something that you actually brought up when you said you

Ejaaz:
discovered kind of like your path to open router, Alex.

Ejaaz:
And that is, you said you were playing around with these early AI models.

Ejaaz:
So not the GPTs before Claude was even created.

Ejaaz:
You're playing around with these random models that you would find either on

Ejaaz:
forums, on Twitter, or on Reddit, right? and you would experiment with them.

Ejaaz:
And I find it fascinating that back then, even when GPT became a thing,

Ejaaz:
you were convinced that there would be hundreds of thousands,

Ejaaz:
or did you say hundreds of thousands of AI models?

Ejaaz:
Back then, that wasn't a normal view.

Ejaaz:
Back then, everyone was like, you need hundreds of millions of dollars.

Ejaaz:
Maybe it was tens of millions of dollars back then. And it was going to be a rich man's game.

Alex:
Yeah, it was basically the Alpaca project that kind of put me over the top on there being many, many, many models instead of just a very small number.

Ejaaz:
And can you explain what the Alpaca project is for the audience?

Alex:
So, after Llama came out, you really could not chat with it very well. It was a text-completion model. There were a couple of benchmarks where it beat GPT-3, and it was about a tenth the size of what most people thought GPT-3 was. So it was a pretty incredible achievement, but the user experience wasn't there. The Alpaca project took ChatGPT and generated a bunch of synthetic outputs, and then they fine-tuned Llama on those synthetic outputs. And this did two things to Llama: it taught it style, and it taught it knowledge. The style is how to chat, which was the big user experience gap. And it made it smarter: fine-tuning transfers both style and knowledge. The content of the synthetic data was reflected in the model's performance on benchmarks after that point.
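
The Alpaca-style recipe Alex describes (sample synthetic instruction/response pairs from a teacher model, then fine-tune an open base model on them) can be sketched roughly like this. The template and function names are illustrative, not the actual Stanford Alpaca code, and the fine-tuning step itself is left out:

```python
# Sketch: turn teacher-generated (instruction, response) pairs into the
# formatted training strings a supervised fine-tuning job would consume.
# The template here is a made-up example of an Alpaca-like prompt format.

TEMPLATE = (
    "Below is an instruction that describes a task.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{response}"
)

def build_sft_dataset(pairs):
    """Format (instruction, teacher_response) pairs into training strings."""
    return [TEMPLATE.format(instruction=i, response=r) for i, r in pairs]

# In practice the pairs come from querying the teacher model (e.g. ChatGPT);
# they are hard-coded here for illustration.
pairs = [
    ("Name a fungible asset.", "A currency, such as the US dollar."),
    ("Name a non-fungible asset.", "A chair, or an NFT."),
]
dataset = build_sft_dataset(pairs)

# Each string interleaves the chat format (style) with the teacher's answers
# (knowledge); fine-tuning a base model on `dataset` transfers both.
```

This is the "two things" point in code form: the template carries the conversational style, and the teacher's responses carry the knowledge.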

Alex:
So if you can do that without revealing all the data that goes in, now there's a way you could sell data via an API, without just dumping all the data out to the world and never being able to monetize it again. So there's a brand new business model around data that emerges. And there's the ability to work towards open intelligence: to build new architectures, test them more quickly, and fine-tune them quickly. Basically, you can build on top of the work of giants. You don't have to start from zero every time. A lot of the biggest developer-experience innovations just involve giving developers a higher stair to start walking up, so they don't have to start at the bottom of the staircase every single time. And, you know, that was the big, generous gift that Llama gave the community.

Alex:
And Meta wasn't the only company doing open-source models. Mistral came out with 7B Instruct a few months later. It was an incredible model. Then they came out with the first open-weight mixture of experts a few months after that. It felt like actual intelligence, but completely open. And all of these provide higher and higher stairs for other developers: basically, a way to crowdsource new ideas from the whole planet and let those new ideas build on top of really good foundations. And when that whole picture started to form, it felt like, okay, this is going to be a huge inventory situation, kind of like NFT collections were a huge inventory situation. Obviously completely different: really different market dynamics, really different types of goals that buyers have. So a lot of my early experimentation, like a Chrome extension I made called Window AI, and a few other things, was just about learning how the ecosystem works, what makes it different, and what people really want, what developers really want.

Josh:
So that leads us to OpenRouter itself, right? So I kind of want you to help

Josh:
explain to the listeners who aren't familiar with OpenRouter what it does.

Josh:
Because I think a lot of people, the way they interact with an AI is they send

Josh:
a prompt to their model of choice.

Josh:
They use ChatGPT or they use the Grok app or they're on Gemini and they kind

Josh:
of live in these siloed worlds.

Josh:
And then the next step up from the people are those kind of who use it professionally,

Josh:
who are developers. They're interacting with APIs.

Josh:
Maybe they're not interfacing with the actual UI, but they're calling a single model.

Josh:
And OpenRouter kind of exists on top of this, right? Can you walk us through

Josh:
how it works and why so many people love using OpenRouter?

Alex:
OpenRouter is an aggregator and marketplace for large language models. You can kind of think of it as a Stripe meets Cloudflare for LLMs. It's a single pane of glass: you can orchestrate, discover, and optimize all of your intelligence needs in one place. One billing provider gets you all the models; there are 470-plus now. All the models sort of implement features, but they do it differently. And there are also a lot of "intelligence brownouts," as Andrej Karpathy calls them, where models just go down all the time, even the top models, like Anthropic and Gemini and OpenAI. So, developers need a lot of choice. CTOs need a lot of reliability. CFOs need predictable costs. CISOs need complex policy controls. All of these are inputs to what we do, which is build a single pane of glass that makes models more reliable, lowers costs, gives you more choice, and helps you choose between all the options for where to source your intelligence.
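
The "one billing provider gets you all the models" idea boils down to a single OpenAI-style endpoint where only the model slug changes. A rough sketch follows; the endpoint shape matches OpenRouter's published OpenAI-compatible API, but the model slugs are examples (check the site for current names), and nothing is actually sent here:

```python
import json

def chat_request(api_key, model, prompt):
    """Build the URL, headers, and JSON body for a chat completion.

    Sending the request (and handling the response) is deliberately
    left out; this only shows that swapping models is a slug change.
    """
    url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

# Different upstream labs, same auth, same billing, same payload shape.
req_a = chat_request("sk-placeholder", "openai/gpt-4.1", "Hello")
req_b = chat_request("sk-placeholder", "anthropic/claude-sonnet-4", "Hello")
```

The design point Alex is making lives in that last pair of lines: the caller never re-integrates against a new vendor's SDK to try a new model.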

Josh:
How does it work uh because i would imagine like what

Josh:
each as and i on the show we frequently talk about benchmarks right where

Josh:
a certain model is the best at coding and that infers that maybe you should

Josh:
go to that model to do all of your coding needs because it's the best at it

Josh:
but it would appear as if it's not true if you're routing through a lot of different

Josh:
providers so how do you consider which provider gets routed to when and how

Josh:
to get the best result for what you're asking

Alex:
So we've taken a different approach so

Alex:
far which is instead of like focusing on

Alex:
a production router that picks

Alex:
the model for you um we try

Alex:
to help you choose the model so we

Alex:
we build lots we create lots of analytics both on

Alex:
your account and uh and on our

Alex:
rankings page to help you browse and discover the models that

Alex:
like the power users are really using successfully on

Alex:
a certain type of workload um because we

Alex:
think like developers today primarily want to

Alex:
choose the model themselves um switching between all

Alex:
families can result in like a lot like very

Alex:
unpredictable behavior but once you've

Alex:
chosen your model um we try to

Alex:
help developers not need to think about the provider there are

Alex:
like sometimes dozens of

Alex:
providers for a given model uh all kinds

Alex:
of companies including the hyperscalers like aws google vertex and azure um

Alex:
and uh like scaling startups like together fireworks deep infra um and a long

Alex:
tail of providers that provide,

Alex:
like very unique features,

Alex:
very like exceptional performance.

Alex:
There's all kinds of differentiators for them.

Alex:
So what we do is we collect them all in one place. And if you want a feature,

Alex:
you just get the providers that support it.

Alex:
If you want performance, you get prioritized to the providers that have high performance.

Alex:
If you really are cost sensitive, you get prioritized to the providers that

Alex:
are really low cost today. and we basically create all these lanes. There's.

Alex:
There are innumerable ways you could get routed, but you're in full control of the overall user experience that you're aiming for. That's what we found was missing from the whole ecosystem: just a way of doing that. And, you know, we get on average a five to ten percent uptime boost over going to providers directly, just by load balancing and sending you to the top provider that's up and able to handle your request. We really focus hard on efficiency and performance. We only add about 20 to 25 milliseconds of latency on top of your request, and it all gets deployed very close to your servers, at the edge. Overall, we stack providers, we figure out what you can benefit from that everybody else is doing, and we give you the power of big data as a developer just accessing your model of choice.
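
The lanes Alex describes (filter providers by required features, prioritize by cost or speed, fail over down the list when one is down) can be sketched as a toy router. All provider names, prices, and the selection logic itself are invented for illustration; this is not OpenRouter's real algorithm:

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    features: set          # e.g. {"tools", "json"}
    usd_per_mtok: float    # price per million tokens
    tokens_per_sec: float  # observed throughput
    up: bool               # latest health-check result

def route(providers, required_features, prefer="cost"):
    """Return providers to try, in order.

    Providers missing a required feature, or currently down (a
    "brownout"), are skipped entirely; the rest are ordered by the
    caller's priority, cheapest first or fastest first.
    """
    eligible = [p for p in providers
                if required_features <= p.features and p.up]
    key = ((lambda p: p.usd_per_mtok) if prefer == "cost"
           else (lambda p: -p.tokens_per_sec))
    return sorted(eligible, key=key)

providers = [
    Provider("alpha",   {"tools", "json"}, 0.90,  80.0, True),
    Provider("bravo",   {"tools"},         0.30,  45.0, True),
    Provider("charlie", {"tools", "json"}, 0.50, 120.0, False),  # down
]

# A cost-sensitive caller that needs JSON mode: charlie is down and
# bravo lacks the feature, so only alpha remains in the lane.
lane = route(providers, {"json"}, prefer="cost")
# The request goes to lane[0], falling back through lane[1:] on error.
```

The uptime boost Alex quotes comes from exactly this shape of logic: the caller states a preference once, and failover happens without any per-provider code.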

Josh:
So it kind of allows you to harness the collective knowledge of everybody, right?

Josh:
You get all of the data, you have all of the queries, you know which yields

Josh:
the best result, and you're able to deliver the best product for them.

Josh:
Now, in terms of actual LLMs, EJ has actually pulled this up just before, which is a leaderboard.

Josh:
And I'm interested in how you guys think about LLMs, which are the best,

Josh:
how to benchmark them, and how you route people through them.

Josh:
Is there a specific... Do you believe that benchmarks are accurate,

Josh:
and do you reflect those in the way that you route traffic through these models?

Alex:
In general, we have taken the stance that we want to be the capitalist benchmark for models.

Alex:
What is actually happening?

Alex:
And part of this is that I really think both the law of large numbers and the

Alex:
enthusiasm of power users are really, really valuable for everybody else.

Alex:
Like when you're routing to

Alex:
um like clod in

Alex:
let's say you're routing to clod 4 and you're

Alex:
based in europe um there you

Alex:
know all of a sudden there might be like a huge variance in in throughput from

Alex:
one of the providers and you're only able to detect that if like some other

Alex:
users have discovered it before you and so we route around the provider that's

Alex:
like running kind of slow in Europe and send you,

Alex:
if your data policies allow it,

Alex:
to a much faster provider somewhere else.

Alex:
And that allows you to get faster performance. So, like, um...

Alex:
That's, like, on the provider level, how, like, numbers help.

Alex:
On the, like, model selection level, like, what you see on this rankings page

Alex:
here, power users will, like, when we put up a model, like, we put up a new

Alex:
model today from a new model lab called ZAI,

Alex:
like, the power users instantly discover it.

Alex:
We have this LLM enthusiast community that dives in and really figures out what

Alex:
a model is good for along a bunch of core use cases.

Alex:
The power users figure out which workloads are interesting, and then you can

Alex:
just see in the data what they're doing. And everybody can benefit from it.

Alex:
That's why we open up our data and share it for free on the rankings page here.

Ejaaz:
I'm seeing this one consistent unit across all these rankings,

Ejaaz:
Alex, which is tokens, right?

Ejaaz:
And Josh and I have spoken about this on the show before, but I'm wondering

Ejaaz:
how, like you've chosen this specific unit to measure how good or effective

Ejaaz:
these models are or how consumed or used they are.

Ejaaz:
Can you tell us a bit more as to why you picked this particular unit and what

Ejaaz:
that tells you as like the open router platform as to how a user is using a particular model?

Alex:
Yeah, I think dollars is a good metric too.

Alex:
The reason we chose tokens is primarily because we were seeing prices come down really quickly.

Alex:
Open Router has been around since the beginning of 2023.

Alex:
And I didn't want a model to be penalized in the rankings just because the prices

Alex:
are going down really dramatically now like there's a,

Alex:
There's a paradox called Jevons paradox, which is that when prices decrease like 10x,

Alex:
users' use of some component of infrastructure increases by more than 10x.

Alex:
And so maybe they didn't get 10x at all.

Alex:
But I thought there were some other advantages to using tokens,

Alex:
too. Tokens don't have this penalty and don't rely on Jevon's Paradox,

Alex:
which can have a lot of lag.

Alex:
They also are a little bit of a proxy for time.

Alex:
A model that is generating a lot of tokens and doing so for a while across a lot of users.

Alex:
It means that a lot of people are reading those tokens and actually doing something with them.

Alex:
And same goes for input. But if I really want to send an enormous number of

Alex:
documents and the model has a really, really, really tiny prompt pricing,

Alex:
I think that's still valuable and something that we want to see.

Alex:
We want to see that this model is processing an enormous number of documents.

Alex:
That's a use case that should show up in the rankings.

Alex:
And so we decided to go with tokens. We might like add dollars in the future,

Alex:
but I think tokens are, you know, they don't have this like Jevons Paradox lag.

Alex:
And there wasn't anything else. Like nobody was doing any kind of like overall analytics.

Alex:
We didn't see any other company even do it until Google did a few months ago

Alex:
where they started publishing the total amount of tokens processed by Gemini.

Alex:
So we'll see which use cases really need dollars.

Alex:
But tokens have been holding up pretty well.
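
The Jevons-paradox point can be made concrete with invented numbers: after a 10x price cut, a dollar-denominated ranking can dip while demand is still catching up, even though token-denominated usage is climbing:

```python
# Toy figures (all invented) for one model, before and after a price cut.
price_before = 10.0   # $ per million tokens
price_after = 1.0     # 10x cheaper
tokens_before = 100   # millions of tokens per week
tokens_after = 400    # usage up 4x so far; Jevons suggests it may
                      # eventually exceed 10x, but demand lags

dollars_before = price_before * tokens_before  # 1000.0
dollars_after = price_after * tokens_after     # 400.0

# A dollar ranking would show this model losing 60% of its "share"
# during the lag, while a token ranking shows real usage quadrupling.
```

This is why a token-based ranking doesn't penalize a model for getting cheaper, which is the property Alex says he wanted.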

Ejaaz:
Yeah, I mean, this dashboard is awesome. And I recommend anyone that's listening

Ejaaz:
to this that can't see our screen to get on OpenRouter's website and check it out.

Ejaaz:
I've been following it for the last two weeks kind of pretty rigorously, Alex.

Ejaaz:
And what I love is you can literally see...

Ejaaz:
So two weeks ago Grok 4 got released right

Ejaaz:
and Josh and I were making a ton of videos on this we were

Ejaaz:
using it with pretty much everything that we could do and

Ejaaz:
then this other model came out of China pretty much a few days after called

Ejaaz:
Kimi K2 and I was like oh yeah whatever this is just some random Chinese model

Ejaaz:
I'm not going to focus on it and then I kept seeing it in my feed and I thought

Ejaaz:
okay maybe I'll give this a go and I kind of like went straight to open rather than just

Ejaaz:
almost gauge the interest from a wider set of AI users. And I saw that it was skyrocketing, right?

Ejaaz:
And then I saw that Quen dropped their models last week.

Ejaaz:
And again, I came to Open Router and it preceded the trend, right?

Ejaaz:
People had already started using it. So I love how you describe Open Router

Ejaaz:
as this kind of like prophetic orb,

Ejaaz:
basically, where the enthusiasts and the community itself can kind of like front

Ejaaz:
run very popular trends. And I think that's a very powerful moat.

Ejaaz:
And kind of on this path, Alex, I noticed that a lot of these major model providers

Ejaaz:
see the value in this, right?

Ejaaz:
So if I'm not mistaken, OpenAI kind of like used your platform to kind of secretly

Ejaaz:
launch their Frontier model before they officially launched it, right?

Ejaaz:
Can you walk us through, you know, how that comes about and more importantly,

Ejaaz:
why they want to do that and why they chose OpenRoddy to do that?

Alex:
Uh open ai will sometimes

Alex:
give uh early access

Alex:
to their to models to some of their customers for

Alex:
testing and we asked them if they

Alex:
wanted to try a stealth model with us which we had never done before um it involved

Alex:
like launching it as under another name and seeing how users respond to it without

Alex:
having any bias or sort of inclination for against the model at the onset.

Alex:
And it would be like a new way of testing it and a new way of...

Alex:
It was like an experiment for both us and them.

Alex:
And they generously decided to take the leap of faith and try it. And we...

Alex:
Launched gpt 4.1 with

Alex:
them at and we called it quasar alpha and

Alex:
it was a million uh

Alex:
token context length model opening us first very

Alex:
very long context model and it was also optimized

Alex:
for coding and the incredible

Alex:
there were a couple incredible things that happened first

Alex:
we have this community uh of benchmarkers

Alex:
that run open source benchmarks and we give

Alex:
a lot of them grants to help fund the benchmarks

Alex:
grants of open router tokens they'll just run the

Alex:
suite of tests against all the models and some of them are very creative like

Alex:
there's one that tests uh like the ability to generate fiction there's one that

Alex:
tests um like how like whether it can make a 3d object project in Minecraft called MCBench.

Alex:
There are a few that test different types of coding proficiency.

Alex:
There's one that just focuses on how good it is at Ruby, because Ruby is,

Alex:
turns out a lot of the models are not great at Ruby.

Alex:
There are a lot of like languages that all the models are pretty bad at.

Alex:
And so we have this like long tail of very niche benchmarks,

Alex:
And all the benchmarkers ran, you know, for free their benchmarks on Quasar

Alex:
Alpha and found pretty incredible results for most of them.

Alex:
And so the model got like, you know, OpenAI got this feedback in real time.

Alex:
We kind of like helped them find it.

Alex:
And they made another snapshot, which we launched as Optimus Alpha.

Alex:
And they could compare the feedback that they got from the two snapshots.

Alex:
And then, two weeks later, they launched GPT-4.1 live for everybody. So it was an experiment for us too.

Alex:
We've done it again since, with another model provider that's still working on it. It's a cool way of crowdsourcing benchmarks you wouldn't have expected and getting unbiased community sentiment.

Josh:
That's great. So now when we see a new model pop up and we want to test GPT-5,

Josh:
we know where to come to try it early.

Josh:
We'll see, because rumor is it's coming soon. So we're on your watch list.

Josh:
But having, I do want to ask you about open source versus closed source because

Josh:
this has been an important thing for us. We talk about this a lot.

Josh:
You have a ton of data on this.

Josh:
I'm looking at the leaderboards, and there are open source models that are doing very well alongside closed source ones.

Josh:
What are your takes in general? How do you feel about open source versus closed

Josh:
source models, particularly around how you serve them to users?

Alex:
Both types of models have supply problems, but the supply problems are very different.

Alex:
Typically, what we see with closed source models is that there are very few suppliers, usually just one or two.

Alex:
Like with Grok, for example, there's Grok Direct and there's Azure.

Alex:
With Anthropic, there's Anthropic direct, there's Google Vertex, there's AWS Bedrock. We also deploy in different regions: we have an EU deployment for customers who only want their data to stay in the EU.

Alex:
And we do custom deployments for the closed source models too, to guarantee good throughput and high rate limits for people.

Alex:
The tricky part is the demand. The closed source models are doing most of the tokens on OpenRouter; it's dominant, probably 70 to 80 percent closed source tokens today.

Alex:
But the open source models have a much more fragmented supply, like a sell-side order book. The rate limits for each provider are less stable on average, and it usually takes a while for the hyperscalers to serve a new open source model. So the load balancing work that we do on open source models tends to be a lot more valuable.

Alex:
The load balancing work that we do for closed source models tends to be very focused on caching and feature awareness: making sure you're getting clean cache hits and only transitioning over to new providers when your cache is expired.
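The cache-aware load balancing Alex describes can be sketched roughly like this. The `Provider` fields, expiry logic, and provider names are hypothetical illustrations, not OpenRouter's actual routing code:

```python
import time
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    supports_caching: bool
    cache_expiry_s: float  # how long a prompt-prefix cache stays warm

def pick_provider(providers, last_used, last_hit_ts, now=None):
    """Prefer the provider holding a warm cache for this prompt prefix;
    only switch to another provider once that cache has expired."""
    now = time.time() if now is None else now
    if last_used is not None and last_used.supports_caching:
        if now - last_hit_ts < last_used.cache_expiry_s:
            return last_used          # clean cache hit: stay put
    return providers[0]               # cache expired or absent: free to switch

# usage with two stand-in providers
azure = Provider("azure", supports_caching=True, cache_expiry_s=300)
direct = Provider("direct", supports_caching=True, cache_expiry_s=300)
assert pick_provider([direct, azure], azure, last_hit_ts=0, now=100) is azure
assert pick_provider([direct, azure], azure, last_hit_ts=0, now=600) is direct
```

The design point is the one from the conversation: switching providers is cheap once the cache is cold, but throwing away a warm cache costs real money on long prompts.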

Alex:
For open source models, there's way less caching. Very, very few open source

Alex:
models implement caching.

Alex:
And so switching between providers becomes more common.

Alex:
We also track a lot of quality differences between the open source providers. Some of them will deploy at lower quantization levels, which is kind of a way of compressing the model. Generally that doesn't have an impact on the quality of the output, and yet we still see some odd things from some of the open source providers.

Alex:
So we run tests internally to detect those outputs, and we're building up a lot more muscle here, so that problem providers get pulled out of the routing lane and don't affect anyone.
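The internal quality tests Alex mentions can be thought of as known-answer probes that pull a misbehaving provider out of rotation. Everything here, the probe set, the `healthy` check, and the stubbed providers, is a hypothetical sketch, not OpenRouter's implementation:

```python
# Run fixed prompts with known-good reference answers against each provider
# and drop any provider whose outputs drift (e.g. from aggressive quantization).
PROBES = [("What is 2 + 2?", "4")]

def healthy(ask, probes=PROBES):
    """`ask` sends one prompt to a single provider and returns its reply."""
    return all(expected in ask(prompt) for prompt, expected in probes)

def routing_lane(providers, ask_fn):
    """Keep only the providers that pass every probe."""
    return [p for p in providers if healthy(lambda q: ask_fn(p, q))]

# usage with stubbed providers: one answers correctly, one drifts
replies = {"good": "The answer is 4.", "quantized": "The answer is 5."}
lane = routing_lane(["good", "quantized"], lambda p, q: replies[p])
assert lane == ["good"]
```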

Josh:
So closed source accounts for 80% or something like that, a very large amount.

Josh:
Do you see that changing?

Josh:
Because in that post we just had, nine out of the 10 fastest growing LLMs last week were open source.

Josh:
And every time it seems like China comes out with another model, like Kimi K2 a week or two ago, it really pushes the frontier of open source forward.

Josh:
And the rate of acceleration of open source seems to be as fast,

Josh:
if not faster than closed source, where it's just, it's making these improvements very quickly.

Josh:
It has the benefit of being able to compound in speed because it's open source

Josh:
and everyone can contribute.

Josh:
Do you think that starts to change where the percentage of tokens you're issuing

Josh:
are from open source models versus closed source?

Josh:
Or do you continue to see a trend where it's going to be Google,

Josh:
it's going to be OpenAI that are serving a majority of these tokens to users?

Alex:
In the short term, we're likely to see open source models continue to dominate

Alex:
the fastest growing model category on OpenRouter.

Alex:
And the reason is that a lot of users come for a closed source model but then decide they want to optimize later: either they want to save on costs, or they want to try a new model that's supposed to be a little better in some direction their app or use case cares about.

Alex:
Then they leave the closed source model and go to an open source model.

Alex:
So open source tends to be a last-mile optimization thing. I'm making a big generalization, because the reverse can happen too.

Alex:
And so because it's a last mile optimization thing,

Alex:
the jump from "this model is not being used at all" to "this model is really being used by a couple of people who have left Claude 4 and want to try some new coding use case" will be bigger

Alex:
than for the closed source models, which start at a really high base and don't have growth quite as dramatic.

Alex:
So the other part of your question, though, was whether there's going to be a flippening of...

Josh:
Closed source, or some sort of chipping away at that monopoly of closed source tokens.

Alex:
It's hard to predict these things. I think the biggest problem today with open source models is that the incentives are not as strong for the model lab and the model provider.

Alex:
They have established incentives for how to grow as a company and attract high-quality AI talent, and giving the model weights away impairs those incentives.

Alex:
This is where we might see decentralized providers helping in the future.

Alex:
A really good incentive scheme that allows high-quality talent to work on a model that remains open weights, at least, could fix this.

Alex:
I try to stay close to the decentralized providers and learn a lot from them. On the provider side, on running inference, I think there are some really cool incentive schemes being worked on. But on actually developing the models themselves, I haven't seen too much, unfortunately.

Alex:
So if we see one, a flippening is on the radar. Until we do, I personally doubt it.

Josh:
TBD, do you have personal takes on how you feel about open source versus closed source?

Josh:
Because this has been a huge topic we've been debating too. It's just the ethical

Josh:
concerns around alignment and closed source models versus open source.

Josh:
When you look at the competitors, China, generally speaking,

Josh:
is associated with open source, whereas the United States is generally associated with closed source.

Josh:
And we saw Llama and Meta release the open source models, but now they're raising

Josh:
a ton of money to pay a lot of employees a lot of money to probably develop a closed source model.

Josh:
So it seems like the trends are kind of split between US and China.

Josh:
And I'm curious if you have any personal takes, even outside of OpenRouter,

Josh:
of which you think serves better for the long term outlook on,

Josh:
I mean, the position of the United States or just the general safety and alignment

Josh:
conversation around AI?

Alex:
I mean, like a very simple fundamental difference between the two is that an

Alex:
innovation in open source models can be copied more quickly than an innovation

Alex:
in closed source models.

Alex:
So in terms of velocity and like how far ahead one is over the other,

Alex:
that is like a massive structural difference.

Alex:
That means that closed source models should be theoretically always ahead until

Alex:
a really interesting incentive scheme develops, like I mentioned before.

Alex:
And I don't see evidence that that's going to change. In terms of China versus the U.S.,

Alex:
I think it's very interesting that China has not had a major closed source model.

Alex:
And I don't really see a great reason why; I'm not aware of any reason that won't be the case in the future. My prediction is that there's going to be a closed source model from China.

Alex:
It's possible that DeepSeek and Moonshot and Qwen have built up really sticky talent pools.

Alex:
But generally with talent pools, after enough years have passed,

Alex:
people quit and go and create new companies and build new talent pools.

Alex:
And so we should see some of that. It's not the case that the AI space has the NDAs or non-competes that the hedge fund space has.

Alex:
That might happen in the future too. But assuming that the current non-compete

Alex:
culture continues, there should be more companies that pop up in China over time.

Alex:
And I'm betting that some of them will be closed source.

Alex:
And my guess is that the two nations will start to look more similar.

Ejaaz:
Yeah, I guess that's why you have Zuck dishing out $300 million to billion-dollar salary offers to a bunch of these guys, right?

Ejaaz:
One more question on China versus the US. I kind of agree with you.

Ejaaz:
I didn't really expect China to be the one to lead open source anything,

Ejaaz:
let alone the most important technology of our time.

Ejaaz:
What do you think is their secret sauce for building these models, Alex?

Ejaaz:
And I know this might be outside OpenRouter's forte specifically, but as someone who has studied this technology for a while now, I'm struggling to figure out what advantage they had.

Ejaaz:
They're discovering all these new techniques, and maybe the simple answer is constraints, right? They don't have access to all of Nvidia's chips or to infinite compute, so maybe they're forced to figure out other ways around the same kinds of problems that Western companies are focused on.

Ejaaz:
But it's pretty clear that America, with all its funding, hasn't been able to make these frontier breakthroughs.

Ejaaz:
So I'm curious whether you're aware of some kind of technical moat that Chinese AI researchers, or these AI teams featuring on OpenRouter day in and day out, have over the U.S.

Alex:
Well, I don't know.

Alex:
There are certainly some that they've come up with that like DeepSeek had a

Alex:
lot of very cool inference innovations that they published in their paper.

Alex:
But a lot of what they published in the original R1 paper were things that OpenAI

Alex:
had done independently themselves many months before.

Alex:
On the inference side, and on some of the model side: with DeepSeek, we had talked to their team for years before R1 came out. They had many models before that, and they were always a pretty sharp team at optimizing inference.

Alex:
They came up with the best user experience for caching prompts long before DeepSeek R1 came out, and they had very good pricing. They were by far the strongest Chinese team that we were aware of, well before that happened. So I'm guessing there was some talent

Alex:
accumulation happening in China, for people who wanted to stay in China. And that's a huge advantage; American companies are obviously not doing that.

Alex:
Ejaaz is very on point that a lot of this is just based on talent.

Alex:
A lot of AI is open and out there, and very composable, like a big tree of knowledge. A paper comes out and cites 20 other papers; you can go and read all of the cited papers and then you have kind of a basis for understanding it. But you really have to go one level deeper and read all the cited papers two levels down to really understand what's going on.

Alex:
The thing is, very few people can do that, and it takes a lot of years of experience to actually apply that knowledge and learn all the things that have not been written in any paper at all.

Alex:
There's just such a small number of people who can really lead research on all the different dimensions that go into making a model.

Alex:
And the border between China and the U.S. is pretty defined: you have to leave China, move to the U.S., and really establish yourself here. So I do think there's country arbitrage; there's the hedge fund background arbitrage; there's hardware arbitrage. There's a ton of hardware that's only available in China but not here, and vice versa. That creates an opportunity, and this will just continue to happen.

Ejaaz:
Yeah, I think this arbitrage is fascinating.

Ejaaz:
I read somewhere that there are probably fewer than 200 or 250 researchers in the world who are qualified to work at some of these frontier AI model labs.

Ejaaz:
And I looked into some of the backgrounds of the team behind Kimi K2,

Ejaaz:
which is this recent open source model out of China, which broke all these crazy rankings.

Ejaaz:
I think it was like a trillion parameter model or something crazy like that.

Ejaaz:
And a lot of them worked at some of the top American tech companies.

Ejaaz:
And they all graduated from this one university in China.

Ejaaz:
I think it's Tsinghua, which apparently is like, you know, the Harvard of AI

Ejaaz:
in China, right? So pretty crazy.

Ejaaz:
But Alex, I wanted to shift the focus of the conversation to a point that you

Ejaaz:
brought up earlier in this episode, which is around data.

Ejaaz:
Okay, so here's the context that like Josh and I have spoken about this at length, right?

Ejaaz:
We are obsessed with this feature on OpenAI, which is memory, right?

Ejaaz:
And I know a lot of the other AI models have memory as well.

Ejaaz:
But the reason why we love it so much is I feel like the model knows me, Alex.

Ejaaz:
I feel like it knows everything about me. It can personally curate any of my prompts.

Ejaaz:
It just gets me. It knows what I want, serves it up to me on a platter, and off I go, doing my thing.

Ejaaz:
Now, OpenRouter sits on top of kind of the query layer, right? So you have all these people writing all these weird and wonderful prompts, and you route them through to different AI models.

Ejaaz:
You hold all of that data or maybe you have access to all of that data.

Ejaaz:
And I know you have something called private chat as well, where you don't have access to it.

Ejaaz:
Talk to me about like what OpenRouter and what you guys are thinking about doing

Ejaaz:
with this data, because presumably,

Ejaaz:
or in my opinion, you guys actually have the best moat, arguably better than

Ejaaz:
ChatGPT, because you have all these different types of prompts coming from all

Ejaaz:
these different types of users for all these different types of models.

Ejaaz:
So theoretically, you could spin up some of the most personal AI models for

Ejaaz:
each individual user if you wanted to.

Ejaaz:
Do I have that correct? Or am I, you know, speaking crazy?

Alex:
No, that's true. It's something we're thinking about.

Alex:
By default, your prompts are not logged at all.

Alex:
We don't have prompts or completions for new users by default.

Alex:
You have to toggle it on in settings.

Alex:
But a lot of people do toggle it on. And as a result,

Alex:
I think we have by far the largest multi-model prompt data set.

Alex:
But to date, we've barely done anything with it. We classify a tiny subset of it, and that's what you see on the rankings page.

Alex:
What could be done at a per-account level is really three main things.

Alex:
One: memory right out of the box. You can get this today by combining OpenRouter with a memory-as-a-service; we've got a couple of companies that do this, like Mem0 and SuperMemory.

Alex:
And we can partner with one of those companies or do something similar and just

Alex:
provide a lot of distribution.

Alex:
And that basically gets you ChatGPT-as-a-service, where it feels like the model really knows you and the right context gets added to your prompt.

Alex:
The second thing we can do is help you select the right model more intelligently. There are a lot of models where there's a super clear migration decision that needs to be made.

Alex:
And we can see this very clearly in the data. Right now, when we have some kind of communication channel open with the customer, we can just tell them: hey, we know you're using this model a ton.

Alex:
It's been deprecated, and this other model is significantly better; you should move this kind of workload over to it. Or: you'll get way better pricing on this workload if you switch.

Alex:
That's basically the only sort of guidance and opinionated routing we've done so far, and it could be a lot more intelligent, a lot more out of the box, a lot more built into the product.

Alex:
The last thing we can do, and there are probably tons of things we're not even thinking about, is getting really smart about how models and providers are responding to prompts, and showing you that data:

Alex:
telling you what kinds of prompts are going to which models, how those models are replying, and characterizing the replies in all kinds of interesting ways.

Alex:
Did the model refuse to answer? What's the refusal rate? Did the model successfully make a tool call, or did it decide to ignore all the tools that you passed in? That's a huge one.

Alex:
Did the model pay attention to its context? Did some kind of truncation happen before you sent it to the model? There are all kinds of edge cases that cause developers' apps to just get dumber, and they're all detectable.
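The edge-case detection Alex lists (refusals, ignored tools, truncation) can be sketched as a simple reply classifier. The field names and refusal heuristics below are illustrative assumptions, not OpenRouter's API:

```python
# Classify one model reply for refusals, ignored tools, and truncation.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry, but")

def classify_reply(request: dict, reply: dict) -> dict:
    text = (reply.get("content") or "").lower()
    return {
        # crude string-match refusal heuristic
        "refused": any(m in text for m in REFUSAL_MARKERS),
        # tools were offered but the model answered with plain text instead
        "ignored_tools": bool(request.get("tools")) and not reply.get("tool_calls"),
        # provider cut generation short on length rather than finishing
        "truncated": reply.get("finish_reason") == "length",
    }

# usage with a stubbed request/reply pair
flags = classify_reply(
    {"tools": [{"name": "search"}]},
    {"content": "I can't help with that.", "finish_reason": "stop"},
)
assert flags == {"refused": True, "ignored_tools": True, "truncated": False}
```

Aggregating these flags per model is exactly what a refusal rate or tool-call success rate would be built from.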

Ejaaz:
I'm so happy you said that because I have this kind of like hot take,

Ejaaz:
but maybe not so hot take, which is I actually think all the Frontier models

Ejaaz:
right now are good enough to do the craziest stuff ever for each user.

Ejaaz:
But we just haven't been able to unlock it because it just doesn't have the context.

Ejaaz:
Sure, you can attach it to a bunch of different tools and stuff,

Ejaaz:
but if it doesn't know when to use a tool or how to process a certain prompt, or if users don't know how to read the output of the AI model, then, like you just said, we need some kind of analytics into all of this.

Ejaaz:
Otherwise we're just walking around like headless chickens.

Ejaaz:
So I'm really happy that you said that. One other thing that I wanted to get

Ejaaz:
your take on on the data side of things is, I just think this whole concept

Ejaaz:
or notion of AI agents is becoming such a big trend, Alex.

Ejaaz:
And I noticed a lot of Frontier Model Labs release new models that kind of spin

Ejaaz:
up several instances of their AI model.

Ejaaz:
And they're tasked with a specific role, right?

Ejaaz:
Okay, you're going to do the research. You're going to do the orchestrating.

Ejaaz:
You're going to look online via a browser, blah, blah, blah,

Ejaaz:
blah, blah. And then they coalesce together at the end of that little search

Ejaaz:
and refine their answer and then present it to someone, right?

Ejaaz:
You know, Grok 4 does this, Claude does this, and a few other models.

Ejaaz:
I feel like with this data that you're describing, OpenRouter could be or could

Ejaaz:
offer that as a feature, right?

Ejaaz:
Which is essentially, you can now have super intuitive, context-rich agents

Ejaaz:
that can do a lot more than just talk to you or answer your prompts.

Ejaaz:
But they could probably do a bunch of other actions for you.

Ejaaz:
Is that a fair take, or is that something that maybe might be out of the realm of open router?

Alex:
Our strategy is to be the best inference layer for agents.

Alex:
And what I think developers want is control over how their agents work.

Alex:
And our developers at least want to use us as a single pane of glass for doing

Alex:
inference, but they want to see and control the way an agent looks.

Alex:
An agent is basically just something

Alex:
that is doing inference in a loop and controlling the direction it goes.
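That definition, inference in a loop that controls its own direction, can be sketched in a few lines. `infer`, the message format, and the stubbed model script are hypothetical stand-ins for any chat-completion call:

```python
def run_agent(infer, tools, task, max_steps=10):
    """Minimal agent: call the model, execute any tool it asks for,
    feed the result back, and stop when the model answers directly."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = infer(history)                        # one round of inference
        if step.get("tool") is None:                 # model chose to finish
            return step["content"]
        result = tools[step["tool"]](step["args"])   # model chose a tool
        history.append({"role": "tool", "content": result})
    return None  # gave up after max_steps

# usage with a stubbed model that calls a tool once, then answers
script = iter([
    {"tool": "add", "args": (2, 3)},
    {"tool": None, "content": "The sum is 5."},
])
answer = run_agent(lambda h: next(script), {"add": lambda a: str(sum(a))}, "2+3?")
assert answer == "The sum is 5."
```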

Alex:
So what we want to do is build incredible docs and really good primitives that make that easy to do.

Alex:
A lot of our developers are just people building agents, and what they want is for the primitives to be solved, so they can keep creating new versions and new ideas without worrying about re-implementing tool calling over and over again.

Alex:
It's a tough problem given how many models there are; there's a new model or provider every day, and people actually want them and use them. So standardizing this, making these tools really dependable, is where we want to focus, so that agent developers don't have to worry about it.

Josh:
As we level up closer and closer to AGI and beyond, I'm curious what OpenRouter's endgame is.

Josh:
If you have one, what is the master plan where you hope to end up?

Josh:
Because the assumption is as these systems get more intelligent,

Josh:
as they're able to kind of make their own decisions and choose their own tool

Josh:
sets, what role does Open Router play in continuing to route that data through?

Josh:
Do you have a kind of master plan, a grand vision of where you see this all heading to?

Alex:
You're saying: as agents get better at choosing the tools that they use, what becomes our role when the agents are really good at that?

Josh:
Yes. And where do you see OpenRouter fitting into the picture? What would be the best-case scenario for this future of OpenRouter?

Alex:
Right now OpenRouter is a bring-your-own-tool platform; we don't have a marketplace of MCPs yet.

Alex:
And I do think most of the most-used tools will be ones that developers configure themselves, and agents just work with whatever they're given access to. A holy grail for OpenRouter is this:

Alex:
My prediction for how the ecosystem is going to evolve is that all the models are going to add state and other kinds of stickiness that make you want to stick with them.

Alex:
They're going to add server-side tool calls, stateful web search, memory. They're going to add all kinds of things that try to prevent developers from leaving and increase lock-in.

Alex:
And OpenRouter is doing the opposite.

Alex:
We want developers to not feel vendor lock-in.

Alex:
We want them to feel like they have choice and they can use the best intelligence,

Alex:
even if they didn't before.

Alex:
It's never too late to switch to a more intelligent model. That would be a good always-on outcome for us.

Alex:
And so what I think we'll end up doing is partnering with other companies, or building the tools ourselves if we have to, so that developers don't feel stuck.

Alex:
There are a lot of ways the ecosystem could evolve, but that's how I would put it in a nutshell.

Josh:
Okay, now there's another personal question that I was really curious about,

Josh:
because I was also right there with you in the crypto cycle when NFTs got absolutely

Josh:
huge, was a big user of OpenSea.

Josh:
And it was kind of this trend that went up and then went down.

Josh:
And NFTs kind of fizzled out; they weren't as hot anymore, and AI took the wind out of their sails.

Josh:
And it's a completely separate audience, but a similar thing where now it's

Josh:
the hottest thing in the world.

Josh:
And I'm curious how you see the trend continuing. Is this a cyclical thing that has ups and downs, or is it a one-way trajectory of more tokens every day, more AI every day? Do you see it being cyclical, or a one-way trend up and to the right?

Alex:
NFTs kind of follow crypto in an indirect way. When crypto has ups and downs, NFTs generally lag a bit, but they have similar ups and downs.

Alex:
And crypto is an extremely long-term play on building a new financial system, and there are so many reasons it's not going to happen overnight; they're very entrenched reasons.

Alex:
Whereas AI, there are some overnight business transformations going on.

Alex:
One of the reasons AI moves a lot faster, I think, is that it's just about making computers behave more like humans.

Alex:
So if a company already works with a bunch of humans, then there's,

Alex:
you know, there's some engineering that needs to be done.

Alex:
There's some thinking about how to scale it. But in general, after seeing what's possible, inference will be the fastest growing operating expense for all companies.

Alex:
It'll be like: oh, we can just hire high-performing employees at the click of a button, and they work 24/7 and scale elastically.

Alex:
It's not that hard. It's not a huge mental-model shift; it's just a huge upgrade to the way companies work today, in most cases. So it's just completely different from crypto, or NFTs. Other than both being new, they're fundamentally very different changes.

Ejaaz:
You're probably one of very few people in the world right now that has crazy

Ejaaz:
insights into every single AI model.

Ejaaz:
Definitely more than the average user, right? Like I have like three or four

Ejaaz:
subscriptions right now and I think I'm a hotshot.

Ejaaz:
You get access to, what is it, 457 models right now on OpenRouter.

Ejaaz:
So an obvious question that I have for you is

Ejaaz:
I'm not going to say in the next couple of years, because everything moves way

Ejaaz:
too quickly in this sector.

Ejaaz:
But over the next six months, is there anything really obvious to you that should

Ejaaz:
be focused on within the AI sector?

Ejaaz:
Maybe it's like the way that certain models should be designed,

Ejaaz:
or perhaps it's at the application layer that no one's talking about right now.

Ejaaz:
Because going off the earlier part of our conversation, you pick these trends out really early, and I'm wondering if you see anything.

Ejaaz:
It doesn't have to be OpenRouter related. It could just be AI related.

Alex:
I've seen the models trending towards caring more about how resourceful they

Alex:
are than what knowledge they have in the bank.

Alex:
I don't know how many of the model labs really deeply believe that, but a couple of them talk about it, and I don't think it's really hit the application space yet. People will ask ChatGPT things, and if the knowledge is wrong, they think the model is stupid.

Alex:
That's just a bad way of evaluating a model. Whatever knowledge a person happens to have memorized is not a proxy for how smart they are.

Alex:
The intelligence and usefulness of a model is going to trend toward how good it is at using tools and how good it is at paying attention to a long context: its total memory capacity and accuracy. So I think those two things need to be emphasized more.

Alex:
It might be that models pull all of their knowledge from online databases, from real-time scraped indices of the web, along with a ton of real-time-updating data sources,

Alex:
always relying on some sort of database for knowledge but on their own reasoning process for tool calling. We spend probably the plurality of our time every week on tool calling and figuring out how to make it work really well.
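A tool-calling round trip in the common OpenAI-style schema (the style OpenRouter's API also exposes) boils down to declaring a JSON schema per tool and dispatching whatever call the model emits. The `get_weather` tool and the stubbed model output here are hypothetical:

```python
import json

# One tool declared as a JSON schema the model can choose to call.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call, registry):
    """Run the function the model asked for with its JSON-encoded arguments."""
    args = json.loads(tool_call["function"]["arguments"])
    return registry[tool_call["function"]["name"]](**args)

# stubbed model output asking to call the tool
call = {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}}
result = dispatch(call, {"get_weather": lambda city: f"12°C in {city}"})
assert result == "12°C in Oslo"
```

Much of the hard standardization work Alex alludes to is that each model emits this shape slightly differently, and a router has to normalize it.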

Alex:
Humans, the big difference between us and animals is that we're tool users and tool builders.

Alex:
And that's where human acceleration and innovation has happened.

Alex:
So how do we get models creating tools and using tools very effectively?

Alex:
There are very few benchmarks, and there's very little priority.

Alex:
There's Tau-Bench for measuring how good a model is at tool calling, and maybe a few others. There's SWE-bench for measuring how good a model is at multi-turn programming tasks.

Alex:
It's very, very hard to run, though. For Sonnet, it could cost around $1,000 to run.

Alex:
And the user experience for evaluating the real intelligence of these models is not good.

Alex:
As much as we don't have benchmarks listed on OpenRouter today, I love benchmarks, and I think the app ecosystem and developer ecosystem should spend a lot more time making very cool and interesting ones.

Alex:
Also, we will give credit grants for all the best ones, so I highly encourage it.

Ejaaz:
Well, Alex, thank you for your time today. I think we're coming up on a close

Ejaaz:
now. That was a fascinating conversation, man.

Ejaaz:
And I think your entire journey from just non-AI stuff, so OpenSea all the way

Ejaaz:
to OpenRouter has just been a great indicator of where these technologies are

Ejaaz:
progressing and more importantly, where we're going to end up.

Ejaaz:
I'm incredibly excited to see where OpenRouter goes beyond just prompt routing.

Ejaaz:
I think some of the stuff you spoke about on the data side of things is going

Ejaaz:
to be fascinating and arguably one of your bigger features. So I'm excited for future releases.

Ejaaz:
And as Josh said earlier, if GPT-5 is releasing through your platform first,

Ejaaz:
please give us some credits. We would love to use it.

Ejaaz:
But for the listeners of this show, as you know, we're trying to bring on the

Ejaaz:
most interesting people to chat about AI and Frontier Tech. We hope you enjoyed this episode.

Ejaaz:
And as always, please like, subscribe, and share it with any of your friends

Ejaaz:
who would find this interesting. And we'll see you on the next one. Thanks, folks.
