OpenRouter: The Only AI Tool You'll Ever Need | Founder Alex Atallah
Ejaaz:
What if I told you there was a single website where you could chat with any major AI model from one interface? It's kind of like ChatGPT, but instead every prompt gets routed to the exact AI model that will do the best job for whatever your prompt might be. Well, on today's episode we're joined by Alex Atallah, the founder and CEO of OpenRouter. It's the fastest-growing AI model marketplace, with access to over 400 LLMs, making it the only place that really knows how people use AI models and, more importantly, how they might use them in the future. It sits at the intersection of every single prompt that anyone writes and every model that they might ever use. Alex Atallah, welcome to the show. How are you, man?
Alex:
Thanks, guys. Great. Thanks so much for having me on.
Ejaaz:
So it is a Monday. How does the founder of OpenRouter spend his weekend? Presumably, you know, out and about, chilling, relaxing, not at all focused on the company?
Alex:
I love weekends with no meetings planned. I just go to a coffee shop and have tons of hours stacked in a row to do things that require a lot of momentum buildup. So I did that at coffee shops on Saturday and Sunday, and then I watched Blade Runner again.
Ejaaz:
Again? Okay. Well, when we were preparing for this episode, Alex, I couldn't help but think that you've had a pretty insane decade of startup foundership, right? OpenRouter is kind of like your second major thing, but prior to that you were the co-founder and CTO of OpenSea, the biggest NFT marketplace out there. And now you're focused on one of the biggest AI companies out there. So it sounds like you've been at the pivot point of two of the most important technology sectors of the last decade. Can you give us a bit of background as to how you ended up here? And more importantly, where you started. Walk us through the journey of OpenSea and how you ended up at OpenRouter.
Alex:
Yeah, so I co-founded OpenSea with Devin Finzer at the very beginning of 2018, very end of 2017. It was the first NFT marketplace. And it was not dissimilar to OpenRouter, in that there was a really fragmented ecosystem of NFT metadata and media that gets attached to these tokens. It was the first example of something in crypto that could be non-fungible, meaning it's a single thing that can be traded from person to person. Most things in the world are non-fungible. A chair is non-fungible; a currency is fungible. Back in 2018, no one was really thinking about crypto in terms of non-fungible goods, and the problem with non-fungible goods was that there weren't any real standards set up. There were a lot of heterogeneous implementations for how to get a non-fungible item represented and tradable in a decentralized way. So OpenSea organized this very heterogeneous inventory and put it together in one place. We came up with a metadata standard. We did a lot of work to really make the experience super good for each collection. And you see a lot of similarities with how AI works today, where there's also a very heterogeneous ecosystem of different APIs and different features supported by language model providers. And OpenRouter similarly does a lot of work to organize it all.

I was at OpenSea until 2022, when I was feeling the itch to do something new. I left in August, and then ChatGPT came out a few months later. My biggest question around that time was whether it was going to be a winner-take-all market, because OpenAI was very far ahead of everybody else. You know, we had Cohere Command, we had a couple of open-source models, but OpenAI's was the only really usable one. I was doing little projects to experiment with the GPT-3 API. And then Llama came out in January. Really exciting: it was about a tenth the size and won on a couple of benchmarks, but it wasn't really chattable yet. And it wasn't until a few months later that a team at Stanford distilled it into a new model called Alpaca. Distillation means you take the model and customize it, or fine-tune it, on a set of synthetic data, which they made using ChatGPT as a research project. That was the first successful major distillation that I'm aware of, and it was an actually usable model. I was on the airplane talking to it, and I was like, wow. If it only took six hundred dollars to make something like this, then you don't need ten million dollars to make a model. There might be tens of thousands, hundreds of thousands of models in the future. And suddenly this started to look like a new economic primitive, a new building block, that kind of deserved its own place on the internet. And there wasn't one. There wasn't a place where you could discover new language models and see who uses them and why. And that's how OpenRouter got started.
Josh:
That's amazing. So one of the things that we're obsessed with on this channel in particular is exploring frontiers: how to properly see these frontiers, analyze them, and understand when they're going to happen. And when I was going through your history, you have this talent consistently over time. Even as far back as early on, I read you were hacking Wi-Fi routers in a hackathon. You were very early to that. You were early to NFTs. You were early to understanding AI and the impact it would have. And what I'd love for you to explain is the thought process and the indicators you look for when exploring these new frontiers, because clearly there's some sort of pattern matching going on. Clearly you have some sort of awareness of what will be important and why, and then you insert yourself into that narrative. So are there patterns? Are there certain things you look for when searching for these new opportunities that led you to make these decisions?
Alex:
I think there's a lot to be said for finding enthusiast communities and seeing if you're going to join them. Like, can you be an enthusiast with them? Whenever something new comes out that has some kind of ecosystem potential, there are going to be enthusiast communities that pop up. And the Internet has made it self-serve: you can just join the communities.

Discord, I think, is an incredible and super underrated platform, because the communities feel kind of private. You don't feel like you're watching somebody trying to advertise something for SEO juice; there's no SEO juice in Discord. It's just people talking about what they're passionate about, and it gets really niche. And when you find an interest group in Discord that has to do with some new piece of technology that's just being developed right now and doesn't really work very well at all, you get people who are just trying to figure out what to do with it and how to make it better. I think that's the first core piece of magic that jumps to mind.

There's got to be a willingness to be weird, because if you jump into any of these communities, at face value it's stupid. Like, oh, this is just a game, or it's a really weird game, and I'm not really into collectible games, so I'm going to leave right now. Not only do you have to be aware, but you have to be creative. Like, okay, these are just cats on the blockchain, and people are just trading cats back and forth. You can't look at the community as simply that. Think about what you could do with it. What is the unlock that wasn't achievable before?

And I think there are people who are just good at this, who will join the communities and brainstorm live, and you can see everybody brainstorming in real time. Another incredible example of this was the Midjourney Discord. It became the biggest server in Discord by far. And why did that happen? It started with something weird, silly, maybe not super useful, but you could see all the enthusiasts remixing and brainstorming live how to turn it into something beautiful and how to make it useful. And then it just exploded. I think it's the most incredible niche community that Discord has ever seen, because of how useless it started and how insanely exciting it became.

I was playing around with this model called Big Sleep in 2021 that let you generate images that looked kind of like DeviantArt. They were all animated images, and none of them really made sense, but you could get some really cool stuff, like potentially something you'd want to make your desktop wallpaper. And if you were really deep in some DeviantArt communities, you could kind of appreciate it. And that was like, oh, there's a kernel of something here. It took another year or two before Midjourney started to pick up.
Ejaaz:
Where were you seeing all of this, Alex? Were you scouring just random forums, or just wherever your nose told you to go?
Alex:
Basically, there's this Twitter account, I'm trying to remember what it's called, that posts AI research papers and kind of tries to show what you can do with them. I discovered this Twitter account in like 2021. It wasn't at all related to crypto, but Big Sleep was the first thing I saw that used AI to generate things that could potentially be NFTs. So I started experimenting with how much you could direct it to make an NFT collection that would make any sense. It was very, very difficult. But that was like the first generative...
Ejaaz:
This was before you were even thinking about starting OpenRouter, right?
Alex:
Yeah, yeah, this was back when I was full-time at OpenSea. Oh, I've got it: it's AK, @_akhaliq, this Twitter account. I really recommend it. They basically post papers and explain and explore how each paper might be useful. They post animations. They make AI research kind of fun to engage with, and that was my first experience.
Ejaaz:
Okay, so that's a massive win for X, or as it was known back then, Twitter, as a platform, right? It gave birth to two of the biggest technology communities: crypto, also known as Crypto Twitter, and now apparently all the AI research stuff, which put you on the path that led you to OpenRouter.

So if I've got this right, Alex, you were full-time at OpenSea, a multi-billion-dollar company with loads of important stuff to do, but you still found the time to scour this fringe technology, because that's what AI was at the time. Prior to GPT-2 or GPT-3, no one really knew about this. And you were playing around with these generative AI models that would create this magical little substance, and maybe it came in the form of a picture or a weird little cat. And you jumped into these niche forums of enthusiasts, as you say, and explored that further. And it sounds like you honed that even beyond your journey at OpenSea, after you left.

I remember actually meeting you in this kind of abyss between you leaving OpenSea and starting OpenRouter, where you were brainstorming a bunch of these ideas. And I remember a snippet from our conversation in one of the WeWorks here, where you had whiteboarded a bunch of AI stuff. And one of those things was the whole topic of inference. And if I'm being honest with you, I had no idea what that word even meant back then. I was extremely focused on all the NFT stuff and all the crypto stuff; my background's in all of that. But I just found it fascinating that you always had your nose in some of the early communities. And I think there's a really important lesson there.

I want to pick up on something you brought up when you described your path to OpenRouter, Alex. You said you were playing around with these early AI models. So not the GPTs; this was before Claude was even created. You were playing around with these random models that you would find on forums, on Twitter, or on Reddit, right? And you would experiment with them. And I find it fascinating that back then, even when GPT became a thing, you were convinced that there would be tens of thousands, or did you say hundreds of thousands, of AI models? Back then, that wasn't a normal view. Back then, everyone was like, you need hundreds of millions of dollars. Maybe it was tens of millions of dollars back then. And it was going to be a rich man's game.
Alex:
Yeah, it was basically the Alpaca project that put me over the edge on there being many, many, many models instead of just a very small number.
Ejaaz:
And can you explain what the Alpaca project is for the audience?
Alex:
So, the Alpaca project. After Llama came out, you really could not chat with it very well; it was a text-completion model. There were a couple of benchmarks where it beat GPT-3, and it was about a tenth the size of what most people thought GPT-3 was. So it was a pretty incredible achievement, but the user experience wasn't there. The Alpaca project took ChatGPT and generated a bunch of synthetic outputs, and then they fine-tuned Llama on those synthetic outputs. This did two things to Llama: it taught it style, and it taught it knowledge. The style is how to chat, which was the big user-experience gap. And it made it smarter. Fine-tuning transfers both style and knowledge, and the content of the synthetic data was reflected in the model's performance on benchmarks after that point.

So if you can do that without revealing all the data that goes in, now there's a way you could sell data via API without just dumping it all out to the world and never being able to monetize it again. So there's a brand-new business model around data that emerges, plus the ability to work towards open intelligence: to build new architectures, test them more quickly, and fine-tune them quickly. Basically, you can build on top of the work of giants. You don't have to start from zero every time. A lot of the biggest developer-experience innovations just involve giving developers a higher stair to start walking up, so they don't have to start at the bottom of the staircase every single time. And that was the big, generous gift that Llama gave the community.

And that wasn't the only company doing open-source models. Mistral came out with Mistral 7B Instruct a few months later; it was an incredible model. Then they came out with the first open-weight mixture of experts a few months after that. It felt like actual intelligence, but completely open. And all of these provide higher and higher stairs for other developers: basically a way to crowdsource new ideas from the whole planet and let those new ideas build on top of really good foundations.

When that whole picture started to fall into place, it felt like, okay, this is going to be a huge inventory situation, kind of like NFT collections were a huge inventory situation, obviously with completely different market dynamics and a really different type of goal that buyers have. So a lot of my early experimentation, like a Chrome extension I made called Window AI, and a few other things, was just about learning how the ecosystem works, what makes it different, and what developers really want.
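The Alpaca-style distillation Alex describes is, at its core, a data pipeline: prompt a teacher model for synthetic completions, then fine-tune a base model on the (instruction, response) pairs. Here is a toy sketch of just the data-collection half, with a stand-in `teacher` function in place of the real ChatGPT calls (the function and file format are illustrative, not Alpaca's actual code):

```python
import json

def teacher(instruction: str) -> str:
    # Stand-in for the ChatGPT API call that produced Alpaca's synthetic data.
    return f"A helpful answer to: {instruction}"

def build_distillation_set(seed_instructions, path="synthetic.jsonl"):
    """Collect (instruction, response) pairs into a JSONL fine-tuning file."""
    with open(path, "w") as f:
        for instruction in seed_instructions:
            pair = {"instruction": instruction, "response": teacher(instruction)}
            f.write(json.dumps(pair) + "\n")
    return path

# The resulting file is what a base model like Llama would be fine-tuned on,
# transferring both the teacher's chat style and some of its knowledge.
path = build_distillation_set(
    ["Explain what an NFT is.", "Write a haiku about routers."]
)
```

The interesting property, as discussed above, is that the teacher's data never needs to be published; only the API outputs are collected.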
Josh:
So that leads us to OpenRouter itself, right? I kind of want you to help explain to the listeners who aren't familiar with OpenRouter what it does. Because for a lot of people, the way they interact with an AI is they send a prompt to their model of choice. They use ChatGPT, or they use the Grok app, or they're on Gemini, and they kind of live in these siloed worlds. And then the next step up from those people are the ones who use it professionally, who are developers. They're interacting with APIs. Maybe they're not interfacing with the actual UI, but they're calling a single model. And OpenRouter kind of exists on top of all this, right? Can you walk us through how it works and why so many people love using OpenRouter?
Alex:
OpenRouter is an aggregator and marketplace for large language models. You can think of it as a Stripe meets Cloudflare, covering both of those roles. It's a single pane of glass: you can orchestrate, discover, and optimize all of your intelligence needs in one place. One billing provider gets you all the models; there are 470-plus now. All the models sort of implement features, but they do it differently, and there are also a lot of "intelligence brownouts," as Andrej Karpathy calls them, where models just go down all the time, even the top models like Anthropic, Gemini, and OpenAI. So here's what we do: developers need a lot of choice. CTOs need a lot of reliability. CFOs need predictable costs. CISOs need complex policy controls. All of these are inputs to what we build, which is a single pane of glass that makes models more reliable, lowers costs, gives you more choice, and helps you choose between all the options for where to source your intelligence.
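The "one billing provider gets you all the models" idea shows up directly in the API: OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so switching models is just a string change. A minimal sketch; the endpoint path and model-name format reflect OpenRouter's public docs as I understand them, so treat the specifics as assumptions:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat request; any of the 470+ models is one string away."""
    payload = {
        "model": model,  # e.g. "openai/gpt-4.1" or "anthropic/claude-sonnet-4"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Sending it is one urlopen() call; swapping model vendors needs no other change.
req = chat_request("openai/gpt-4.1", "Hello!", api_key="sk-or-...")
```

The single billing relationship is what makes the string swap meaningful: the same key and the same request shape reach every provider behind the marketplace.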
Josh:
How does it work uh because i would imagine like what
Josh:
each as and i on the show we frequently talk about benchmarks right where
Josh:
a certain model is the best at coding and that infers that maybe you should
Josh:
go to that model to do all of your coding needs because it's the best at it
Josh:
but it would appear as if it's not true if you're routing through a lot of different
Josh:
providers so how do you consider which provider gets routed to when and how
Josh:
to get the best result for what you're asking
Alex:
So we've taken a different approach so
Alex:
far which is instead of like focusing on
Alex:
a production router that picks
Alex:
the model for you um we try
Alex:
to help you choose the model so we
Alex:
we build lots we create lots of analytics both on
Alex:
your account and uh and on our
Alex:
rankings page to help you browse and discover the models that
Alex:
like the power users are really using successfully on
Alex:
a certain type of workload um because we
Alex:
think like developers today primarily want to
Alex:
choose the model themselves um switching between all
Alex:
families can result in like a lot like very
Alex:
unpredictable behavior but once you've
Alex:
chosen your model um we try to
Alex:
help developers not need to think about the provider there are
Alex:
like sometimes dozens of
Alex:
providers for a given model uh all kinds
Alex:
of companies including the hyperscalers like aws google vertex and azure um
Alex:
and uh like scaling startups like together fireworks deep infra um and a long
Alex:
tail of providers that provide,
Alex:
like very unique features,
Alex:
very like exceptional performance.
Alex:
There's all kinds of differentiators for them.
Alex:
So what we do is we collect them all in one place. And if you want a feature,
Alex:
you just get the providers that support it.
Alex:
If you want performance, you get prioritized to the providers that have high performance.
Alex:
If you really are cost sensitive, you get prioritized to the providers that
Alex:
are really low cost today. and we basically create all these lanes. There's.
Alex:
Innumerable ways you could get routed but
Alex:
you're in full control of the of the overall user
Alex:
experience that you're aiming for and that's
Alex:
what that's what we found that was missing from the
Alex:
whole ecosystem was just a way of doing that and uh
Alex:
and you know we get like between on average five to ten percent uptime boosts
Alex:
over going to um providers directly just by load balancing and sending you to
Alex:
the top provider that's up and able to handle your request.
Alex:
We really focus hard on efficiency and performance.
Alex:
We only add about 20 to 25 milliseconds of latency on top of your request.
Alex:
It all gets deployed very close to your servers up the edge.
Alex:
We overall get just We stack providers.
Alex:
We figure out what you can benefit from that everybody else is doing and just
Alex:
give you the power of big data as a developer just accessing your model choice.
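The "lanes" Alex describes (feature support, performance, price, plus fallbacks for uptime) ride along in the request body itself. A sketch of what such a request might look like; the `models` fallback array and `provider.sort` field follow OpenRouter's documented schema as I recall it, so treat the exact field names as assumptions:

```python
def routed_payload(prompt: str) -> dict:
    """Build a request body that expresses routing preferences, not just a model."""
    return {
        # Fallback lane: if the first model is down or rate-limited,
        # the router can retry the request against the next one.
        "models": ["anthropic/claude-sonnet-4", "openai/gpt-4.1"],
        "messages": [{"role": "user", "content": prompt}],
        # Provider lane: prefer the cheapest providers; "throughput"
        # would select the performance lane instead.
        "provider": {"sort": "price"},
    }

payload = routed_payload("Summarize this document.")
```

The key design point is that the caller states the goal (cheap, fast, a specific feature) and the marketplace resolves it against live provider health, which is where the claimed five-to-ten-percent uptime boost comes from.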
Josh:
So it kind of allows you to harness the collective knowledge of everybody, right?
Josh:
You get all of the data, you have all of the queries, you know which yields
Josh:
the best result, and you're able to deliver the best product for them.
Josh:
Now, in terms of actual LLMs, EJ has actually pulled this up just before, which is a leaderboard.
Josh:
And I'm interested in how you guys think about LLMs, which are the best,
Josh:
how to benchmark them, and how you route people through them.
Josh:
Is there a specific... Do you believe that benchmarks are accurate,
Josh:
and do you reflect those in the way that you route traffic through these models?
Alex:
In general, we have taken the stance that we want to be the capitalist benchmark for models.
Alex:
What is actually happening?
Alex:
And part of this is that I really think both the law of large numbers and the
Alex:
enthusiasm of power users are really, really valuable for everybody else.
Alex:
Like when you're routing to
Alex:
um like clod in
Alex:
let's say you're routing to clod 4 and you're
Alex:
based in europe um there you
Alex:
know all of a sudden there might be like a huge variance in in throughput from
Alex:
one of the providers and you're only able to detect that if like some other
Alex:
users have discovered it before you and so we route around the provider that's
Alex:
like running kind of slow in Europe and send you,
Alex:
if your data policies allow it,
Alex:
to a much faster provider somewhere else.
Alex:
And that allows you to get faster performance. So, like, um...
Alex:
That's, like, on the provider level, how, like, numbers help.
Alex:
On the, like, model selection level, like, what you see on this rankings page
Alex:
here, power users will, like, when we put up a model, like, we put up a new
Alex:
model today from a new model lab called ZAI,
Alex:
like, the power users instantly discover it.
Alex:
We have this LLM enthusiast community that dives in and really figures out what
Alex:
a model is good for along a bunch of core use cases.
Alex:
The power users figure out which workloads are interesting, and then you can
Alex:
just see in the data what they're doing. And everybody can benefit from it.
Alex:
That's why we open up our data and share it for free on the rankings page here.
Ejaaz:
I'm seeing this one consistent unit across all these rankings,
Ejaaz:
Alex, which is tokens, right?
Ejaaz:
And Josh and I have spoken about this on the show before, but I'm wondering
Ejaaz:
how, like you've chosen this specific unit to measure how good or effective
Ejaaz:
these models are or how consumed or used they are.
Ejaaz:
Can you tell us a bit more as to why you picked this particular unit and what
Ejaaz:
that tells you as like the open router platform as to how a user is using a particular model?
Alex:
Yeah, I think dollars is a good metric too.
Alex:
The reason we chose tokens is primarily because we were seeing prices come down really quickly.
Alex:
Open Router has been around since the beginning of 2023.
Alex:
And I didn't want a model to be penalized in the rankings just because the prices
Alex:
are going down really dramatically now like there's a,
Alex:
There's a paradox called Jevons paradox, which is that when prices decrease like 10x,
Alex:
users' use of some component of infrastructure increases by more than 10x.
Alex:
And so maybe they didn't get 10x at all.
Alex:
But I thought there were some other advantages to using tokens,
Alex:
too. Tokens don't have this penalty and don't rely on Jevon's Paradox,
Alex:
which can have a lot of lag.
Alex:
They also are a little bit of a proxy for time.
Alex:
A model that is generating a lot of tokens and doing so for a while across a lot of users.
Alex:
It means that a lot of people are reading those tokens and actually doing something with them.
Alex:
And same goes for input. But if I really want to send an enormous number of
Alex:
documents and the model has a really, really, really tiny prompt pricing,
Alex:
I think that's still valuable and something that we want to see.
Alex:
We want to see that this model is processing an enormous number of documents.
Alex:
That's a use case that should show up in the rankings.
Alex:
And so we decided to go with tokens. We might like add dollars in the future,
Alex:
but I think tokens are, you know, they don't have this like Jevons Paradox lag.
Alex:
And there wasn't anything else. Like nobody was doing any kind of like overall analytics.
Alex:
We didn't see any other company even do it until Google did a few months ago
Alex:
where they started publishing the total amount of tokens processed by Gemini.
Alex:
So we'll see which use cases really need dollars.
Alex:
But tokens have been holding up pretty well.
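The tokens-versus-dollars tradeoff is easy to see with numbers: if a model's price drops 10x and usage grows even more (the Jevons-paradox pattern Alex mentions), a dollar-based ranking barely moves while a token-based ranking reflects the growth immediately. A small illustration with invented figures:

```python
# Hypothetical usage before and after a 10x price cut (all numbers invented).
price_per_mtok_before, mtok_before = 10.0, 1_000    # $10/M tokens, 1B tokens
price_per_mtok_after, mtok_after = 1.0, 15_000      # $1/M tokens, 15B tokens

dollars_before = price_per_mtok_before * mtok_before  # $10,000
dollars_after = price_per_mtok_after * mtok_after     # $15,000

# Token volume grew 15x, but spend only 1.5x: a dollar ranking would badly
# understate how much more the model is actually being used.
token_growth = mtok_after / mtok_before      # 15.0
dollar_growth = dollars_after / dollars_before  # 1.5
```

This is the "penalty" a dollar metric imposes on models whose prices fall fastest, and why token counts track real usage with less lag.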
Ejaaz:
Yeah, I mean, this dashboard is awesome. And I recommend anyone that's listening
Ejaaz:
to this that can't see our screen to get on OpenRouter's website and check it out.
Ejaaz:
I've been following it for the last two weeks kind of pretty rigorously, Alex.
Ejaaz:
And what I love is you can literally see...
Ejaaz:
So two weeks ago Grok 4 got released right
Ejaaz:
and Josh and I were making a ton of videos on this we were
Ejaaz:
using it with pretty much everything that we could do and
Ejaaz:
then this other model came out of China pretty much a few days after called
Ejaaz:
Kimi K2 and I was like oh yeah whatever this is just some random Chinese model
Ejaaz:
I'm not going to focus on it and then I kept seeing it in my feed and I thought
Ejaaz:
okay maybe I'll give this a go and I kind of like went straight to open rather than just
Ejaaz:
almost gauge the interest from a wider set of AI users. And I saw that it was skyrocketing, right?
Ejaaz:
And then I saw that Quen dropped their models last week.
Ejaaz:
And again, I came to Open Router and it preceded the trend, right?
Ejaaz:
People had already started using it. So I love how you describe Open Router
Ejaaz:
as this kind of like prophetic orb,
Ejaaz:
basically, where the enthusiasts and the community itself can kind of like front
Ejaaz:
run very popular trends. And I think that's a very powerful moat.
Ejaaz:
And kind of on this path, Alex, I noticed that a lot of these major model providers
Ejaaz:
see the value in this, right?
Ejaaz:
So if I'm not mistaken, OpenAI kind of like used your platform to kind of secretly
Ejaaz:
launch their Frontier model before they officially launched it, right?
Ejaaz:
Can you walk us through, you know, how that comes about and more importantly,
Ejaaz:
why they want to do that and why they chose OpenRoddy to do that?
Alex:
Uh open ai will sometimes
Alex:
give uh early access
Alex:
to their to models to some of their customers for
Alex:
testing and we asked them if they
Alex:
wanted to try a stealth model with us which we had never done before um it involved
Alex:
like launching it as under another name and seeing how users respond to it without
Alex:
having any bias or sort of inclination for against the model at the onset.
Alex:
And it would be like a new way of testing it and a new way of...
Alex:
It was like an experiment for both us and them.
Alex:
And they generously decided to take the leap of faith and try it. And we...
Alex:
Launched gpt 4.1 with
Alex:
them at and we called it quasar alpha and
Alex:
it was a million uh
Alex:
token context length model opening us first very
Alex:
very long context model and it was also optimized
Alex:
for coding and the incredible
Alex:
there were a couple incredible things that happened first
Alex:
we have this community uh of benchmarkers
Alex:
that run open source benchmarks and we give
Alex:
a lot of them grants to help fund the benchmarks
Alex:
grants of open router tokens they'll just run the
Alex:
suite of tests against all the models and some of them are very creative like
Alex:
there's one that tests uh like the ability to generate fiction there's one that
Alex:
tests um like how like whether it can make a 3d object project in Minecraft called MCBench.
Alex:
There are a few that test different types of coding proficiency.
Alex:
There's one that just focuses on how good it is at Ruby, because Ruby is,
Alex:
turns out a lot of the models are not great at Ruby.
Alex:
There are a lot of like languages that all the models are pretty bad at.
Alex:
And so we have this like long tail of very niche benchmarks,
Alex:
And all the benchmarkers ran, you know, for free their benchmarks on Quasar
Alex:
Alpha and found pretty incredible results for most of them.
Alex:
And so the model got like, you know, OpenAI got this feedback in real time.
Alex:
We kind of like helped them find it.
Alex:
And they made another snapshot, which we launched as Optimus Alpha.
Alex:
And they could compare the feedback that they got from the two snapshots.
Alex:
Um, and, and then they, and then like two weeks later, they launched GPT 4.1 live for everybody.
Alex:
So it was like, uh, uh, was it an experiment for us?
Alex:
And, and we've done it, um, again since, uh, with, uh, another model provider
Alex:
that, uh, that's still working on it.
Alex:
Um, and it, and it's kind of like a cool way of learning of like crowdsourcing,
Alex:
uh, benchmarks that you wouldn't have expected. and also getting unbiased community sentiment.
Josh:
That's great. So now when we see a new model pop up and we want to test GPT-5, we know where to come to try it early.
Josh:
We'll see, because the rumor is it's coming soon. So we're on your watch list.
Josh:
But I do want to ask you about open source versus closed source, because this has been an important thing for us; we talk about this a lot.
Josh:
You have a ton of data on this. I'm looking at the leaderboards, and there are open source models that are doing very well, alongside closed source.
Josh:
What are your takes in general? How do you feel about open source versus closed
Josh:
source models, particularly around how you serve them to users?
Alex:
Both types of models have supply problems, but the supply problems are very different.
Alex:
Typically, what we see with closed source models is that there are very few suppliers, usually just one or two.
Alex:
Like with Grok, for example, there's Grok Direct and there's Azure.
Alex:
With Anthropic, there's Anthropic direct, there's Google Vertex, there's AWS Bedrock. And then we also deploy in different regions: we have an EU deployment for customers who only want their data to stay in the EU.
Alex:
And we do custom deployments for the closed source models too, to guarantee good throughput and high rate limits for people.
Alex:
But the tricky part is the demand: the closed source models are doing most of the tokens on OpenRouter. It's dominant, probably 70 to 80 percent closed source tokens today.
Alex:
But the open source models have a much more fragmented supply, like a sell-side order book, and the rate limits for each provider are less stable on average. It usually takes a while for the hyperscalers to serve a new open source model. So the load-balancing work that we do on open source models tends to be a lot more valuable.
Alex:
The load-balancing work that we do for closed source models tends to be very focused on caching and feature awareness: making sure you're getting clean cache hits and only transitioning over to new providers when your cache has expired.
Alex:
For open source models, there's way less caching; very, very few open source models implement caching. And so switching between providers becomes more common.
Alex:
We also track a lot of quality differences between the open source providers. Some of them will deploy at lower quantization levels, which is kind of a way of compressing the model. Generally that doesn't have an impact on the quality of the output, and yet we still see some odd things from some of the open source providers.
Alex:
So we run tests internally to detect those outputs, and we're building up a lot more muscle here soon, so that those providers get pulled out of the routing lane and don't affect anyone.
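The cache-aware side of this routing can be sketched in a few lines. This is a toy illustration, not OpenRouter's actual logic; the `Provider` fields and the `pick_provider` function are hypothetical. The idea is: stay on the provider holding a live prompt cache, and fall back to any healthy provider once the cache has expired or the current provider is pulled out of the routing lane.

```python
from dataclasses import dataclass
import time


@dataclass
class Provider:
    name: str
    supports_caching: bool
    healthy: bool = True
    cache_expiry: float = 0.0  # unix time when this user's prompt cache lapses


def pick_provider(providers, last_used=None, now=None):
    """Prefer the provider holding a live prompt cache (a clean cache hit);
    otherwise fall back to any healthy provider in the fragmented supply."""
    now = time.time() if now is None else now
    healthy = [p for p in providers if p.healthy]
    if (last_used and last_used.healthy and last_used.supports_caching
            and last_used.cache_expiry > now):
        return last_used  # stay put: cache is still warm
    return healthy[0] if healthy else None
```

In this toy, a provider that fails internal quality tests would simply have `healthy` flipped to `False`, which removes it from the candidate list on the next request.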
Josh:
So closed source accounts for 80 percent or something like that, a very large amount. Do you see that changing?
Josh:
Because that post we just had said nine out of the ten fastest growing LLMs last week were open source.
Josh:
And every time China comes out with another model, like Kimi K2 a week or two ago, it really pushes the frontier of open source forward.
Josh:
And the rate of acceleration of open source seems to be as fast, if not faster, than closed source; it's making these improvements very quickly.
Josh:
It has the benefit of being able to compound in speed because it's open source
Josh:
and everyone can contribute.
Josh:
Do you think that starts to change where the percentage of tokens you're issuing
Josh:
are from open source models versus closed source?
Josh:
Or do you continue to see a trend where it's going to be Google,
Josh:
it's going to be OpenAI that are serving a majority of these tokens to users?
Alex:
In the short term, we're likely to see open source models continue to dominate
Alex:
the fastest growing model category on OpenRouter.
Alex:
And the reason is that a lot of users come for a closed source model but then decide they want to optimize later: either they want to save on costs, or they want to try a new model that's supposed to be a little better in some direction that their app or use case cares about.
Alex:
Then they leave the closed source model and go to an open source model.
Alex:
So open source tends to be a last-mile optimization thing; I'm making a big generalization, because the reverse can happen too.
Alex:
And because it's a last-mile optimization thing, the jump from "this model is not being used at all" to "this model is really being used by a couple of people who have left Claude 4 and want to try some new coding use case" will be bigger than for the closed-source models, which start at a really high base and don't have growth quite as dramatic.
Alex:
So the other part of your question, though, was whether there's going to be a flippening of...
Josh:
Closed source, or some sort of chipping away at that monopoly of closed source tokens.
Alex:
It's hard to predict these things, because I think the biggest problem today with open source models is that the incentives are not as strong for the model lab and the model provider.
Alex:
They have established incentives for how to grow as a company and attract high-quality AI talent, and giving the model weights away impairs those incentives.
Alex:
Now, this is where we might see decentralized providers helping in the future.
Alex:
A really good incentive scheme that allows high-quality talent to work on an open source model, one that remains open-weights at least, could fix this.
Alex:
I try to stay close to the decentralized providers and learn a lot from them. On the provider side, on running inference, I think there are some really cool incentive schemes being worked on.
Alex:
But on actually developing the models themselves, I haven't seen too much, unfortunately. So if we see one, a flippening is on the radar. And until we do, I personally doubt it.
Josh:
TBD, do you have personal takes on how you feel about open source versus closed source?
Josh:
Because this has been a huge topic we've been debating too. It's just the ethical
Josh:
concerns around alignment and closed source models versus open source.
Josh:
When you look at the competitors, China, generally speaking,
Josh:
is associated with open source, whereas the United States is generally associated with closed source.
Josh:
And we saw Llama and Meta release the open source models, but now they're raising
Josh:
a ton of money to pay a lot of employees a lot of money to probably develop a closed source model.
Josh:
So it seems like the trends are kind of split between US and China.
Josh:
And I'm curious if you have any personal takes, even outside of OpenRouter,
Josh:
of which you think serves better for the long term outlook on,
Josh:
I mean, the position of the United States or just the general safety and alignment
Josh:
conversation around AI?
Alex:
I mean, like a very simple fundamental difference between the two is that an
Alex:
innovation in open source models can be copied more quickly than an innovation
Alex:
in closed source models.
Alex:
So in terms of velocity and like how far ahead one is over the other,
Alex:
that is like a massive structural difference.
Alex:
That means that closed source models should be theoretically always ahead until
Alex:
a really interesting incentive scheme develops, like I mentioned before.
Alex:
And I don't see evidence that that's going to change.
Alex:
In terms of China versus the US, I think it's very interesting that China has not had a major closed source model.
Alex:
And I don't really see a great reason why; I'm not aware of any reason that won't change in the future. My prediction is that there's going to be a closed source model from China.
Alex:
It's possible that DeepSeek and Moonshot and Qwen have built up really sticky talent pools.
Alex:
But generally with talent pools, after enough years have passed,
Alex:
people quit and go and create new companies and build new talent pools.
Alex:
And so we should see some of that. It's not the case that the AI space has the NDAs or non-competes that the hedge fund space has.
Alex:
That might happen in the future too. But assuming that the current non-compete
Alex:
culture continues, there should be more companies that pop up in China over time.
Alex:
And I'm betting that some of them will be closed source.
Alex:
And my guess is that the two nations will start to look more similar.
Ejaaz:
Yeah, I guess that's why you have Zuck dishing out $300 million to billion-dollar salary offers to a bunch of these guys, right?
Ejaaz:
One more question on China versus the US. I kind of agree with you.
Ejaaz:
I didn't really expect China to be the one to lead open source anything,
Ejaaz:
let alone the most important technology of our time.
Ejaaz:
What do you think is their secret sauce for building these models, Alex?
Ejaaz:
And I know this might be outside the forte of OpenRouter specifically. But as someone who has studied this technology for a while now, I'm struggling to figure out what advantage they had.
Ejaaz:
They're discovering all these new techniques, and maybe the simple answer is constraints, right? They don't have access to all of Nvidia's chips, they don't have access to infinite compute, so maybe they're forced to figure out other ways around the same kinds of problems that Western companies are focused on. But it's pretty clear that America, with all its funding,
Ejaaz:
hasn't been able to make these frontier breakthroughs.
Ejaaz:
So I'm curious whether you're aware of some kind of technical moat that Chinese AI researchers, or these AI teams that are featuring on OpenRouter day in and day out, have over the US.
Alex:
Well, I don't know.
Alex:
There are certainly some that they've come up with that like DeepSeek had a
Alex:
lot of very cool inference innovations that they published in their paper.
Alex:
But a lot of what they published in the original R1 paper were things that OpenAI
Alex:
had done independently themselves many months before.
Alex:
So on the inference side, and on some of the model side: we had talked to the DeepSeek team for years before R1 came out. They had many models before that, and they were always a pretty sharp, optimized team for doing inference.
Alex:
They came up with the best user experience for caching prompts long before DeepSeek R1 came out, and they had very good pricing. They were by far the strongest Chinese team we were aware of, well before R1 happened.
Alex:
So I'm guessing there was some talent accumulation they were working on in China, for people who wanted to stay in China, and that's a huge advantage; American companies are obviously not doing that. Ejaaz is very on point that a lot of this is just based on talent.
Alex:
A lot of AI is open and out there and very composable, like a big tree of knowledge. A paper comes out and it cites 20 other papers, and you can go and read all of the cited papers and then have kind of a basis for understanding the paper. But you really have to go one level deeper and read the cited papers two levels down to really understand what's going on.
Alex:
Very few people can do that, and it takes a lot of years of experience to actually apply that knowledge and learn all the things that have not been written in any paper at all.
Alex:
There's just such a small number of people who can really lead research on all the different dimensions that go into making a model. And the border between China and the US is pretty defined: you have to leave China, move to the US, and really establish yourself here.
Alex:
So I do think there's country arbitrage, there's the hedge-fund-background arbitrage, there's hardware arbitrage; there's a ton of hardware that's only available in China but not here, and vice versa. That creates an opportunity, and this will just continue to happen.
Ejaaz:
Yeah, I think this arbitrage is fascinating.
Ejaaz:
I read somewhere that there's probably less than 200 or 250 researchers in the
Ejaaz:
world that are worthy of working at some of these frontier AI model labs.
Ejaaz:
And I looked into some of the backgrounds of the team behind Kimi K2,
Ejaaz:
which is this recent open source model out of China, which broke all these crazy rankings.
Ejaaz:
I think it was like a trillion parameter model or something crazy like that.
Ejaaz:
And a lot of them worked at some of the top American tech companies.
Ejaaz:
And they all graduated from this one university in China.
Ejaaz:
I think it's Tsinghua, which apparently is like, you know, the Harvard of AI
Ejaaz:
in China, right? So pretty crazy.
Ejaaz:
But Alex, I wanted to shift the focus of the conversation to a point that you
Ejaaz:
brought up earlier in this episode, which is around data.
Ejaaz:
Okay, so here's the context that like Josh and I have spoken about this at length, right?
Ejaaz:
We are obsessed with this feature on OpenAI, which is memory, right?
Ejaaz:
And I know a lot of the other AI models have memory as well.
Ejaaz:
But the reason why we love it so much is I feel like the model knows me, Alex.
Ejaaz:
I feel like it knows everything about me. It can personally curate any of my prompts. It just gets me: it knows what I want, it serves it up to me on a platter, and off I go doing my thing.
Ejaaz:
Now, OpenRouter sits on top of kind of the query layer, right? So you have all these people writing all these weird and wonderful prompts, and you're routing them through to different AI models.
Ejaaz:
You hold all of that data or maybe you have access to all of that data.
Ejaaz:
And I know you have something called private chat as well, where you don't have access to it.
Ejaaz:
Talk to me about like what OpenRouter and what you guys are thinking about doing
Ejaaz:
with this data, because presumably,
Ejaaz:
or in my opinion, you guys actually have the best moat, arguably better than ChatGPT, because you have all these different types of prompts coming from all these different types of users for all these different types of models.
Ejaaz:
So theoretically, you could spin up some of the most personal AI models for
Ejaaz:
each individual user if you wanted to.
Ejaaz:
Do I have that correct? Or am I, you know, speaking crazy?
Alex:
No, that's true. It's something we're thinking about.
Alex:
By default, your prompts are not logged at all. We don't store prompts or completions for new users by default; you have to toggle logging on in settings.
Alex:
But a lot of people do toggle it on, and as a result,
Alex:
I think we have by far the largest multi-model prompt data set.
Alex:
But to date we've barely done anything with it. We classify a tiny, tiny subset of it, and that's what you see on the rankings page.
Alex:
What could be done at a per-account level is really three main things.
Alex:
One is memory, right out of the box. You can get this today by combining OpenRouter with a memory-as-a-service; we've got a couple of companies that do this, like Mem0 and Supermemory.
Alex:
And we can partner with one of those companies or do something similar and just
Alex:
provide a lot of distribution.
Alex:
And that basically gets you a ChatGPT as a service, where it feels like the model really knows you and the right context gets added to your prompt.
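The memory-as-a-service pattern Alex describes can be sketched roughly like this. Everything here is illustrative: `MemoryStore` is a toy stand-in for a service like Mem0 or Supermemory (real services do semantic retrieval, not keyword overlap), and the payload shape simply follows the OpenAI-compatible chat format that OpenRouter accepts.

```python
class MemoryStore:
    """Toy stand-in for a memory-as-a-service provider; real services
    (Mem0, Supermemory, etc.) do semantic retrieval, not keyword overlap."""

    def __init__(self):
        self.facts = []

    def add(self, fact):
        self.facts.append(fact)

    def search(self, query):
        # Naive keyword overlap, purely for illustration.
        words = set(query.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]


def build_request(model, user_prompt, memory):
    """Assemble an OpenAI-compatible chat payload with retrieved memories
    injected as system context, so the model feels like it knows the user."""
    remembered = memory.search(user_prompt)
    messages = []
    if remembered:
        messages.append({
            "role": "system",
            "content": "Known about this user: " + "; ".join(remembered),
        })
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages}
```

The returned dictionary is what you would POST to a chat-completions endpoint; the memory lookup happens transparently before every call, which is the "right context gets added to your prompt" effect.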
Alex:
The other things that we can do are help you select the right model more intelligently.
Alex:
There's a lot of models where there's like a super clear, like migration decision that needs to be made.
Alex:
And, and we can just see this very clearly in the data.
Alex:
But right now, if we have some kind of communication channel open with the customer, we can just tell them: hey, we know you're using this model a ton, it's been deprecated, and this model is significantly better; you should move this kind of workload over to it. Or: you'll get way better pricing for this workload if you switch.
Alex:
And that's basically the only sort of guidance and opinionated routing we've done so far. It could be a lot more intelligent, a lot more out of the box, a lot more built into the product.
Alex:
The last thing we can do, and there are probably tons of things we're not even thinking about, is getting really smart about how models and providers are responding to prompts, and showing you really interesting data: telling you what kinds of prompts are going to which models, how those models are replying, and characterizing the replies in all kinds of interesting ways.
Alex:
Did the model refuse to answer? What's the refusal rate? Did the model successfully make a tool call, or did it decide to ignore all the tools that you passed in? That's a huge one.
Alex:
Did the model pay attention to its context? Did some kind of truncation happen before you sent it to the model? There are all kinds of edge cases that cause developers' apps to just get dumber, and they're all detectable.
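The per-response checks Alex lists (refusals, ignored tools, truncation) can be pictured as a small classifier over completion metadata. A minimal sketch, with hypothetical field names loosely following the OpenAI-style response shape:

```python
# Phrases that commonly signal a refusal; a real classifier would be smarter.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "as an ai")


def classify_completion(completion):
    """Flag the failure modes described above for one model response."""
    text = (completion.get("content") or "").lower()
    return {
        "refused": any(marker in text for marker in REFUSAL_MARKERS),
        "made_tool_call": bool(completion.get("tool_calls")),
        "truncated": completion.get("finish_reason") == "length",
    }


def refusal_rate(completions):
    """Aggregate refusal rate across a batch of responses."""
    if not completions:
        return 0.0
    flags = [classify_completion(c)["refused"] for c in completions]
    return sum(flags) / len(flags)
```

Run over logged traffic, aggregates like these are what would surface "this model ignores your tools 30% of the time" style insights per prompt category.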
Ejaaz:
I'm so happy you said that because I have this kind of like hot take,
Ejaaz:
but maybe not so hot take, which is I actually think all the Frontier models
Ejaaz:
right now are good enough to do the craziest stuff ever for each user.
Ejaaz:
But we just haven't been able to unlock it because it just doesn't have the context.
Ejaaz:
Sure, you can attach it to a bunch of different tools and stuff. But if it doesn't know when to use a tool, or how to process a certain prompt, or if the users themselves don't know how to read the output of the AI model, like you just said, we need some kind of analytics into all of this. Otherwise we're just walking around like headless chickens.
Ejaaz:
So I'm really happy that you said that. One other thing that I wanted to get
Ejaaz:
your take on on the data side of things is, I just think this whole concept
Ejaaz:
or notion of AI agents is becoming such a big trend, Alex.
Ejaaz:
And I noticed a lot of Frontier Model Labs release new models that kind of spin
Ejaaz:
up several instances of their AI model.
Ejaaz:
And they're tasked with a specific role, right?
Ejaaz:
Okay, you're going to do the research. You're going to do the orchestrating.
Ejaaz:
You're going to look online via a browser, and so on. And then they coalesce together at the end of that little search, refine their answer, and present it to someone, right?
Ejaaz:
You know, Grok 4 does this, Claude does this, and a few other models.
Ejaaz:
I feel like with this data that you're describing, OpenRouter could be or could
Ejaaz:
offer that as a feature, right?
Ejaaz:
Which is essentially, you can now have super intuitive, context-rich agents
Ejaaz:
that can do a lot more than just talk to you or answer your prompts.
Ejaaz:
But they could probably do a bunch of other actions for you.
Ejaaz:
Is that a fair take, or is that something that maybe might be out of the realm of open router?
Alex:
Our strategy is to be the best inference layer for agents.
Alex:
And what I think developers want is control over how their agents work.
Alex:
And our developers at least want to use us as a single pane of glass for doing
Alex:
inference, but they want to see and control the way an agent looks.
Alex:
An agent is basically just something
Alex:
that is doing inference in a loop and controlling the direction it goes.
Alex:
So what we want to do is build incredible docs and really good primitives that make that easy.
Alex:
A lot of our developers are just people building agents, and what they want is for the primitives to be solved, so that they can keep creating new versions and new ideas without worrying about re-implementing tool calling over and over again.
Alex:
And it's a tough problem, given how many models there are; there's a new model or provider every day, and people actually want them and use them. So standardizing this, making these tools really dependable, is where we want to focus, so that agent developers don't have to worry about it.
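Alex's definition above, an agent as inference in a loop that controls its own direction, can be sketched in a few lines. The `llm` callable and the tool-request shape here are hypothetical stand-ins for a real chat-completions call with structured tool calls:

```python
def run_agent(llm, tools, task, max_steps=8):
    """Minimal agent loop: call the model; if it asks for a tool, run the
    tool and feed the result back; stop when it gives a final answer."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = llm(messages)          # one inference step
        if "tool" in reply:            # the model chose a tool
            result = tools[reply["tool"]](reply["arg"])
            messages.append({"role": "tool", "content": str(result)})
        else:                          # the model produced a final answer
            return reply["content"]
    return None                        # safety valve: step budget exhausted
```

The primitives Alex mentions (dependable tool calling, consistent response shapes across hundreds of models) are exactly what make the `llm(...)` line swappable between providers without rewriting this loop.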
Josh:
As we level up closer and closer to AGI and beyond, I'm curious what OpenRouter's endgame is.
Josh:
If you have one, what is the master plan where you hope to end up?
Josh:
Because the assumption is as these systems get more intelligent,
Josh:
as they're able to kind of make their own decisions and choose their own tool
Josh:
sets, what role does Open Router play in continuing to route that data through?
Josh:
Do you have a kind of master plan, a grand vision of where you see this all heading to?
Alex:
You're saying, as agents get better at choosing the tools they use, what becomes our role when the agents are really good at that?
Josh:
Yes. And where do you see OpenRouter fitting into the picture? What would be the best-case scenario for this future of OpenRouter?
Alex:
Right now, OpenRouter is a bring-your-own-tool platform; we don't have a marketplace of MCPs yet. And I do think most of the most-used tools will be ones that developers configure themselves, and agents just work with what they're given access to. But I think there's a holy grail for OpenRouter here.
Alex:
My prediction for how the ecosystem is going to evolve is that all the models are going to add state and other kinds of stickiness that make you want to stick with them. They're going to add server-side tool calls, stateful web search, memory: all kinds of things that try to prevent developers from leaving and increase lock-in.
Alex:
And OpenRouter is doing the opposite.
Alex:
We want developers to not feel vendor lock-in.
Alex:
We want them to feel like they have choice and they can use the best intelligence,
Alex:
even if they didn't before.
Alex:
It's never too late to switch to a more intelligent model. That would be a good always-on outcome for us.
Alex:
So what I think we'll end up doing is partnering with other companies, or building the tools ourselves if we have to, so that developers don't feel stuck.
Alex:
There are a lot of ways the ecosystem could evolve, but that's how I would put it in a nutshell.
Josh:
Okay, now there's another personal question that I was really curious about,
Josh:
because I was also right there with you in the crypto cycle when NFTs got absolutely
Josh:
huge, was a big user of OpenSea.
Josh:
And it was kind of this trend that went up and then went down. NFTs kind of fizzled out, they weren't as hot anymore, and AI took the wind from their sails.
Josh:
And it's a completely separate audience, but a similar thing where now it's
Josh:
the hottest thing in the world.
Josh:
And I'm curious how you see the trend continuing. Is this a cyclical thing that has ups and downs, or is it a one-way trajectory: more tokens every day, more AI every day, up and to the right?
Alex:
NFTs kind of follow crypto in an indirect way: when crypto has ups and downs, NFTs generally lag a bit, but they have similar ups and downs.
Alex:
And crypto is an extremely long-term play on building a new financial system. There are so many reasons it's not going to happen overnight, and they're very entrenched reasons.
Alex:
Whereas AI, there are some overnight business transformations going on.
Alex:
And one of the reasons AI moves a lot faster is that it's just about making computers behave more like humans.
Alex:
So if a company already works with a bunch of humans, then there's some engineering that needs to be done, and some thinking about how to scale this.
Alex:
But in general, after seeing what's possible, I think inference will be the fastest-growing operating expense for all companies.
Alex:
It'll be: oh, we can just hire high-performing employees at the click of a button, and they work 24/7 and scale elastically. It's not a huge mental-model shift; it's just a huge upgrade to the way companies work today, in most cases.
Alex:
So it's completely different from crypto. Other than NFTs and AI both being new, they're fundamentally very different changes.
Ejaaz:
You're probably one of very few people in the world right now that has crazy insights into every single AI model.
Ejaaz:
Definitely more than the average user, right? Like I have like three or four
Ejaaz:
subscriptions right now and I think I'm a hotshot.
Ejaaz:
You get access to, what is it, 457 models right now on OpenRouter.
Ejaaz:
So an obvious question that I have for you is
Ejaaz:
I'm not going to say in the next couple of years, because everything moves way
Ejaaz:
too quickly in this sector.
Ejaaz:
But over the next six months, is there anything really obvious to you that should
Ejaaz:
be focused on within the AI sector?
Ejaaz:
Maybe it's like the way that certain models should be designed,
Ejaaz:
or perhaps it's at the application layer that no one's talking about right now.
Ejaaz:
Because going on from our earlier part of the conversation, you just pick these
Ejaaz:
trends out really early. and I'm wondering if you see anything.
Ejaaz:
It doesn't have to be OpenRouter related. It could just be AI related.
Alex:
I've seen the models trending towards caring more about how resourceful they
Alex:
are than what knowledge they have in the bank.
Alex:
I feel like a lot of the model labs talk about this, though I don't know how many of them really deeply believe it, and I don't think it's really hit the application space yet. People will ask ChatGPT things, and if the knowledge is wrong, they think the model is stupid.
Alex:
And that's just kind of a bad way of evaluating a model. Whatever knowledge a person has, whatever they can recall at a certain time, is not a proxy for how smart they are.
Alex:
The intelligence and usefulness of a model is going to trend towards how good it is at using tools, and how good it is at paying attention to a long, long context: its total memory capacity and accuracy. So I think those two things need to be emphasized more.
Alex:
It might be that models pull all of their knowledge from online databases, from real-time scraped indices of the web, along with a ton of real-time-updating data sources: always relying on some sort of database for knowledge, but relying on their reasoning process for tool calling.
Alex:
We probably spend the plurality of our time every week on tool calling and figuring out how to make it work really well.
Alex:
Humans, the big difference between us and animals is that we're tool users and tool builders.
Alex:
And that's where human acceleration and innovation has happened.
Alex:
So how do we get models creating tools and using tools very, very effectively?
Alex:
There are very few benchmarks for this, and very little priority on it.
Alex:
There's Tau-Bench for measuring how good a model is at tool calling, and maybe a few others. There's SWE-bench for measuring how good a model is at multi-turn programming tasks, but it's very, very hard to run; for Sonnet, it could cost something like $1,000 to run it.
Alex:
And it's like the user experience for kind of like evaluating the real intelligence
Alex:
of these models is not good.
Alex:
As much as we don't have benchmarks listed on OpenRouter today, I love benchmarks. I think the app ecosystem and the developer ecosystem should spend a lot more time making very cool and interesting ones.
Alex:
Also, we will give credit grants for all the best ones. So I highly encourage it.
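The kind of tool-calling benchmark Alex is asking for (in the spirit of Tau-Bench, though far simpler) boils down to checking whether a model picks the expected tool with the expected arguments. A minimal, hypothetical harness:

```python
def score_tool_calls(cases, model):
    """Fraction of cases where the model picked the expected tool and
    arguments. `model` maps a prompt to a (tool_name, args) pair; each
    case specifies the prompt and the expected call."""
    correct = 0
    for case in cases:
        if model(case["prompt"]) == (case["tool"], case["args"]):
            correct += 1
    return correct / len(cases)
```

Real tool-calling benchmarks add multi-turn interactions and judge the tool *results*, not just the selection, but the scoring skeleton is the same.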
Ejaaz:
Well, Alex, thank you for your time today. I think we're coming up on a close
Ejaaz:
now. That was a fascinating conversation, man.
Ejaaz:
And I think your entire journey from just non-AI stuff, so OpenSea all the way
Ejaaz:
to OpenRouter has just been a great indicator of where these technologies are
Ejaaz:
progressing and more importantly, where we're going to end up.
Ejaaz:
I'm incredibly excited to see where OpenRouter goes beyond just prompt routing.
Ejaaz:
I think some of the stuff you spoke about on the data side of things is going
Ejaaz:
to be fascinating and arguably one of your bigger features. So I'm excited for future releases.
Ejaaz:
And as Josh said earlier, if GPT-5 is releasing through your platform first,
Ejaaz:
please give us some credits. We would love to use it.
Ejaaz:
But for the listeners of this show, as you know, we're trying to bring on the
Ejaaz:
most interesting people to chat about AI and Frontier Tech. We hope you enjoyed this episode.
Ejaaz:
And as always, please like, subscribe, and share it with any of your friends
Ejaaz:
who would find this interesting. And we'll see you on the next one. Thanks, folks.
