Announcing Google's Secret New AI Model With The Person Who Built It | Logan Kilpatrick
Logan:
The best image generation and editing model in the world.
Ejaaz:
It's scary how realistic this stuff is. Veo 3 has kind of killed the VFX studio.
Logan:
And this is, I think, principally enabled by vibe coding. My hope is that it actually ends up creating more opportunity for the experts and the specialists.
Josh:
How many of the tools that you build do you find are built with vibe coding?
Logan:
Almost 85% of everything that I do is vibe coded.
Ejaaz:
I remember when I first booted up a PC and I just had access to all these different wonderful applications, all within one suite. This kind of feels like that moment for AI.
Josh:
Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper.
Ejaaz:
What's happening behind the scenes?
Logan:
We crossed a quadrillion tokens, which comes after a trillion, if you haven't thought about numbers higher than a trillion before. It's what comes after a trillion, and there's no slowdown in sight.
Josh:
We have an incredibly exciting episode today because we are joined by Logan Kilpatrick. Logan is the product lead working on the Gemini platform at Google DeepMind. We have an exciting announcement to break right here today with Logan, which is the announcement of a model that we previously knew as Nano Banana. The reality is this is a brand-new image generation model coming out of Google, and you can access it today. So Logan, tell us about this brand-new model and what we need to be excited about.
Logan:
Yeah, for people who are not chronically online and seeing all the tweets and everything like that: part of the excitement over the last six months or so has been the emergence of native image generation and editing models. Historically, you would see models that could do a really good job of generating images, and they tended to be very beautiful, aesthetic images. The challenge was how you actually use these things in practice to do a lot of stuff, and that's where this editing capability is really helpful. So we started to see these models that can actually edit images: if you provide an image and then prompt it, it will actually change that image.

What's really interesting, though, is the fusion of those two capabilities with the actual base intelligence of the Gemini model. There are a lot of really cool ways in which this manifests itself, and we'll look at some examples. But it's this benefit of world knowledge: the model is smart. So as you ask it to do things, and as you ask it to make changes, it doesn't just take what you're saying at face value. It takes what you're saying in the context of its understanding of the world, its understanding of physics, its understanding of light and all this other stuff, and it makes those changes. So it's not just blindly making edits or generations; they're actually grounded in reality and the context in which they're useful.

And we can look at some examples of this. My favorite thing is actually this editing capability. So this is in AI Studio, and we'll have a link, hopefully in the show notes, that will let you do this. My friend Amar, who is on our team and drives all of our design stuff, built this, and it's called Past Forward. What you can do is put in an image of yourself, and it'll regenerate a version of yourself in this sort of Polaroid-esque vibe, following all the different trends from the last 10, 20, 30 years. So if you look at this example, this is me from the 1950s, and I'm sure I have a picture of my dad from the 1950s somewhere, or my grandpa, who looks somewhat similar to that. Here's me in the 1980s, which I love. Here's me...
Ejaaz:
Some of these facial expressions are also different. Like, you're showing your teeth more in some, and then it's a smirk in others. That's super cool.
Logan:
I like this sweater. I actually have a sweater that looks almost exactly like this 1970s one, though I don't like my hair in this 1970s one. Same with the 2000s. So one of the cool things about this new model, and one of the features I think folks are going to be most excited about, is this character consistency: as you took the original image and made the translation to this 1950s image, it actually still looks like me, which is really cool. So there are lots of these really interesting use cases. I think we'll go out with a sports card demo, where you can turn yourself into, you know, a figurine sports card, which is really cool. So lots of really interesting examples like this.

Another thing you'll notice is actually the speed, and this is where the underlying model, the code name, was Nano Banana. The actual model is built on Gemini 2.5 Flash, which is our workhorse model. It's super fast, it's super efficient, and it's priced really competitively in the market, which is awesome, so you can actually use it at scale. So this model behind the scenes, for developers and people who want to build with it, is Gemini 2.5 Flash Image, which is awesome. So this is a use case that I love, and it's a ton of fun. You can do this in the Gemini app or in AI Studio.
Ejaaz:
I mean, as you said, the character consistency just from these examples is like astounding.
Josh:
I need to give a round of applause. This has been my biggest issue when I'm generating images of myself.
Ejaaz:
Genuinely. And Josh and I are early users of, you know, Midjourney V1, and OpenAI's image generator as well. And one of our pet peeves was that they just couldn't do the most simplistic things, right? We could just say, hey, keep this photo and portrait of me exactly the same, but can you show me what I would look like with a different hairstyle, or holding a bottle of Coca-Cola instead of this martini? And they just could not do that, right? Just simple photo editing. Can you give us a bit of background on what Google did to be able to achieve this? Because, you know, I've been racking my brain over why other AI companies couldn't do this. Like, what's happening behind the scenes? Can you give us a bit of insight?
Logan:
Yeah, that's a good question. I think this actually goes back to, and I'll share another example in a second as well, but I think this goes back to the story of what happens when you build a model that has the fusion of all these capabilities together. This is a sort of parallel example, but it's another example of why building a unified model to do all this stuff, rather than a separate model that doesn't have world knowledge and all these other capabilities, is useful. The same thing is actually true on video. Part of the story, and we have a bunch of stuff coming that tells this a little more elegantly than I will right now, but part of the story of Veo 3 having these really state-of-the-art video generation capabilities, if folks have seen this, is that the Gemini models themselves have state-of-the-art video understanding capabilities.

And it's a very similar story on the image side: since the original Gemini model, with the exception of probably a couple of months in that two-and-a-half-year time horizon, we've had state-of-the-art image understanding capabilities. And there is this capability transfer, which is really interesting, as you go to do the generation step. If you can fuse those two things together in the same model, you end up being able to do things that other models aren't able to do.

And this was part of the original bet: why was the original Gemini 1.0 model built to be natively multimodal? It was built that way because the belief at the time, and I think this is turning out to be true, is that that's the path to AGI: you combine these capabilities together, and, similar to how humans have this fusion of all these capabilities in a single entity, these models should be able to do the same.
Ejaaz:
Wow. So if I were to distill what you just said, Logan: the way you've trained Gemini 2.5, and all future Google Gemini models, is in a very multimodal fashion. So basically, as it gets smarter in one particular facet, that trains, or has transferable capabilities to, other facets, whether it's image generation, video generation, or even text LLMs to some extent. I just think that's fascinating.

I'm curious, I have one question for you, which I want to hear your take on. How are you going to surface this to the regular consumer, right? Because right now, you provide all of these capabilities through an amazing suite, you know, called Google AI Studio. But if I wanted to use this in, say, an Instagram app, or my random photo-editing app, is this something that could be easily provided or sourced? Or do we need to go via some other route right now?
Logan:
Let me just digress really quickly: if any of the researchers I work with are watching this, they'll make sure I know that the capability transfer we just talked about is something you oftentimes don't get out of the box. There is some emergence where you get a little bit of it, but there's real, true research and engineering work that has to happen to make sure that capability fusion happens. It's not often that you just make the model really good at one thing and it translates. Oftentimes it actually has a negative effect: as you make the models really good at code, for example, you trade that off against something else, creative writing as a random example. So you have to do a lot of active research and engineering work to make sure that you don't lose a capability as you make another one better. But then, ultimately, if you can bring them along at the same level, they benefit from this interleaved capability together.

To answer the question about where this is going to be available: the Gemini app is the place that, by and large, most people should be going to. So if you go to gemini.google.com, there'll be a landing-page experience that showcases this new model and makes it really easy, and you can put in all your images and do tons of fun stuff like the example I was showing.

If you're a developer and you want to build something with this, in AI Studio we have this build tab, and what we were just looking at is an example of one of the applets available there. The general essence is that all of these applets can be forked and remixed and edited and modified, so you can keep doing all the things you want to do with the AI capability built in. It'll continue to be powered by the same model; it'll do all that stuff, which is awesome. So there are lots of cool fusion capabilities that we have with this, and the same is true of this other example that we're looking at.

And if you want to go outside of this environment, we have an API. You could go and build whatever you saw. If your website is, you know, AIphotos.com or whatever, you could build with the Gemini API and use the new Gemini 2.5 Flash Image model to do a bunch of this stuff, which is awesome.
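For developers who want to try that, here's a minimal sketch of what such a call could look like with the google-genai Python SDK. The model ID, file names, and response handling are assumptions based on what's discussed in the episode, not a verbatim recipe; check the official Gemini API docs for the shipped surface.

```python
# Hypothetical sketch of an image-editing call with the google-genai SDK.
# Model ID and output handling are assumptions based on the episode.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Read a source photo to edit, then ask the model for a grounded edit.
with open("portrait.png", "rb") as f:
    source = types.Part.from_bytes(data=f.read(), mime_type="image/png")

response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # the model name given in the episode
    contents=[source, "Remove the background and add a subtle 90s film grain."],
)

# Image output comes back as inline data parts alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("edited.png", "wb") as out:
            out.write(part.inline_data.data)
```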
Josh:
Awesome. So while this is baking, I noticed you had another tab open, which means maybe there's another demo that you were prepared to share.
Logan:
There is another demo. This one I actually haven't tried yet. But it's this idea of: how can you take a photo-editing experience and make it super, super simple? So I'll grab an image. Actually, we'll take this picture, which is a picture of Demis and me.
Josh:
Legends.
Logan:
We'll put an anime filter on it and we'll see. And this is a completely vibe-coded UI experience; all the code behind the scenes is vibe coded as well. And we'll see how well this works with Demis and me.
Josh:
How many of the tools that you build do you find are built with vibe coding instead of just hand-coding the software? Are you writing a lot of this as vibe-coded through the Gemini model?
Logan:
I think sometimes you're able to do some of this stuff completely vibe coded; it depends on how specific you want to get. Almost 85% of everything that I do is vibe coded. Somebody else on my team built this one, so I don't want to misrepresent the work. It could have all been human-programmed, because we have an incredible set of engineers. The general idea is: how can you make this Photoshop-like experience... oh, interesting. Let's go '90s. Or do you have suggestions? What would a good filter for this be? I don't know. Oh.
Josh:
Man, yeah. Perhaps, going back to the last example, maybe a '90s film or an '80s film grain. All right. And I guess while we wait for that to load: is there a simple way you would describe Nano Banana, or this new image model, to the average person on the street who's... oh, look, there we go, we have the film grain. Okay, so what we're watching, for the people who are listening: you're retouching. You can retouch parts of the image, you can crop and adjust, and there are filters to be applied.
Logan:
I'm just clicking through buttons, to be honest. I've never done it before, so it's been fun. Live demo, day one. This is the exploration you're going to get to do as a user as you play around with this.
Ejaaz:
Logan is vibe editing and that's what's happening. Yeah. He's experimenting.
Logan:
Vibe editing, which is fun. I love it. That's a great way to put it. And the cool thing again, what I love about this experience, is that as you're going through... oh, interesting, this one's giving me an edited outline.
Josh:
Oh, yeah, a little outline. This is helpful for our thumbnail generation. We do a lot of this stuff.
Logan:
Let's see if I can remove the background as well.
Ejaaz:
Oh, yeah. Let's see. I should be.
Josh:
If this removes the background, this is going to be trouble, because this is a big feature that we use for a lot of our imagery.
Logan:
Hopefully. Come on. Oh, nice.
Ejaaz:
Oh, done.
Josh:
Nicely done.
Ejaaz:
For those of you who are listening, he's typed in, "Put me in the Library of Congress." So we're going to hopefully see Logan.
Logan:
Yeah, the context on that image was that Demis and I were in the Library of the DeepMind office.
Ejaaz:
Oh, nice.
Logan:
Yeah, so that was the Library of Congress reference in my mind. But yeah, there's so much that you can do. Again, what I love about this experience is that as you go around and play with this stuff, if you want to modify the experience, you can do so on the left-hand side. If you say, actually, here are the five editing features that I really care about, the model will go and rewrite the code, and it'll still be attached to this new 2.5 Flash Image model. So you can do all these types of cool stuff. This experience is something I'm really excited about, that we've been pushing on.
Josh:
Yeah, this is amazing, because I do a lot of photography myself. I was a photographer in my past life, and I rely very heavily on Photoshop and Lightroom for editing, which is a very manual process. They have these smart tools, but they're not quite like this. I mean, this saves a tremendous amount of time if I can just say: hey, realign, straighten the image, remove the background, add a filter. I think the plain-English version of this makes it really approachable, but also way faster.
Logan:
Yeah, it is. It is crazy fast. I think about this all the time. There are definitely cases where you want to go deep with whatever the pro tool is. But there's actually something interesting on the near horizon that our team has thought a lot about, which is how you can take this experience and, in a generative-UI capacity, have it subtly expose additional detail to users.

I think about it like this: if you're a new Photoshop user, as an example, and you show up, the chance that you're going to use all of the bells and whistles is zero. You want, like, three things: I want to remove a background, I want to crop something, whatever it is. Don't actually show me all of these bells and whistles. The challenge with doing this in the present is that software is deterministic: building the modified version of that software for all of these different skill sets and use cases is extremely expensive. It's not feasible; it doesn't scale to production environments. The exciting thing about the progress on coding models is that, in the future, if you can have this generative-UI capability, where the model sort of knows, and as you talk to it, it realizes, oh, you might actually benefit from these other things, it can create the code to do that on the fly and expose it to you, which is really interesting. So I think there's lots of stuff that is going to be possible as the models keep getting better.
Josh:
This is amazing. So, the TLDR on this new announcement: if I were to go explain to my friend what this does and why it's special, how would you sell it to me?
Logan:
The best image generation and editing model in the world: 2.5 Flash Image, or Nano Banana, whichever you prefer, is the model that can do this. And I think there are so many creative use cases where you're actually bounded by the creative tool, and this is one of those examples where I feel like I'm 10x more capable. I was literally helping my friend yesterday with a bunch of iterations on his LinkedIn picture, because, you know, the background was slightly weird or something like that. We did about 15 iterations, and now he's got a great new LinkedIn background, which is awesome. So there are so many actual practical use cases, and I literally just built a custom tool on the fly, vibe coding, in order to solve that use case, which was a ton of fun.
Josh:
Yeah, this is so cool. Okay, so this model, Nano Banana, Gemini 2.5 Flash Image: it's out today, so we'll link it in the description for people who want to try it out.

I think one of my complaints for the longest time, and I've mentioned this on the show a few times, is that a lot of the time when I'm engaging with this incredible form of intelligence, I just have a text box. It's up to me to pull the creativity out of my own mind, and I don't get a lot of help along the way. But one of the things that you spend your time on is this thing called Google AI Studio. And I've used AI Studio a lot, because it solves a problem that was annoying for me, which is just the blank text box. It has a lot of prompts, it has a lot of helpers, it has a lot of guidance to help me extract value out of the model. So what I'd love for you to do for people who aren't familiar, Logan, is just explain to everyone what Google AI Studio is, why it's so important, and why it's so great.
Logan:
Yeah, I love this, Josh. I appreciate that you like using AI Studio. It is a labor of love; lots of people across Google have put in a ton of time to make progress on this.

So I'll make a caveat, which is: we have an entirely redesigned AI Studio experience that's coming very soon. I won't spoil it in this episode, because it's, like, half faked right now, and I think some of the features you might see in this UI might be slightly different at launch time than what you see here. So take this with a grain of salt; we've got a bunch of new stuff coming. And I think it should actually help with this problem that you're describing, which is that as you show up to a bunch of these tools today, the onus is really on you as a user to try to figure out what's possible: what all the different models are capable of, what even are all the different models, all of that stuff.

So at a high level, we built AI Studio for this AI builder audience. If you want to take AI models and actually build something with them, and not just, you know, chat with AI models, this is the product that was built for you. We have a way, in this chat UI experience, to play with the different capabilities of the model and feel what's possible. What is Gemini good at? What's it not good at? What are the different tools it has access to? As you go into AI Studio, you'll see something that looks like this. We're highlighting a bunch of the new capabilities that we have right now: this URL context tool, which is really great for information retrieval, and this native speech generation capability, which is really cool. If folks have used NotebookLM and you want to build a NotebookLM-like experience, we have an API for people who want to build something like that. And we have this live audio-to-audio dialogue experience, where you can share a screen with the model and talk to it, and it can see the things that you see and engage with them. Of course, we have our native image generation and editing model, the old version 2.0 Flash, now the new version 2.5 Flash, and lots of other stuff that's available as you experience what these models are capable of.

So really, this playground experience is one version. We have this chat prompt on the left-hand side. We have this Stream tab, where you can talk to Gemini and share your screen, and you can actually show it things on the webcam and be like, what's this? How do I use this thing? You can do this on mobile as well, which is really cool. And we have this generative media experience, if you want to build things with media: we have a music model, we have Veo, which is our video generation model, and we have all the text-to-speech stuff, which is really cool.

As I overwhelm people with so much stuff that you can do in AI Studio, the key thread of all this is that we built AI Studio to showcase a bunch of these capabilities, and everything you see in AI Studio has an underlying API and developer experience. So if you want to build something like any of these experiences, all of it is possible. There's no secret Google magic happening pretty much anywhere in AI Studio. It's all things that you could build as someone using a vibe-coding product, or, you know, by hand-writing the code. You could build all these things and even more.

And that is the perfect segue to this build tab, where we're trying to help you actually get started building a bunch of stuff. You can use the templates that we have, you can use a bunch of the suggestions, and you can look through our gallery of different stuff. In this experience we're really trying to help you build AI-powered apps, which we think is something folks are really, really excited about, and we'll have much more to share around all the AI app-building stuff in the near future.
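As a sketch of the URL context tool Logan calls out, under the assumption that it's exposed as a `url_context` tool in the google-genai SDK (verify the exact names in the current docs before relying on this):

```python
# Assumed wiring for the Gemini API URL context tool; names may differ
# from the shipped SDK surface, so treat this as illustrative only.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Compare the pricing tables on these two pages: "
             "https://example.com/plans and https://example.org/pricing",
    config=types.GenerateContentConfig(
        tools=[types.Tool(url_context=types.UrlContext())],
    ),
)
print(response.text)  # answer grounded in the fetched page contents
```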
Josh:
Awesome, thanks for the rundown. So as I'm looking at this, I'm wondering: who do you think this is for? What type of person should come to AI Studio and tinker around here?
Logan:
Yeah. So, historically, and you'll see a little bit of this transition if you play around with the product, where there are some interesting edges, we were originally focused on building for developers. So there's a part of the experience that's tied to the Gemini API, which tends to be used mostly by developers: if you go to the dashboard, you can see all your API keys and check your usage and billing and things like that.

By and large, though, I think the really cool opportunity in what's happening right now is this transition in who is creating software. And this is, I think, principally enabled by vibe coding. Because of that, we've recentered ourselves to be really focused on this AI builder persona: people who want to build things using AI tools, and also people who are trying to build AI experiences, which we think is going to be the market that creates value for the world. So if you're excited about all the things that you're seeing, if you want to build things, AI Studio is very much a builder-first platform.

If you're just looking for a great everyday AI assistant product, where you, you know, want help with coding questions or homework or life advice or all that type of stuff, the Gemini app is the right place. It's very much a DAU-type product, where you come back every day, and it has memory and personalization and all this other stuff, which makes it really great as an assistant to help you in your life. Whereas with AI Studio, the artifact is: we help you create something, and then you go put that thing into the world in some way. You don't necessarily need to come back and use it every day; you use it whenever you want to build something.
Ejaaz:
It's funny. I'm dating myself a bit here, but I remember when I first booted up a PC and loaded up Microsoft Office, and I just had access to all these different wonderful applications, that were at the time super new, all within one suite. This kind of feels like that moment for AI. And you might not take that as a compliment because it's a completely different company, but it was what I built my childhood off of, and my fascination with computers. So I appreciate this, and I love that it's this massively cohesive experience.

But kind of zooming out, Logan, I was thinking a lot about Google AI and what that means to me personally. I have to say, it's the only company where I think beyond an LLM. And what I mean by that is, when I think of Google AI, I don't just think of Gemini. I think of the amazing image gen stuff that you have. I think of the amazing video outputs that you guys have. I think of the text-to-voice generation that you just demoed, and all those kinds of things.

I remember seeing this advert that appeared on my timeline. And I remember thinking, wow, this must be the new GTA. Then I was like, no, no, that's Florida. That's Miami. No, people are doing wild stuff. That's an alien. Hang on a second. This can't be real. And then I learned that it was a Google Veo 3 generation: an advert for Kalshi, which is, you know, this prediction-markets situation. And I remember thinking, how on earth have we gotten to AI-generated video that is this high quality and this high fidelity? In my mind, Veo 3 has kind of killed the VFX studio. It's kind of killed a lot of Hollywood production studios as well. Give me a breakdown and some insight into how you guys built Veo 3, and what that means for the future of movie and video production and more.
Logan:
Yeah, that's a great question. I think there's something really interesting along these threads, and not to push back on the notion that it's killing Hollywood, but I think it's an interesting conversation. The way I have seen this play out, and the great example of this, if folks have seen it, is Flow, which is our creative video tool. If you're using Veo and you want to get the most out of Veo, Flow is the tool to do that. You see lots of creators building, you know, minute-long videos using Veo, where it's a really cohesive story with a clear visual identity, similar to what you'd get from, probably not the extent of a Hollywood production, but somebody thoughtfully choreographing a film. Flow is the product to do that. And, interestingly, Flow was built in conjunction with filmmakers.

I feel this way about vibe coding as well, and it's a thought experiment I'm always running through in my head: yes, I think AI is raising the bar, or rather raising the floor, for everyone. Now everyone can create. What does that mean for people who have expertise? I think in most cases, what it means is that the value of your expertise actually continues to go up. This is my personal bet, and I don't know how much it tracks with everyone else's worldview: expertise, in a world where the floor is lifted for everyone across all these dimensions, is actually more important.

Video production is a great example for me, because I would never have been able to make a video. It's not in the cards, for my skill set, my creative ability, my financial ability. I will never be able to make a video. But I can make things with Veo, and now I'm a little bit closer to imagining: okay, if I'm serious about this, I need to go out and actually engage with people. It's whetted my appetite in a way that wouldn't have happened otherwise; it was just too far away before.

And software is another example. If you were to pull a random person off the street and start talking to them about coding and C++ and deploying stuff and all this, their brain turns off: not interested, I don't want to learn to code, that's not cool, it's not fun, it sounds horrible. Then vibe coding rolls around, and it's like, oh, wait, I can actually build stuff, and I don't really need to understand all the details. But there's still a limit to what I can build, and who is actually well positioned to help me take the next step? I, you know, vibe code something, I think it's awesome, I share it with my friends, they all love it, and I want to go build a business around this thing I vibe coded. There's still a software engineer that needs to help make that actually happen. So if anything, it's increasing this. I mean, on the software side, there's this infinite demand for software, and it's increasing the total addressable market of what software engineers need to help people build. I think there'll be something similar on the video side.

You know, there will be downsides to AI technology in some ways. As the technology shift happens, there's some amount of disruption taking place, and someone's workflow is being disrupted. But I do think there's this really interesting thread to pull on, which is my hope that it actually ends up creating more opportunity for the experts and the specialists.
Ejaaz:
So it sounds like you're not saying VFX studio teams are going to be replaced by software engineers, but rather that those teams will become more adept at using these AI tools and products to enhance their own skill sets beyond what they are today. Is that right?
Logan:
Yeah, yeah. And I think we've already seen this play out in some ways, which is interesting. I think code has a little wider distribution than perhaps VFX, and VFX is also a space that I'm less familiar with personally. But yeah, I think this is likely what is going to play out, if I had to guess and bet.
Ejaaz:
Can you help us understand how a product like Veo 3 gets used beyond just the major Hollywood production stuff, right? Because I've seen a bunch of these videos now, and I'll be honest with you, Logan, it's scary how realistic this stuff is, right? It's everything from a high-quality triple-A game demo all the way to something that is shot like an A24 film, you know, the scenes, the cuts, the changes. I think it's awesome. I'm wondering whether that goes beyond entertainment in any way. Do you have any thoughts or ideas there?
Logan:
Yeah, that is interesting. I think one of the ones that is related, it's sort of one skip away from video generation itself, was Genie, which was our world-simulation work. If folks haven't seen this, go look up Genie 3 and you can see a video. It's mind-blowing. It's a fully playable game-world simulation. You can prompt on the go and the environment will change, and you can control it with your keyboard, similar to a game. I think that work actually translates really well to robotics, which is cool.

If folks aren't familiar with this, one of the principal reasons we don't just have robots walking around everywhere, while we do have LLMs that can do lots of useful stuff, is a data problem. There's lots of, you know, text data and other data that's representative of the intelligence of humans, and all this stuff that's available. There's actually not a lot of data that is useful for making robotics work. And I think Veo, or generally that segment of video generation, with this physics understanding and all that other stuff, could be really helpful in actually making the long tail of robotics use cases work. Then I can finally have a robot that will fold my laundry, so I don't need to spend my time doing that. But that's my outside-of-entertainment bet, as far as where that use case ends up creating value in the world.
Ejaaz:
With Veo 3, the goal is to enable humans to become a better version of themselves, a 10x, 100x better version of themselves, using these different tools. So in the example of a VFX studio, you can now kind of create much better movies. How does that apply to Genie 3, exactly? You gave the example of being able to create simulated environments, but that's to train these robots; that's to train these models. What about us? What about the flesh-and-blood humans that are out there? Can you give us some examples of where this might be applied or used?
Logan:
Yeah, that's a good question. I mean, the robot answer is that the robots will be there to help us, which is nice. So hopefully there's a bunch of stuff that you don't want to do that you'll be able to get your robot to do. Or there are industries that are dangerous for humans to operate in, where, if you can do that simulation without needing to collect a bunch of human data, I could see that being super valuable.

My initial reaction to the Genie use case: actually, two come to mind. One is entertainment, which I think will be cool. Humans want to be entertained; it's a story as old as time. I think there will be some entertainment value in a product experience like Genie. The other is actually back to a bunch of use cases where you'd want robotics to be able to do some of that work, but the robot product experience isn't actually there yet. This could be things like, you know, mining or heavy industries, things like that, where there's a safety aspect: how can you do these realistic simulation training experiences so that you don't have to physically put yourself in harm's way to understand the bounds or the failure cases? Or disaster recovery, things like that, where you don't want to have to show up at a hurricane for the first time to really understand what the environment could be like.

Being able to do those types of simulations is interesting, and building software deterministically to solve that problem would actually be really difficult and expensive, and probably isn't a large market that lots of companies are going to go after. But if you have this model that has really great world knowledge, you can throw all these random variables at it and do that type of training and simulation. So yeah, it's perhaps an interesting use case. I don't know if there's actually a plan to use it for things like that, but those are the things that come to mind.
Josh:
This is something I've been dying to ask you about, because it's something I've been fascinated by. When I watched the Genie 3 demo for the first time, it just kind of shattered my perception of where we were at, because you see it work. I saw this great demo where someone was painting the wall, we actually filmed an entire episode about this, and it retained all of the information. And one theme, as I'm hearing you describe these things, Veo 3, Genie 3, is that you are building this deep understanding of the physical world. I can't help but notice this trend: you are just starting to understand the world more and more. And I could see this when it comes to making games, as an example, where a lot of people were using Genie 3 to make these, not necessarily games, but virtual worlds that you can walk around in and interact with. And I'm wondering if you could share the long-term reasoning why, because clearly there's a reason; there's a lot of value to it. Is it from being able to create artificial data for robots, where if you can emulate the physical world, you can create data to train these robots? Is it because it creates great experiences, where perhaps we'll see AAA design studios using Genie 5 to make AAA games like Grand Theft Auto? I'm curious about the reasoning behind this urge to understand the physical world, and even emulate it.
Logan:
I had a conversation about this with Demis, who's our CEO at DeepMind and someone who's been pushing on this for a long time. I think there are two dimensions to this. It goes back to the original ethos of why DeepMind was created, and to a bunch of the initial work happening at DeepMind around reinforcement learning. If folks haven't seen this, one of the challenges of, again, making AI work is that you need a flywheel of continuing to iterate, and you need a reward function: what is the actual outcome you're trying to achieve? The thing that's interesting about these simulated environments is that it's really easy to have a constrained world, and it's really easy, or maybe "really easy" is overly ambitious, it's possible, to define a simple reward function and then infinitely scale this up.

The opposite example of this, if folks saw it, there was some work a very long time ago, and this is in the AI weeds, but there was this physical robotic hand that could manipulate a Rubik's Cube, and they were using AI to help try to solve the Rubik's Cube. And, again, the analogy for why Genie and some of this work is so interesting is: if you were to say, hey, we need all the data to make this physical robotic hand able to do this, it's actually really challenging to scale that up. You need to go and build a bunch of hands. What happens when the Rubik's Cube drops? You need some system to go and pick it back up. And you just go through the long tail of this stuff: the hand probably can't run 24 hours a day. There are all these challenges with getting the data in that environment to scale up.

These virtual environments don't have that problem. Self-driving cars are another example of this. Again, for folks who aren't familiar, there's lots of real-world data involved in self-driving cars, but there are also lots of simulated environments, where they've built simulations of the world, and this is how they can get, like, a thousand-x scale-up of this data understanding: by having these simulated environments. Robotics will be exactly the same. If you want robotics to work, it's almost 100% true that you're going to have to have these simulated environments where the robot can fall down the stairs a thousand times, and that's okay, because it's a simulated environment, and it's not actually going to fall down your stairs.

So with Genie, there is definitely an entertainment aspect to it, but I think it's more so going to be useful for these simulated environments, to help us not have to do things in the real world while still having a really good proxy of what will happen in the real world when we do them.
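To ground the reward-function point, here is a toy sketch of why simulated environments scale: a made-up one-dimensional "stairs" world, a trivial reward function, and a random stand-in policy. None of this reflects DeepMind's actual setup; it only illustrates that episodes in simulation are essentially free.

```python
# Toy simulated environment: cheap, repeatable episodes with a simple reward.
import random

def step(state: int, action: int) -> tuple[int, float, bool]:
    """Advance the toy 'stairs' world: +1 climbs a step, -1 risks a fall."""
    state += action
    if state < 0:               # the simulated robot "fell down the stairs"
        return 0, -10.0, True   # costless in simulation, expensive in reality
    if state >= 10:             # reached the top: success
        return state, 10.0, True
    return state, -0.1, False   # small time penalty encourages efficiency

def run_episode() -> float:
    """One rollout, with a random policy standing in for a learned one."""
    state, total, done = 0, 0.0, False
    while not done:
        state, reward, done = step(state, random.choice([-1, 1]))
        total += reward
    return total

# The whole point: the flywheel can spin a hundred thousand times for pennies,
# and "falling down the stairs" breaks nothing.
returns = [run_episode() for _ in range(100_000)]
print(sum(returns) / len(returns))
```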
Ejaaz:
That's pretty funny. I spent the weekend watching the World Robot Olympics, and there were some very real fails and crashes of these robots, which was pretty funny.

Okay, so when I think of Genie, it blows my mind, because I still can't get my head around how it predicts what I'm going to look at. I remember seeing this demo of someone taking a simple video of themselves walking, you know, a rainy day on a gravel path, and they stuck that into Genie 3, and they could look down and see their reflection in the puddle. So the physics was astoundingly accurate and astute. Can you give us a basic breakdown of how this works? Is there a real game engine happening in the background, or is there something deeper going on? Help us understand.
Logan:
My intuition, and we can gut-check this with folks on the research side to make sure I'm not fabricating it: if folks have an intuition for how next-token prediction works, it's that, at any given point, if you're looking through a sentence of text, for each word in that sentence there's a distribution, basically between zero and one, of how likely that word was to be the next word in the sequence. This is the basic principle of LLMs. It's why, if you ask the same question multiple times, the LLM may inherently give you a different answer, and it's why small changes in the inputs to LLMs actually change the output: because, again, it's this distribution. If you make a one-letter difference, it perhaps puts you on a branching trajectory that looks very different from the original output you got from the model.

A similar rough approximation of this is happening here, just much more computationally difficult, and I think they use a bunch of architectural differences, so it's not truly next-token prediction that's happening for the sort of...
Ejaaz:
Of, like, pixels, colors, a bunch of other things. Yeah.
Logan:
Exactly, yeah. So you can roughly map the mental model: as the model, or the figure in some environment, looks down, it has all this context about the state of the world, but it also knows what pixels precede it, et cetera. You could approximate it as loosely doing next-pixel prediction at the Genie level, which is an interesting way to think about it.
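A toy illustration of the next-token distribution described above; the vocabulary and probabilities are invented, and real models operate over subword tokens with far larger vocabularies, but the sampling mechanic, and why identical prompts can yield different outputs, is the same.

```python
# Toy next-token sampling: a probability distribution over candidate
# continuations, sampled rather than argmaxed, which is why the same
# prompt can produce different answers. All probabilities are invented.
import random

next_token_probs = {
    "blue": 0.55,     # continuing "the sky is ___"
    "clear": 0.25,
    "falling": 0.15,
    "banana": 0.05,
}

def sample(probs: dict[str, float]) -> str:
    """Draw one token according to its probability mass."""
    r, acc = random.random(), 0.0
    for token, p in probs.items():
        acc += p
        if r <= acc:
            return token
    return token  # guard against float rounding on the last entry

# Run it a few times: usually "blue", occasionally something else.
print([sample(next_token_probs) for _ in range(5)])
```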
Josh:
So, Ejaaz, one of the things you were mentioning was that it's happening much faster, right? And presumably much cheaper, because I heard this crazy stat: hundreds of trillions of tokens per month being pushed out by Gemini. It's unbelievable. And I want to get into the infrastructure that enables this, because Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper. Earlier in the show, you mentioned you have a TPU behind you. I understand TPUs are part of this solution, and I want you to walk us through how this is happening. How are we getting these quality improvements across the board, and what type of hardware or software is enabling that to happen?
Logan:
I think, one, you have to give credit to all of these infrastructure teams across Google that are making this happen. I think about this a lot: what is Google's differentiated advantage? What does our expertise lend us well to do in the ecosystem? What are the things we shouldn't do because of that, and what are the things we should do because of that? It's something I think about as somebody who builds products. One of the things that I always come back to is our infrastructure. The thing Google has been able to do time and time again is scale up multiple products to billions of users and have them work with high reliability, et cetera. That's a uniquely difficult problem, and it's an even more difficult problem in the age of AI, where the software is not deterministic, the compute footprint required to do these things is really demanding, and the models are a little bit tricky and finicky to work with sometimes. So again, our infrastructure teams have done an incredible job making that scale up.

I think the stat was: at I/O 2024, we were doing roughly 50 trillion tokens a month. At I/O 2025, I think it was 480 trillion tokens a month, if I remember correctly. And just a month or two later, and this was in the conversation I had with Demis, we crossed a quadrillion tokens, which comes after a trillion, if you haven't thought about numbers higher than a trillion before (a quadrillion is a thousand trillion). And there's no slowdown in sight. I think this is just a great reminder that so many of these AI markets and product ecosystems are still so early, and there's this massive expansion ahead. I think about my own life: how much AI do I really have in my life helping me? Not really that much, on the margin; it's, you know, maybe tens of millions of tokens a month, maximum. And you think about a future where billions of tokens are being spent on a monthly basis to help you with whatever you're doing in your professional life, your work, your personal life, whatever it is. We're still so early.

TPUs are a core part of that, because they allow us to control every layer of the hardware and software delivery, all the way down to the actual silicon that the model is running on. We can do a bunch of optimizations and customizations other people can't do, because they don't actually control the hardware itself. And there are some good examples of what this enables. One of them is that we've been at the Pareto frontier from a cost-performance perspective for a very long time. If folks aren't familiar, the Pareto frontier is this trade-off of cost and intelligence: you want to be at the highest intelligence and the lowest cost. We've been sitting on that for basically the entirety of the Gemini life cycle so far, which is really important, so people get a ton of value from the Gemini models.

Another example is long context. Again, if folks aren't familiar, there's a limit on how many tokens you can pass to a model at a given time. Gemini has had one-million or two-million-token context windows since the initial launch of Gemini, which has been awesome, and there's a bunch of research showing we could scale that all the way up to 10 million if we wanted to. That is a core infrastructure-enabled thing. There's a lot of really important research to make that work and make that possible, but it's also really difficult on the infrastructure side, and you have to be willing to do that work and pay that price. It's a beautiful outcome for us, because we have the infrastructure teams that have the expertise to do this.
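To make the Pareto frontier concrete, here's a small self-contained sketch: a point sits on the frontier only if no other point is at least as cheap and at least as smart. The model names and numbers below are invented for illustration, not real Gemini pricing.

```python
# Pareto frontier over (cost, intelligence). All models/numbers are made up.
models = {
    "model-a": {"cost_per_m_tokens": 0.10, "benchmark": 62.0},
    "model-b": {"cost_per_m_tokens": 0.30, "benchmark": 71.0},
    "model-c": {"cost_per_m_tokens": 0.35, "benchmark": 68.0},  # dominated by b
    "model-d": {"cost_per_m_tokens": 1.20, "benchmark": 83.0},
}

def pareto_frontier(models: dict) -> list[str]:
    """Keep a model only if no other model is at least as cheap AND as smart."""
    frontier = []
    for name, m in models.items():
        dominated = any(
            other != name
            and o["cost_per_m_tokens"] <= m["cost_per_m_tokens"]
            and o["benchmark"] >= m["benchmark"]
            for other, o in models.items()
        )
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(models))  # ['model-a', 'model-b', 'model-d']
```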
Josh:
Okay, Logan, one quadrillion tokens. That's a big number. We need to talk about this for a little bit, because that is an outrageously, mind-bendingly big number. And when I hear you say that number, I'm reminded of Jevons paradox: for people who don't know, that's when increased technological efficiency in using a resource leads to higher total consumption of that resource. So clearly, with these cool new TPUs, this vertically integrated stack you've built, you're able to generate tokens much more cheaply and produce a lot more of them. Hence the one quadrillion tokens. Do you see this trend continuing? Is there going to be a continued need to just produce more tokens, or will it eventually be a battle to produce smarter tokens? I guess the question I'm asking is: is the quality of the tokens more important than the quantity of the tokens? And do you see a limit where the quantity of tokens starts to fall off a cliff in terms of how valuable it is?
Logan:
Yeah, I could buy that story. And some of this, and it's something that's actually super top of mind for our teams on the Gemini model side, is this whole idea of thinking efficiency: ideally, you want to get to the best answer using the least amount of thinking possible. Same thing with humans. Take the example of taking a test: you want the shortest number of mental hops possible to get you to the answer of whatever the question was. You don't want to have to think for an hour to answer one question. And yeah, there are a bunch of odd parallels there between models and humans taking this approach.

So I do think thinking efficiency is top of mind; you don't want to just use tokens for the sake of tokens. But even if we were to reduce the number of tokens required by 10x, which would be awesome, would be a great innovation, and would make the models much more token-efficient, I do think there's a pretty low ceiling on how far that can go, specifically because of this next-token-prediction paradigm of how the models actually approach solving problems, using the token as a unit. So it's not clear to me that you'll be able to, you know, 1,000x-reduce the number of tokens required to solve a problem; it probably looks much more like 10x or something like that. And then there'll be a 10x reduction in the number of tokens required to solve a problem, and a 10,000x increase in the total amount of AI and token consumption in the world. So even if we made that reduction happen, I think the graph still looks like it's going up and to the right, for the most part.
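As back-of-envelope arithmetic for that last point (both multipliers are Logan's hypotheticals, not forecasts):

```python
# Jevons-paradox arithmetic: a per-task efficiency gain is swamped by growth
# in total usage. Both multipliers are hypotheticals from the conversation.
efficiency_gain = 10      # ~10x fewer tokens needed per problem
usage_growth = 10_000     # ~10,000x more total AI usage in the world

net_growth = usage_growth / efficiency_gain
print(f"Total token volume still grows ~{net_growth:,.0f}x")  # ~1,000x
```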
Josh:
It still keeps going. There is no wall. We have virtual data to train models on; we have tons of new tokens coming into play. There's another question I wanted to ask, which is a personal one for you, about features. Because I find that when a lot of people leave comments on the show and talk about their experience with AI, a lot of them are just using, like, ChatGPT in an app, or they have Grok on their phone. And I think Gemini has some underrated features that don't quite get enough attention. So what I'd like for you to do is maybe highlight one or two of the features you've shipped recently that you think are criminally underrated. What should people try out that not enough people are using?
Logan:
I think the one that continues to surprise me the most is Deep Research. Deep Research is the North Star for building an AI product experience.
Logan:
If folks aren't familiar with it: you can show up with a pretty ill-defined question that's very open and vague, and the model will traverse essentially hundreds or thousands of different web pages across the internet, try to accumulate enough context, and then come back to you with, initially, basically a research report. It could be a 40-page report in some cases that I've seen.
Logan:
You might hear "a 40-page report" and say that's not very useful to me, because I'm not going to read 40 pages. And I'd say you and me are exactly the same, because I'm not reading 40 pages either. There's a beautiful feature, again, if you've used NotebookLM: this audio overviews feature. The same thing actually exists inside of the Gemini app with Deep Research. You can just press that button and get a 10-, 15-minute podcast that goes through and explains all the research that's happened. You can listen to that on your commute or on a walk and not need to read 40 pages, which is awesome.
Logan:
The part of this that makes it such an interesting experience to me is, I don't know if other people have felt this before, but most AI products run into that blank-slate problem, Josh, that empty-chat-box problem. You, as the user of the product, have to put in so much work in order to get useful stuff out. I talk to people all the time who say, yeah, I use these models and they're just not useful for me. And actually, what's happening behind the scenes is that the models are super capable. They're really useful. It just requires that you give the models enough context.
Logan:
With Deep Research, there's this new, emerging prompt engineering 2.0, which is the context engineering problem: how do you get the right information in so that the model can make a decision on behalf of the user? Deep Research strikes this really nice balance of doing that context engineering for you, bringing all that context into the model's window, and then answering whatever your original question was, while principally showing you this proof of work up front.
Logan:
I think about this proof-of-work concept in AI all the time. I have so much more trust in Deep Research because as soon as I kick off that query, boom, it's already at 50 web pages. And I'm like, great, because I was never going to visit 50 web pages. There's pretty much nothing I research that way. I could be going to buy a car, and I'm going to look at fewer than 50 web pages for that thing. Or a house: I'm looking at fewer than 50 web pages. At least, that's maybe personal to me, and other people are doing more research, I don't know. But automatically I'm in awe of how much more work this thing is doing.
Logan:
Again, this is the North Star from an AI product experience standpoint, and there are so few products that have made that experience work. Every time I go back to Deep Research, I'm reminded of this. That team crushed it.
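As a toy illustration of the context-engineering pattern described above, gathering source material first and then asking the model to answer grounded in it, here is a sketch using the google-genai Python SDK; the URLs and the fetch_text helper are hypothetical stand-ins for a real retrieval step like the one Deep Research automates.

```python
# Toy sketch of "context engineering": fetch sources first, then ask.
# URLs and the fetch_text helper are illustrative placeholders.
import urllib.request

from google import genai

def fetch_text(url: str, limit: int = 4000) -> str:
    """Download a page and keep the first `limit` characters as raw context."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="ignore")[:limit]

client = genai.Client()

sources = [fetch_text(u) for u in (
    "https://example.com/car-review-1",  # placeholder source pages
    "https://example.com/car-review-2",
)]

question = "Which of these cars has the lower total cost of ownership?"
prompt = (
    "Answer using only the sources below.\n\n"
    + "\n---\n".join(sources)
    + f"\n\nQuestion: {question}"
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=prompt,
)
print(response.text)
```

The design point is the one made above: most of the value comes from what lands in the context window before the model is ever asked the question.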
Ejaaz:
And it's not just Deep Research from an LLM context that is so fascinating about Google AI. You guys have created some of the most fascinating tools to advance science, and I don't think you get enough flowers for what you've built.
Ejaaz:
Some of my favorites: AlphaFold 3 is crazy. This is the model that can predict what certain molecular structures are going to look like. And it could be applied to so many different industries, the most obvious being drug design: creating cheaper, more effective drugs that can cure a variety of different diseases. And then I was thinking about that random model you guys launched where, apparently, we could translate what dolphins were saying to us and vice versa.
Ejaaz:
Kind of stepping back from all of these examples, can you help me understand what Google's obsession with AI and science is, and why you think it's such an important area to focus on? Are we at a point now where we can advance science to infinity, or where are we right now? Are we at our ChatGPT moment, or do we have more to go?
Logan:
I'll start with a couple of cheeky answers. Demis, who is the only foundation model lab CEO to have a Nobel Prize in a science domain, which for him is chemistry, had this comment, which is actually really true: there are lots of people talking about this impact of AI on science and humanity, and there are very few, if not only one, being the DeepMind research lab, actually doing the science work.
Logan:
And I think it's this great example of it being in the culture and DNA of DeepMind. Demis is a scientist, all of these folks around DeepMind are scientists, and they want to push the science and push what's possible in this future of discovery using our models.
Logan:
I was in London a couple of weeks ago meeting with Pushmeet, who leads our science team, and hearing about the breadth of the science that's happening. DolphinGemma is a great, kind of funny example, because it's not super applicable in a lot of cases, but it's interesting to think about.
Logan:
On AlphaFold: if folks haven't watched the movie The Thinking Game, it's about the early days of Google DeepMind, and they talk about folding proteins and why this is such an interesting space.
Logan:
And I'm not a scientist, but to hit on the point really quickly of why AlphaFold is so interesting: the historical context is that for humans to fold a single protein would take many people, millions of dollars, and on the order of five years. The original impetus, and why Demis won the Nobel Prize in Chemistry for this, was that DeepMind was able to figure it out using reinforcement learning and other techniques. They folded every protein in the known universe, millions of proteins, released them publicly, and made them available to everyone.
Logan:
And it dramatically accelerated the advancement of human medicine and a bunch of other domains and disciplines.
Logan:
And now, with Isomorphic Labs, which is part of DeepMind, they're actually pursuing some of the breakthroughs they found and doing drug discovery and things like that.
Logan:
So, practically overnight, you see hundreds of thousands of human-years and hundreds of millions of dollars of research and development costs saved through a single innovation. And I think we're going to continue to see that acceleration of new stuff happening.
Logan:
A recent example of this is AlphaEarth, our geospatial model that came out, which fuses the Google Earth Engine with AI and this understanding of the world. There's just so much cool science, and so much is possible, when you layer the AI capability onto all these disciplines.
Logan:
So to answer the question: I think we're going to see this acceleration of science progress, and I think DeepMind is going to continue to be at the forefront of it, which is really exciting.
Logan:
And the cool thing, even for people who aren't in science, is that all of that innovation and those research breakthroughs feed back into the mainline Gemini model. We had a bunch of research work about doing proofs for math, and you might say, oh, that's not very interesting at face value. But that research fuels back into the mainline Gemini model: it makes it better at reasoning and better able to understand really long and difficult problems, which then benefits every agent use case that exists, because the models are better at reasoning through all these difficult problem domains.
Logan:
So there is this really cool research-to-reality, science-to-practical-impact flywheel that happens at DeepMind.
Ejaaz:
As a former biologist, this warms my heart. It's amazing to see this get applied at such scale.
Ejaaz:
Okay, we can't talk about Google AI without talking about search. This is your bread and butter, right? However, I've personally noticed a shift in my habits. I've used a computer for decades now, and I've always used Google Search to find things, Google Chrome, whatever it might be. But I've now started to cheat on this feature. I have started using LLMs directly to do all my searching for me, to get all my sources for me.
Ejaaz:
And you've got to be thinking about this, Logan, right? Is this eating the search business? Is this aiding the search business? Or are we creating a whole different form factor here? What are your thoughts?
Logan:
There's an interesting form factor discussion. On one hand, the AI answer market definitely feels distinctly different from the search market, to a certain degree. We've seen lots of AI products reach hundreds of millions of users, and search continues to be a great business, with billions of people using it and all that stuff.
Logan:
There's also this interesting question, which is: what is the obligation of Google in this moment, with this platform shift and all this innovation that's happening?
Logan:
And, you know, as somebody who doesn't work on search, but who is a fan of all the work happening inside of Google and has empathy for the folks building these products, it is really interesting. My perspective has always been that search, as the front door to the internet, has this stewardship position that means it actually can't disrupt itself, for the right reasons, at the same pace that small players in the market can.
Logan:
And my assertion has always been that this is actually the best thing for the world. The best thing for the world, for the internet, and for this entire economy that Google has enabled through the internet, bringing people to websites and all that, is not to have search, on day one of the LLM revolution, all of a sudden become a fully LLM-powered product that feels and looks completely different. Not only would that throw users who are still trying to figure out how to use this technology, how they should be engaging with it, and what it works well for and doesn't work for, into a bad place from a user-journey perspective, but I think it would also have impacts on the people who rely on Google from a business perspective.
Logan:
So I think you've seen this gradual transition, with lots of shots on goal and lots of experiments happening on the search side. And I think we're now getting to the place where they have confidence that they can do this in a way that is going to be super positive for the ecosystem and is going to create lots of value for the people using these products.
Logan:
The understanding of AI technology has increased, adoption has increased, the models have gotten better, hallucinations have gone down, and all this stuff. And I think there will also be some uniquely search things that only search can do. I've spent a bunch of time with folks on the search team, like Robby Stein, as an example, who leads all the AI work in search.
Logan:
There's all of this infrastructure that search has built, which matters as you think about this age of AI, where the cost of generating content that looks somewhat plausible has basically gone to zero. It's very easy to do that. Great search is actually at a premium; it's more important than ever. There's going to be a million-x, or a thousand-x, or whatever-x growth in content on the internet. How do you actually get people to the most relevant content, from people who have authority, who have expertise, and all this stuff? It's a really difficult problem, and it is the problem of the decade. Search has been solving it for the last 20 years, and it's now more important than ever.
Logan:
So I've never been more excited for the search team, and I think they've never had a bigger challenge ahead of them as they figure out how to make these internet-scale systems they've built continue to scale up to solve this next generation of problems, while also becoming this frontier AI product experience where billions of people are experiencing AI for the first time, in a different way than they've done before.
Logan:
There are so many interesting use cases, too. Image search is a great example of this new behavior: one of the fastest-growing ways people use search now is showing up with an image and asking questions about it. The way people traditionally used search has already changed; it's different than it was five years ago, or even two years ago. And I think we're going to continue to see that happen. I think the search product you see today will evolve to have things like multi-line text input fields as user questions change and all that stuff. So there's so much cool stuff on the horizon for search that I'm really excited about.
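For developers, the closest analogue to that image-first behavior is multimodal prompting, where the question arrives as an image plus text. A hedged sketch with the google-genai Python SDK might look like the following; this is the public Gemini API rather than Google Search's internal stack, and the file name and prompt are hypothetical.

```python
# Hedged sketch: ask a question about a local image.
# Uses the public Gemini API, not Search internals; file name is hypothetical.
from google import genai
from google.genai import types

client = genai.Client()

with open("storefront.jpg", "rb") as f:  # hypothetical local photo
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "What kind of store is this, and what products are on display?",
    ],
)
print(response.text)
```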
Josh:
Yeah, as I'm hearing you describe all of these cool new things, particularly everything funneling into a single model, the science breakthroughs are unbelievable. And I think that's what gets me personally really excited, like Ejaaz: this is actually going to help people. This is going to make a difference in people's lives. Right now it's a productive thing, it's a fun thing, it's a creative thing, there are a lot of tools. But then there's also the science part, and a lot of this funnels down to one amazing model.
Josh:
I think it leaves us in a really exciting place to wrap up this conversation. So Logan, thank you so much for coming and sharing all of this: sharing the news about the new model, and sharing all of the updates and progress that you're making everywhere else. I really enjoyed the conversation.
Josh:
You also have a podcast called Around the Prompt. Is there anything you want to leave listeners with, to go check that out, or the new AI Studio, or the new AI model? Let us know what you have interesting going on in your life.
Logan:
I love seeing feedback about AI Studio. So if you have things that don't work that you wish worked, even for both of you, please send them to me. I would love to make things better. For the new model as well: I think these are still the early days of what this model is going to be capable of. So if folks have feedback on edge cases or use cases that don't work well, please reach out to our team. Send us examples on X, or Twitter, or the like. We'd love to help make some of those use cases come to life.
Logan:
And I appreciate both of you for all the thoughtful questions and for the conversation. This was a ton of fun. We've got to do it again sometime.
Josh:
Awesome. Yeah, we'd love to. Anytime, please come and join us. We really enjoyed the conversation. So thank you so much for watching. For the people who enjoyed it, please don't forget to like it, share it with your friends, and do all the good things, and we'll be back again for another episode soon. Thank you so much.
Josh:
I have a fun little bonus for those of you still listening all the way to the end, the real fans. When we were first going to record with Logan, we actually had no idea that he would break the exclusive news of Nano Banana on our show. It was super cool, so we wanted to restructure the episode to prioritize that at the front.
Josh:
We did record a separate intro where I said, hey, Google makes some really good stuff. In fact, you guys have an 80-something percent chance of being the best model in the world by the end of this month. Can you explain to us why Google is so amazing at what they do? And this was the answer to that question. So here's a nice little nugget for the end to take you out of the episode. Thanks for listening. I really hope you enjoyed it, and we'll see you guys in the next one.
Logan:
My general worldview on why Google is in such a good place for AI right now: there are many layers to this, depending on what vantage point you want to look at it from.
Logan:
I think on one hand, search is this incredible part of the story. People have historically looked at Google Search as this legacy Google product, but search is going through a transition. Actually, it was just announced earlier today, as we're recording this, that AI Mode is rolling out to 180-plus countries, English supported right now and hopefully other languages in the future. And it's a great example: AI Overviews, and AI Overviews double-clicking into AI Mode, is the product that, for billions of people around the world, is the first AI product experience they actually touch.
Logan:
And I think there's something really interesting in how Google has been on this mission of deploying AI. Some naysayers on Twitter will say, you know, Google created the transformer and then did nothing with it. That's actually very far from the truth: search has been powered by the transformer, the architecture behind language models and Gemini, for the last seven years. The product experience maybe looks slightly different today than it did then, but Google has been an AI-first company for as long as I can remember; basically as long as AI has existed, that's been the case. And now we're seeing more and more of these product surfaces become frontier AI products as Google builds the infrastructure to make that the case.
Logan:
I think people also forget that it's not easy, logistically, to deploy AI to billions of people around the world. Google has something like five or six products with a billion-plus users each. Making even a small AI product work today, if anyone has played around with stuff or tried vibe coding something, is not easy; doing that at the billion-user scale is very difficult.
Logan:
So I continue to be more and more bullish. And part of what allows us to do that billion-user-scale deployment is the whole infrastructure story. If you're watching on video, I don't know if you can see, but I have a couple of TPUs sitting behind me. That TPU advantage, which is our sort of equivalent to GPUs, is something I think is going to continue to play out. So there are so many things that I get excited about, and the future is looking very bright.
