Announcing Google's Secret New AI Model With The Person Who Built It | Logan Kilpatrick

Logan:
The best image generation and editing model in the world.

Ejaaz:
It's scary how realistic this stuff is. VO3 has kind of like killed the VFX studio.

Logan:
And this is, I think, principally enabled by vibe coding. My hope is that it actually ends up creating more opportunity for the experts and the specialists.

Josh:
How many of the tools that you build do you find are built with vibe coding?

Logan:
Almost 85% of everything that I do is vibe coded.

Ejaaz:
I remember when I first booted up a PC and I just had access to all these different wonderful applications all within one suite. This kind of feels like that moment for AI.

Josh:
Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper.

Ejaaz:
What's happening behind the scenes?

Logan:
We crossed a quadrillion tokens, which comes after a trillion, in case you haven't thought about numbers higher than a trillion before. And there's no slowdown in sight.

Josh:
We have an incredibly exciting episode today because we are joined by Logan Kilpatrick. Logan is the product lead working on the Gemini platform at Google DeepMind. We have an exciting announcement to break right here today with Logan, which is the announcement of a model that we previously knew as Nano Banana. The reality is this is a brand new image generation model coming out of Google, and you can access it today. So Logan, tell us about this brand new model and what we need to be excited about.

Logan:
Yeah, for people who are not chronically online and seeing all the tweets: part of the excitement over the last six months or so has been the emergence of native image generation and editing models. Historically, you would see models that could do a really good job of generating images, and they tended to be very beautiful, aesthetic images. The challenge was how you actually use these things in practice to do a lot of stuff; that's where this editing capability is really helpful. So we started to see models that could actually edit images: if you provided an image and prompted it, it would actually change that image.

What's really interesting, though, is the fusion of those two capabilities with the actual base intelligence of the Gemini model. There are a lot of really cool ways in which this manifests itself, and we'll look at some examples. But it's this benefit of world knowledge: the model is smart. As you ask it to do things and make changes, it doesn't just take what you're saying at face value. It takes it in the context of its understanding of the world, its understanding of physics, its understanding of light, and all this other stuff, and it makes those changes. So it's not just blindly making edits or generations; they're actually grounded in reality and in the context in which that's useful.

And we can look at some examples of this. My favorite thing is actually this editing capability. So this is AI Studio, and we'll have a link, hopefully in the show notes, that will let us do this. My friend Amar, who is on our team and drives all of our design work, built this. It's called Past Forward: you can put in an image of yourself, and it'll regenerate a version of yourself in this sort of Polaroid-esque vibe, following all the different trends from the last 10, 20, 30 years. So if you look at this example, this is me from the 1950s, and I'm sure I have a picture of my dad, or my grandpa, who looks somewhat similar to that, from the 1950s somewhere. Here's me in the 1980s, which I love. Here's me.

Ejaaz:
Some of these facial expressions are also different. Like you're showing your

Ejaaz:
teeth more in some and then it's a smirk in others. That's super cool.

Logan:
I like this sweater. I actually have a sweater that looks almost exactly like this 1970s one, though I don't like my hair in it. Same with the 2000s.

So one of the cool things about this new model, and one of the features I think folks are going to be most excited about, is character consistency: as you took the original image and made the translation to this 1950s image, it actually still looks like me, which is really cool. There are lots of these really interesting use cases. I think we'll go out with a sports card demo where you can turn yourself into a figurine sports card, which is really cool. So lots of really interesting examples like this.

Another thing you'll notice is the speed. The code name was Nano Banana, but the actual model is built on Gemini 2.5 Flash, which is our workhorse model. It's super fast, it's super efficient, and it's competitively priced in the market, which is awesome, so you can actually use it at scale. So this model behind the scenes, for developers and people who want to build with it, is Gemini 2.5 Flash Image, which is awesome. This is a use case that I love, and it's a ton of fun. You can do this in the Gemini app or in AI Studio.

Ejaaz:
I mean, as you said, the character consistency just from these examples is like astounding.

Josh:
I need to give a round of applause. This has been my biggest issue when I'm generating images of myself.

Ejaaz:
Genuinely. And Josh and I are early users of, you know, Midjourney V1 and OpenAI's image generator as well.

Ejaaz:
And one of our pet peeves was it just couldn't do the most simplistic things, right? We could say, hey, keep this photo and portrait of me exactly the same, but can you show me what I would look like with a different hairstyle, or holding a bottle of Coca-Cola instead of this martini? And it just could not do that, right? Just simple photo editing.

Can you give us a bit of background on what Google did to be able to achieve this? Because I've been racking my head around why other AI companies couldn't do this. What's happening behind the scenes? Can you give us a bit of insight?

Logan:
Yeah, that's a good question. I'll share another example in a second as well.

Logan:
But I think this goes back to this story of what happens when you build a model that has the fusion of all these capabilities together.

Logan:
This is a sort of parallel example, but it's another illustration of why it's useful to build a unified model that does all this stuff, rather than a separate model that doesn't have world knowledge and all these other capabilities. The same thing is actually true on video.

Logan:
We have a bunch of stuff coming that tells this story a little more elegantly than I will right now. But part of the story of VO3 having these really state-of-the-art video generation capabilities, if folks have seen this, is that the Gemini models themselves have state-of-the-art video understanding capabilities.

Logan:
And it's a very similar story on the image side: since the original Gemini model, with the exception of probably a couple of months in that two-and-a-half-year time horizon, we've had state-of-the-art image understanding capabilities.

Logan:
And I think there is this capability transfer, which is really interesting as you go to do the generation step. If you can fuse those two things together in the same model, you end up being able to do things that other models aren't able to do.

Logan:
And this was part of the original bet: the original Gemini 1.0 model was built to be natively multimodal. It was built that way because the belief at the time, and I think this is turning out to be true, was that the path to AGI involves combining these capabilities together. Similar to how humans have this fusion of all these capabilities in a single entity, these models should as well.

Ejaaz:
Wow. So if I were to distill what you just said, Logan: you've trained Gemini 2.5, and all future Google Gemini models, in a very multimodal fashion. So as it gets smarter in one particular facet, those capabilities transfer to other facets, whether it's image generation, video generation, or even text LLMs to some extent. I just think that's fascinating.

Ejaaz:
I'm curious, I have one question for you, which I want to hear your take on. How are you going to surface this to the regular consumer? Because right now, you provide all of these capabilities through an amazing suite called Google AI Studio. But if I wanted to use this in, say, an Instagram app, or some random photo editing app, is this something that could be easily provided or sourced? Or do we need to go via some other route right now?

Logan:
Let me just diverge really quickly: if any of the researchers I work with are watching this, they'll make sure I know that the capability transfer we just talked about is something you often don't get out of the box. There is some emergence where you get a little bit of it, but there's real, true research and engineering work that has to happen to make sure that capability fusion happens.

Logan:
It's not often that you just make the model really good at one thing and it translates. Oftentimes it actually has a negative effect: as you make the models really good at code, for example, you trade that off against something else, creative writing as a random example. So you have to do a lot of active research and engineering work to make sure you don't lose a capability as you make another one better. But ultimately, if you can get them to the same level, they benefit from this interleaved capability together.

Logan:
To answer the question about where this is going to be available: the Gemini app is the place that, by and large, most people should be going to. If you go to gemini.google.com, there'll be a landing page experience that showcases this new model and makes it really easy. And you can put in all your images and do tons of fun stuff like the example I was showing.

Logan:
If you're a developer and you want to build something with this, in AI Studio we have this Build tab, and what we were just looking at is an example of one of the applets available there. The general essence is that all of these applets can be forked, remixed, edited, and modified, so you can keep doing all the things you want to do with the AI capability built in. It'll continue to be powered by the same model, which is awesome. So there are lots of cool fusion capabilities here.

Logan:
Same thing with this other example we're looking at. If you want to go outside of this environment, we have an API. You could go and build whatever you saw: if your website is, you know, AIphotos.com or whatever, you could go and build with the Gemini API and use the new Gemini 2.5 Flash Image model to do a bunch of this stuff, which is awesome.
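For readers curious what that developer path looks like in practice, here is a minimal sketch of how an image-edit request to the model might be assembled. This is illustrative only: the model id `"gemini-2.5-flash-image"`, the `build_edit_request` helper, and the request shape are assumptions based on what's said in the episode, not official API documentation.

```python
# Hypothetical sketch of an image-edit request for the Gemini API.
# The model id and the request shape below are assumptions, not official docs.

MODEL_ID = "gemini-2.5-flash-image"  # assumed id for the model code-named Nano Banana


def build_edit_request(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Assemble a multimodal edit request: a plain-English instruction plus the source image."""
    return {
        "model": MODEL_ID,
        "contents": [
            {"text": prompt},  # the edit instruction, e.g. "remove the background"
            {"inline_data": {"mime_type": mime_type, "data": image_bytes}},
        ],
    }


# Build a request that asks the model to edit an image in place.
request = build_edit_request("Remove the background", b"<png bytes here>")
```

In a real integration you would hand a payload like this to a Gemini API client and read the edited image back out of the response; the exact client calls depend on the SDK you use.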

Josh:
Awesome. So while this is baking, I noticed you had another tab open, which means maybe there's another demo that you were prepared to share.

Logan:
There is another demo. This one I actually haven't tried yet. But it's this idea of, how can you take a photo editing experience and make it super, super simple? So I'll grab an image. Actually, we'll take this picture, which is a picture of Demis and me.

Josh:
Legends.

Logan:
We'll put an anime filter on it and we'll see. This is a completely vibe coded UI experience, and all the code behind the scenes is vibe coded as well. We'll see how well this works with Demis and me.

Josh:
How many of the tools that you build do you find are built with vibe coding instead of just hand-coding software? Are you writing a lot of this as vibe coded through the Gemini model?

Logan:
Sometimes you're able to do some of this stuff completely vibe coded; it depends on how specific you want to get. Almost 85% of everything that I do is vibe coded. Somebody else on my team built this one, so I don't want to misrepresent the work; it could have all been human programmed, because we have an incredible set of engineers. The general idea is: how can you make this Photoshop-like experience? Let's go '90s. Or do you have suggestions? What would a good filter for this be? I don't know. Oh.

Josh:
Man, yeah. Perhaps, going back to the last example, maybe a '90s film or an '80s film grain. All right. And I guess while we wait for that to load, is there a simple way you would describe Nano Banana, or this new image model, to the average person on the street? Oh look, there we go, we have the film grain. Okay, so for the people who are listening: you can retouch parts of the image, you can crop and adjust, and there are filters to be applied.

Logan:
I'm just clicking through buttons, to be honest. I've never done this before, so it's been fun.

Logan:
Live demo, day one. This is the exploration you're going to get to do as a user as you play around with this.

Ejaaz:
Logan is vibe editing and that's what's happening. Yeah. He's experimenting.

Logan:
Vibe editing, which is fun. I love it. That's a great way to put it. And again, what I love about this experience is, as you're going through... oh, interesting, this one's giving me an edited outline.

Josh:
Oh, yeah, a little outline. This is helpful for our thumbnail generation. We do a lot of this stuff.

Logan:
Let's see if I can remove the background as well.

Ejaaz:
Oh, yeah. Let's see. It should be.

Josh:
If this removes the background, this is going to be trouble, because this is a big feature that we use for a lot of our imagery.

Logan:
Hopefully. Come on. Oh, nice.

Ejaaz:
Oh, done.

Josh:
Nicely done.

Ejaaz:
For those of you who are listening, he's typed in, "put me in the Library of Congress." So we're going to hopefully see Logan.

Logan:
Yeah, the context on that image was that Demis and I were in the library of the DeepMind office.

Ejaaz:
Oh, nice.

Logan:
Yeah, so that was the Library of Congress reference in my mind. But yeah, there's so much that you can do.

Logan:
Again, what I love about this experience is that as you go around and play with this stuff, if you want to modify the experience, you can do so on the left-hand side. If you say, actually, here are the five editing features that I really care about, the model will go and rewrite the code, and it'll still be attached to this new 2.5 Flash Image model. So you can do all these types of cool stuff. This experience is something I'm really excited about that we've been pushing on.

Josh:
Yeah, this is amazing, because I do photography a lot. I was a photographer in my past life, and I rely very heavily on Photoshop and Lightroom for editing, which is a very manual process. They have these smart tools, but they're not quite like this. I mean, this saves a tremendous amount of time if I could just say, hey, realign, reframe the image, remove the background, add a filter. I think the plain-English version of this makes it really approachable, but also way faster.

Logan:
Yeah, it is crazy fast. I think about this all the time. There are definitely cases where you want to go deep with whatever the pro tool is.

Logan:
I think there's actually something interesting on the near horizon that our team has thought a lot about, which is how you can have this experience and, in a generative UI capacity, have it subtly expose additional detail to users.

Logan:
And I think about it like this: if you're a new Photoshop user, as an example, and you show up, the chance that you're going to use all of the bells and whistles is zero. You want, like, three things: I want to remove a background, I want to crop something, whatever it is. Don't actually show me all of these bells and whistles. I think that's the exciting thing about the progress on coding models. The challenge with doing this in the present is that software is deterministic: having to build the modified version of that software for all of these different skill sets and use cases is extremely expensive. It's not feasible.

Logan:
It doesn't scale to production environments. But if you can have this generative UI capability, where as you talk to the model it realizes, oh, you might actually benefit from these other things, it can create the code for that on the fly and expose it to you, which is really interesting.

Logan:
So I think there's lots of stuff that is going to be possible as the models keep getting better.

Josh:
This is amazing. So, the TLDR on this new announcement: if I were to go explain to my friend what this does and why it's special, how would you sell it to me?

Logan:
The best image generation and editing model in the world. 2.5 Flash Image, or Nano Banana, whichever you prefer, is the model that can do this. And I think there are so many creative use cases where you're actually bounded by the creative tool. This is one of those examples where I feel like I'm 10x more capable.

Logan:
I was literally helping my friend yesterday do a bunch of iterations on his LinkedIn picture, because the background was slightly weird or something like that. I did like 15 iterations, and now he's got a great new LinkedIn background, which is awesome.

Logan:
So there are so many actual practical use cases. And I literally just built a custom tool on the fly, vibe coding, in order to solve that use case, which was a ton of fun.

Josh:
Yeah, this is so cool. Okay, so this model, Nano Banana, aka Gemini 2.5 Flash Image, is out today.

Josh:
So we'll link that in the description for people who want to try it out.

Josh:
I think one of my complaints for the longest time, and I've mentioned this on the show a few times, is that a lot of the time when I'm engaging with this incredible form of intelligence, I just have a text box. It's up to me to pull the creativity out of my own mind, and I don't get a lot of help along the way. But one of the things that you spend your time on is this thing called Google AI Studio. And I've used AI Studio a lot, because it solves a problem for me that was annoying, which is just the blank text box. It has a lot of prompts, a lot of helpers, a lot of guidance to help me extract value out of the model. So what I'd love for you to do, Logan, for people who aren't familiar, is just explain to everyone what Google AI Studio is, why it's so important, and why it's so great.

Logan:
Yeah, I love this, Josh. I appreciate that you like using AI Studio. It is a labor of love. Lots of people across Google have put in a ton of time to make progress on this.

Logan:
I'll make a caveat, which is that we have an entirely redesigned AI Studio experience coming very soon. I won't spoil it in this episode because it's half faked right now, and I wish I could show it. Some of the features you see in this UI might be slightly different at launch time than what you see here, so take it with a grain of salt. We've got a bunch of new stuff coming.

Logan:
And I think it should actually help with this problem you're describing, which is that as you show up to a bunch of these tools today, the onus is really on you as a user to figure out what all the different models are capable of, or even what all the different models are, all of that stuff.

Logan:
So at a high level, we built AI Studio for this AI builder audience. If you want to take AI models and actually build something with them, and not just chat with AI models, this is the product that was built for you. We have a way, in this chat UI experience, to play with the different capabilities of the model and feel what's possible. What is Gemini good at? What's it not good at? What are the different tools it has access to? As you go into AI Studio, you'll see something that looks like this.

Logan:
We're highlighting a bunch of the new capabilities that we have right now: this URL context tool, which is really great for information retrieval, and this native speech generation capability, which is really cool. If folks have used NotebookLM and want to build a NotebookLM-like experience, we have an API for people who want to build something like that. And we have this live audio-to-audio dialogue experience, where you can share a screen with the model and talk to it, and it can see the things that you see and engage with them. Of course, we have our native image generation and editing model, the old version 2.0 Flash and now the new version 2.5 Flash, and lots of other stuff that's available as you experience what these models are capable of.

Logan:
So really, this playground experience is one version. We have this chat prompt on the left-hand side, and we have this Stream tab, where you can talk to Gemini and share your screen. You can actually show it things on the webcam and ask, what's this? How do I use this thing? You can do this on mobile as well, which is really cool. And we have this generative media experience if you want to build with those models: we have a music model, we have VO, which is our video generation model, and we have all the text-to-speech stuff, which is really cool.

Logan:
At the risk of overwhelming people with all the stuff you can do in AI Studio: the key thread of all this is that we built AI Studio to showcase a bunch of these capabilities. And everything you see in AI Studio has an underlying API and developer experience. So if you want to build something like any of these experiences, all of it is possible. There's no secret Google magic happening pretty much anywhere in AI Studio.

Logan:
It's all things that you could build, whether using a vibe coding product or by hand-writing the code. You could build all these things and even more. And that is the perfect segue to this Build tab, where we're also trying to help you actually get started building a bunch of stuff. You can use the templates that we have, you can use a bunch of the suggestions, and you can look through our gallery of different stuff. In this experience, we're really trying to help you build AI-powered apps, which we think is something folks are really, really excited about. And we'll have much more to share around all the AI app building stuff in the near future.

Josh:
Awesome, thanks for the rundown. So as I'm looking at this, I'm wondering: who do you think this is for? What type of person should come to AI Studio and tinker around here?

Logan:
Yeah, so historically, and you'll see a little bit of this transition if you play around with the product, where there are some interesting edges, we were originally focused on building for developers. There's a part of the experience that's tied to the Gemini API, which tends to be used mostly by developers: if you go to the dashboard, you can see all your API keys and check your usage and billing and things like that. By and large, though, I think the really cool opportunity of what's happening right now is this transition in who is creating software.

Logan:
And this is, I think, principally enabled by Vibe Coding.

Logan:
And because of that, we've recentered ourselves to be really focused on this AI builder persona: people who want to build things using AI tools, and also people who are trying to build AI experiences, which we think is going to be the market that creates value for the world.

Logan:
So if you're excited about all the things you're seeing, and you want to build things, AI Studio is very much a builder-first platform.

Logan:
If you're just looking for a great everyday AI assistant product, where you want help on coding questions or homework or life advice or all that type of stuff, the Gemini app is the right place. It's very much a DAU-type product, where you come back and it has memory and personalization and all this other stuff, which makes it really great as an assistant to help you in your life. Versus AI Studio, where the artifact is: we help you create something, and then you go put that thing into the world in some way. You don't necessarily need to come back and use it every day; you use it whenever you want to build something.

Ejaaz:
It's funny. I'm dating myself a bit here, but I remember when I first booted up a PC and loaded up Microsoft Office, and I just had access to all these different wonderful applications, which were super new at the time, all within one suite. This kind of feels like that moment for AI. And you might not take that as a compliment because it's a completely different company, but it's what I built my childhood on, and my fascination with computers. So I appreciate this, and I love that it's this massively cohesive experience.

Ejaaz:
But kind of zooming out, Logan, I was thinking a lot about Google AI and what

Ejaaz:
that means to me personally.

Ejaaz:
I have to say, it's the only company where I think beyond an LLM.

Ejaaz:
And what I mean by that is when I think of Google AI, I don't just think of Gemini.

Ejaaz:
I think of the amazing image gen stuff that you have.

Ejaaz:
I think of the amazing video outputs that you guys have. I think of the text-to-voice

Ejaaz:
generation that you just demoed and all those kinds of things.

Ejaaz:
I remember seeing this advert that appeared on my timeline.

Ejaaz:
And I remember thinking, wow, this must be the new GTA. Then I was like,

Ejaaz:
no, no, that's Florida. That's Miami.

Ejaaz:
No, people are doing wild stuff. That's an alien. Hang on a second. This can't be real.

Ejaaz:
And then I learned that it was a Google VO3 generation of an advert for Kalshi,

Ejaaz:
which is like this, you know, prediction markets situation.

Ejaaz:
And I remember thinking, how on earth have we got to AI generated video that

Ejaaz:
is this high quality and this high fidelity?

Ejaaz:
I think in my mind, VO3 has kind of like killed the VFX studio.

Ejaaz:
It's kind of killed a lot of Hollywood production studios as well.

Ejaaz:
Give me a breakdown and some insight into how you guys built VO3, and what that means for the future of movie and video production, and more.

Logan:
Yeah, that's a great question. I think there's something really interesting along these threads, not to push back on the notion that it's killing Hollywood, because I think it's an interesting conversation. The way I have seen this play out, and the great example of this that folks have seen, is Flow, which is our creative video tool.

Logan:
If you're using VO and you want to get the most out of it, Flow is the tool to do that. You see lots of creators building minute-long videos using VO with a really cohesive story and a clear visual identity, similar to what you'd get from, probably not the extent of a Hollywood production, but somebody thoughtfully choreographing a film. Flow is the product to do that. And interestingly, Flow was built in conjunction with filmmakers.

Logan:
And I feel this way about vibe coding as well. It's this thought experiment I'm always running through in my head: yes, I think AI is raising the floor for everyone. Now everyone can create. What does that mean for people who have expertise? I think in most cases, what it means is that the value of your expertise actually continues to go up. And this is my personal bet.

Logan:
I don't know how much this tracks with everyone else's worldview, but my personal bet is that in a world where the floor is lifted for everyone across all these dimensions, expertise is actually more important. And I think video production is a great example for me, because I would never have been able to make a video.

Logan:
It's not in the cards for my skill set, my creative ability, my financial ability; I would never be able to make a video. But I can make things with VO. And now I'm a little bit closer to imagining, okay, if I'm serious about this, I need to go out and actually engage with people. It's whetted my appetite in a way that I don't think would have happened otherwise; it was just too far away before.

Logan:
And I think software is another example where Vibe Coding, if you were to pull

Logan:
a random person off the street and you start talking to them about coding and

Logan:
seeing C++ and deploying stuff and all this, they're like, brain turns off, not interested.

Logan:
I don't want to learn to code. That's not cool. It's not fun. It sounds horrible.

Logan:
And then Vibe Coding rolls around and it's like, oh, wait, I can actually build

Logan:
stuff. And like, yeah, I don't really need to understand all of the details.

Logan:
But there's still a limit to what I can build and who is actually well positioned

Logan:
to help me take the next step.

Logan:
Like I, you know, vibe code something. I'm like, this is awesome.

Logan:
I share with my friends. They all love it.

Logan:
I want to, you know, go build a business around this thing that I vibe coded.

Logan:
There's still a software engineer that needs to help make that thing actually

Logan:
happen. So if anything, it's like it's increasing this.

Logan:
I mean, on the software side, there's this infinite demand for software,

Logan:
and it's increasing the total addressable market of like what software engineers

Logan:
need to help people build.

Logan:
I think there'll be something similar on the video side.

Logan:
You know, there will be downsides to AI technology in some ways.

Logan:
I think there is like as the technology shift happens, there is some amount

Logan:
of disruption that's taking place.

Logan:
And like someone's workflow is being disrupted. But I do think there's this

Logan:
really interesting thread to pull on, which is my hope is that it actually ends

Logan:
up creating more opportunity for the experts and the specialists.

Ejaaz:
So it sounds like you're not saying VFX studio teams are going to be replaced by software engineers, but rather that the team itself will become more adept at using these AI tools and products to enhance their own skill set beyond what it is today. Is that right?

Logan:
Yeah, and I think we've seen this already play out in some ways, which is interesting. I think code has a little bit wider distribution than VFX, and VFX is also a space I'm less familiar with personally. But yeah, I think this is likely what's going to play out, if I had to guess and bet.

Ejaaz:
Can you help us understand how a product like Veo 3 gets used beyond just the major Hollywood productions? Because I've seen a bunch of these videos now, and I'll be honest with you, Logan, it's scary how realistic this stuff is. It's everything from a high-quality AAA game demo all the way to something that's shot like an A24 film: the scenes, the cuts, the changes. I think it's awesome. I'm wondering whether that goes beyond entertainment in any way. Do you have any thoughts or ideas there?

Logan:
Yeah, that is interesting. I think one that's related, sort of one skip away from video generation itself, is Genie, which is our world-simulation work. If folks haven't seen this, go look up Genie 3 and watch a video. It's mind-blowing. It's a fully playable game-world simulation: you can prompt on the go and the environment will change, and you can control it with your keyboard, similar to a game. I think that work translates really well to robotics, which is cool.

If folks aren't familiar with this, one of the principal reasons we don't just have robots walking around everywhere, while we do have LLMs that can actually do lots of useful stuff, is a data problem. There's lots of text data and other data that's representative of human intelligence available. There's actually not a lot of data that's useful for making robotics work. And I think Veo, or generally that segment of video generation, with its physics understanding and all that other stuff, could be really helpful in actually making the long tail of robotics use cases work. Then I can finally have a robot that will fold my laundry so I don't need to spend my time doing that. But that's my outside-of-entertainment bet as far as where that use case ends up creating value in the world.

Ejaaz:
With Veo 3, the goal is to enable humans to become a better version of themselves, a 10x or 100x better version, using these different tools. So in the example of a VFX studio, you can now create much better movies. How does that apply to Genie 3, exactly? You gave the example of being able to create simulated environments, but that's to train these robots, to train these models. What about us? What about the flesh-and-blood humans out there? Can you give us some examples of where this might be applied or used?

Logan:
Yeah, that's a good example. I mean, the robot answer is that the robots will be there to help us, which is nice. So hopefully there's a bunch of stuff you don't want to do that you'll be able to get your robot to do. Or there are industries that are dangerous for humans to operate in, where, if you can do that simulation without needing to collect a bunch of human data, I could see that being super valuable.

My initial reaction to the Genie use case: the two that come to mind are, one, entertainment, which I think will be cool. Humans want to be entertained; that's a story as old as time. I think there will be some entertainment value in a product experience like Genie. The other is actually back to a bunch of use cases where you'd want robotics to be able to do some of that work, but the robot product experience isn't there yet. This could be things like mining or heavy industries, where there's a safety aspect: how can you do realistic simulation training experiences so that you don't have to physically put yourself in harm's way in order to understand the bounds or the failure cases? Or disaster recovery, things like that, where you don't want to have to show up at a hurricane for the first time to really understand what the environment could be like.

Being able to do those types of simulations is interesting, and building software deterministically to solve that problem would actually be really difficult and expensive, and probably isn't a large market that lots of companies are going to go after. But if you have this model that has really great world knowledge, you can throw all these random variables at it and do that type of training and simulation. So yeah, it's perhaps an interesting use case. I don't know if there's actually a plan to use it for things like that, but those are the things that come to mind.

Josh:
This is something I've been dying to ask you about, because it's something I've been fascinated by. When I watched the Genie 3 demo for the first time, it just kind of shattered my perception of where we were at, because you see it work. I saw this great demo where someone was painting a wall, and we actually filmed an entire episode about this, and it retained all of the information. And one theme, as I'm hearing you describe these things, Veo 3, Genie 3, is that you are building this deep understanding of the physical world. I can't help but notice this trend: you are just starting to understand the world more and more. I could see this when it comes to making games, as an example, where a lot of people were using Genie 3 to make, not necessarily games, but virtual worlds that you can walk around in and interact with. I'm wondering if you could share the long-term reasoning why, because clearly there's a reason, there's a lot of value to it. Is it being able to create artificial data for robots, because if you can emulate the physical world, you can create data to train these robots? Is it because it creates great experiences, like perhaps we'll see AAA design studios using a Genie 5 to make AAA games like Grand Theft Auto? I'm curious about the reasoning behind this urge to understand the physical world, and even to emulate it.

Logan:
I had a conversation with Demis about this, who's our CEO at DeepMind and someone who's been pushing on this for a long time. I think a lot of this goes back to the original ethos of why DeepMind was created, and a bunch of the initial work that was happening at DeepMind around reinforcement learning. If folks haven't seen this: one of the challenges of, again, making AI work is that you need this flywheel of continuing to iterate, and you need a reward function, which is the actual outcome that you're trying to achieve. The thing that's interesting about these simulated environments is that it's really easy to have a constrained world, and it's also possible, "really easy" is maybe overly ambitious, to define a simple reward function and then actually infinitely scale this up.

The opposite example of this: there was some work a very long time ago, and this is in the AI weeds, on a physical robotic hand that could manipulate a Rubik's Cube, using AI to help try to solve it. The analogy for why Genie and some of this work is so interesting: if you were to say, hey, we need all the data to make this little physical robotic hand able to do this, it's actually really challenging to scale that up. You need to go and build a bunch of hands. What happens when the Rubik's Cube drops? You need some system to go and pick it back up. And you just go through the long tail of this stuff: the hand probably can't run 24 hours a day. There are all these challenges with getting the data in that environment to scale up, and these virtual environments don't have this problem.

Self-driving cars are another example of this. Again, for folks who aren't familiar, there's lots of real-world data involved in self-driving cars, but there are also lots of simulated environments where they've built simulations of the world, and this is how they can get a thousand-x scale-up of this data understanding: by having these simulated environments. Robotics will be exactly the same. If you want robotics to work, it's almost 100% true that you're going to have to have these simulated environments where the robot can fall down the stairs a thousand times, and that's okay, because it's a simulated environment and it's not actually going to fall down your stairs.

So I think with Genie there is definitely an entertainment aspect to it, but I think it's more so going to be useful for this simulated environment, to help us not have to do things in the real world while still having a really good proxy of what will happen in the real world when we do them.
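The flywheel Logan describes, a constrained environment plus a simple reward function that simulation lets you scale up cheaply, is the basic shape of reinforcement learning. A minimal sketch, using a made-up one-dimensional world rather than any DeepMind system:

```python
import random

# Hypothetical constrained environment: an agent on a line [0..4].
# Reward function: +1 for reaching the goal at position 4, 0 otherwise.
GOAL, ACTIONS = 4, [-1, +1]

def step(pos, action):
    new_pos = min(max(pos + action, 0), GOAL)
    reward = 1.0 if new_pos == GOAL else 0.0
    return new_pos, reward, new_pos == GOAL

# Tabular Q-learning. Because the simulated step is essentially free,
# we can run thousands of episodes: the "infinite scale-up" of data
# that a physical robot hand could never provide.
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1
for _ in range(2000):
    pos, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action,
        # occasionally explore a random one.
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda x: Q[(pos, x)])
        nxt, r, done = step(pos, a)
        best_next = max(Q[(nxt, b)] for b in ACTIONS)
        Q[(pos, a)] += alpha * (r + gamma * best_next - Q[(pos, a)])
        pos = nxt

# The learned policy: the best action from each non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

In this toy world the agent can "fall down the stairs" as often as it likes at no cost, and the learned policy should end up always stepping right toward the goal.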

Ejaaz:
That's pretty funny. I spent the weekend watching the World Robot Olympics, and there were some very real fails and crashes of these robots, which is pretty funny. Okay, so when I think of Genie, it blows my mind, because I still can't get my head around how it predicts what I'm going to look at. I remember seeing this demo of someone taking a simple video of themselves walking, it was a rainy day on a gravel path, and they stuck that into Genie 3, and they could look down and see their reflection in a puddle. The physics was astoundingly accurate and astute. Can you give us a basic breakdown of how this works? Is there a real game engine happening in the background, or is there something deeper happening? Help us understand.

Logan:
My intuition, and we can gut-check this with folks on the research side to make sure I'm not fabricating my intuition: if folks have an intuition for how next-token prediction works, it's that if you're looking through a sentence of text, for each word in that sentence there's a distribution, between zero and one basically, of how likely that word was to be the next word in the sequence. That's the basic principle of LLMs. This is why, if you were to ask the same question multiple times, the LLM will inherently perhaps give you a different answer. And that's why small changes in the inputs to LLMs actually change the output, because, again, it's this distribution: if you make a one-letter difference, it perhaps puts you on a branching trajectory that looks very different from the original output you got from the model.

A similar rough approximation of this is happening here, just much more computationally difficult, and I think they use a bunch of architectural differences, so it's not truly next-token prediction that's happening for the sort...

Ejaaz:
...of, like, pixels, colors, a bunch of other things, yeah.

Logan:
Exactly, yeah. You can roughly map the mental model: as the model, or as the figure, looks down in some environment, it has all this context of the state of the world, but it also knows what pixels are preceding it, et cetera. It's loosely doing this next-pixel prediction, you could sort of approximate it that way, happening at the Genie level, which is an interesting way to think about it.

Josh:
So, Ejaaz, one of the things you were mentioning was that it's happening much faster, right? And it's happening, presumably, much cheaper, because I heard this crazy stat: you're at hundreds of trillions of tokens per month being pushed out by Gemini. It's unbelievable. And I want to get into the kind of infrastructure that enables this, because Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper. Earlier in the show, you mentioned you have a TPU behind you, and I understand TPUs are part of this solution. I want you to walk us through how this is happening. How are we getting these quality improvements across the board, and what type of hardware or software is enabling that to happen?

Logan:
I think, one, you have to give credit to all of these infrastructure teams across Google that are making this happen. I think about this a lot: what is Google's differentiated advantage? What does our expertise lend us well to do in the ecosystem? What are the things we shouldn't do because of that, and what are the things we should do because of that? It's something I think about as somebody who builds products. One of the things I always come back to is our infrastructure. The thing Google has been able to do time and time again is scale up multiple products to billions of users and have them work with high reliability, et cetera. That's a uniquely difficult problem. It's an even more difficult problem in the age of AI, where the software is not deterministic, the compute footprint required to do these things is really demanding, and the models are a little bit tricky and finicky to work with sometimes. So again, our infrastructure teams have done an incredible job making that scale up.

I think the stat was that at I/O 2024 we were doing roughly 50 trillion tokens a month. At I/O 2025, I think it was around 480 trillion tokens a month, if I remember correctly. And just a month or two later, and this was in the conversation I had with Demis, we crossed a quadrillion tokens, which comes after a trillion, if you haven't thought about numbers higher than a trillion before. And there's no slowdown in sight. I think this is just a great reminder that so many of these AI markets and product ecosystems are still so early, and there's this massive expansion ahead. I think about my own life: how much AI do I really have helping me? Not really that much, on the margin; maybe tens of millions of tokens a month, maximum. And then you think about a future where there are billions of tokens being spent on a monthly basis to help you in whatever you're doing in your professional life, your work, your personal life, whatever it is. We're still so early.

TPUs are a core part of that, because they allow us to control every layer of the hardware and software delivery, all the way down to the actual silicon the model is running on, and we can do a bunch of optimizations and customizations other people can't do, because they don't actually control the hardware itself. There are some good examples of what this enables. One of them: we've been at the Pareto frontier from a cost-performance perspective for a very long time. If folks aren't familiar, the Pareto frontier is this trade-off of cost and intelligence, and you want to be at the highest intelligence and lowest cost. We've been sitting on that for basically the entirety of the Gemini life cycle so far, which is really important, so people get a ton of value from the Gemini models.

Another example of this is long context. Again, if folks aren't familiar, there's a limit on how many tokens you can pass to a model at a given time. Gemini has had one- or two-million-token context windows since the initial launch of Gemini, which has been awesome, and there's a bunch of research showing we could scale that all the way up to 10 million if we wanted to. That is a core infrastructure-enabled thing. There's a lot of really important research to make that work and make that possible, but it's also really difficult on the infrastructure side, and you have to be willing to do that work and pay that price. And it's a beautiful outcome for us, because we have the infrastructure teams that have the expertise to do this.
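The Pareto frontier Logan references has a precise meaning: a model is on the cost-intelligence frontier if no other model is both at least as cheap and at least as capable. A small sketch with invented model names and numbers, not real benchmarks:

```python
# Hypothetical (name, cost per million tokens, quality score) tuples;
# the figures are invented for illustration.
models = [
    ("a", 10.0, 90),
    ("b", 2.0, 80),
    ("c", 5.0, 70),   # dominated by "b": more expensive and worse
    ("d", 0.5, 60),
]

def pareto_frontier(models):
    """Keep the models that no other model dominates. A model is
    dominated if some other model is at least as cheap and at least
    as good, and differs on at least one axis."""
    frontier = []
    for name, cost, quality in models:
        dominated = any(
            c <= cost and q >= quality and (c, q) != (cost, quality)
            for _, c, q in models
        )
        if not dominated:
            frontier.append(name)
    return frontier
```

Here `pareto_frontier(models)` keeps "a", "b", and "d": each trades cost against quality, while "c" is strictly beaten by "b" and drops out. "Sitting on the frontier" means none of your models are in "c"'s position relative to a competitor.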

Josh:
Okay, Logan, one quadrillion tokens. That's a big number. We need to talk about this for a little bit, because that is an outrageously, mind-bendingly big number. When I hear you say that number, I'm reminded of Jevons paradox. For people who don't know, it's when increased technological efficiency in using a resource leads to higher total consumption of that resource. So clearly, with these cool new TPUs and this vertically integrated stack you've built, you are able to generate tokens much more cheaply and produce a lot more of them, hence the one quadrillion tokens. Do you see this trend continuing? Is there going to be a continued need to just produce more tokens, or will it eventually be a battle to produce smarter tokens? I guess the question I'm asking is: is the quality of the tokens more important than the quantity? And do you see a limit where the quantity of tokens starts to kind of go off a cliff in terms of how valuable it is?

Logan:
Yeah, I could buy that story. And some of this is actually super top of mind for our teams on the Gemini model side, around this whole idea of thinking efficiency, which is: ideally, you want to get to the best answer using the smallest amount of thinking possible. Same thing with humans. Take the example of taking a test: you want the shortest number of mental hops possible to get you to the answer of whatever the question was. You don't want to have to think for an hour to answer one question. And there are a bunch of odd parallels in that world between models and humans taking this approach.

So I do think thinking efficiency is top of mind; you don't want to use tokens just for the sake of tokens. But even if we were to 10x reduce the number of tokens required, which would be awesome, a great innovation, the models much more token-efficient, I do think there's a pretty low ceiling on how far that will be able to go, specifically because of this next-token-prediction paradigm, how the models actually approach solving problems using the token as a unit. So it's not clear to me that you'll be able to, say, 1,000x reduce the amount of tokens required to solve a problem. I think it probably looks much more like 10x or something like that. And then there'll be a 10x reduction in the number of tokens required to solve a problem, and a 10,000x increase in the total amount of AI and token consumption in the world. So even if we made that reduction happen, I think the graph still looks like it's going up and to the right, for the most part.
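The back-of-the-envelope math behind Logan's last point: a 10x per-problem efficiency gain combined with a 10,000x growth in usage still yields a 1,000x increase in total tokens. Using his hypothetical factors and made-up baseline figures:

```python
# Illustrative arithmetic with invented baseline numbers.
tokens_per_problem = 1_000_000   # hypothetical tokens to solve one problem today
problems_per_month = 1_000       # hypothetical demand today

today = tokens_per_problem * problems_per_month

efficiency_gain = 10     # models become 10x more token-efficient per problem
demand_growth = 10_000   # 10,000x more problems being solved with AI

future = (tokens_per_problem // efficiency_gain) * (problems_per_month * demand_growth)

# Net effect: aggregate consumption grows by demand_growth / efficiency_gain.
growth = future / today  # -> 1000.0
```

Whatever the baseline, the net multiplier is demand growth divided by efficiency gain, which is why the aggregate graph keeps going up and to the right even as individual answers get cheaper.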

Josh:
It still keeps going. There is no wall. We have virtual data to train models on; we have tons of new tokens coming into play. There's another question I wanted to ask, which is a personal one for you, about features. I find that when a lot of people leave comments on the show and talk about their experience with AI, a lot of them are just using the ChatGPT app, or they have Grok on their phone. And I think Gemini has some underrated features that don't quite get enough attention. So what I'd like you to do is highlight one or two of the features you've shipped recently that you think are criminally underrated. What should people try out that you think not enough people are using?

Logan:
I think the one that continues to surprise me the most is Deep Research. Deep Research is the North Star for building an AI product experience. If folks aren't familiar with it: you can show up with a pretty ill-defined question that's very open and vague, and the model will traverse essentially across the internet, hundreds or thousands of different web pages, try to accumulate enough context, and then come back to you with, initially, basically a research report, which could be a 40-page report in some cases that I've seen. You might hear "40-page report" and say, that's not very useful to me, because I'm not going to read 40 pages. And I'd say you and I are exactly the same, because I'm not reading 40 pages either. There's a beautiful feature, if you've used NotebookLM, this audio overviews feature. The same thing actually exists inside the Gemini app with Deep Research: you can just press that button and get a 10-to-15-minute podcast that goes through and explains all the research that happened. You can listen to that on your commute or on a walk and not need to read 40 pages, which is awesome.

The part that makes it such an interesting experience to me, and I don't know if other people have felt this before: with most AI products, back to that blank-slate problem, Josh, or that empty-chat-box problem, you as the user of the product have to put in so much work in order to get useful stuff out. I talk to people all the time who say, yeah, I use these models and they're just not useful for me. And actually, what's happening behind the scenes is that the models are super capable, really useful; it just requires that you give the models enough context. There's this new emerging prompt engineering 2.0, this context engineering problem, which is: how do you get the right information in so that the model can make a decision on behalf of the user? Deep Research is this really nice balance of going and doing that context engineering for you, bringing all that context into the window of the model, and then being able to answer your original question, and, principally, showing you this proof of work up front.

I think about this proof-of-work concept in AI all the time. I have so much more trust in Deep Research because as soon as I kick off that query, boom, it's already at 50 web pages. I'm like, great, because I was never going to visit 50 web pages. There's pretty much nothing I'm researching, I could be buying a car or a house, where I'm going to look at more than 50 web pages. At least, maybe that's personal to me and other people do more research, I don't know. But automatically I'm in awe of how much more work this thing is doing. Again, this is the North Star from an AI product experience standpoint, and there are so few products that have made that experience work. Every time I go back to Deep Research, I'm reminded of this. That team crushed it.

Ejaaz:
And it's not just Deep Research in an LLM context that's so fascinating about Google AI. You guys have created some of the most fascinating tools to advance science, and I don't think you get enough flowers for what you've built. Some of my favorites: AlphaFold 3 is crazy. This is a model that can predict what certain molecular structures are going to look like, and it could be applied to so many different industries, the most obvious being drug design: creating cheaper, more effective drugs for a variety of different diseases. And then I was thinking about that random model you guys launched where apparently we could translate what dolphins were saying to us, and vice versa. Kind of stepping back from all of these examples, can you help me understand what Google's obsession with AI and science is, and why you think it's such an important area to focus on? Are we at a point now where we can advance science to infinity, or where are we right now? Are we at our ChatGPT moment, or do we have more to go?

Logan:
I'll start with a couple of cheeky answers, which Demis, who is the only foundation

Logan:
model lab CEO to have a Nobel prize, uh, in the science domain,

Logan:
which is, uh, for him, for him, chemistry, um,

Logan:
had this comment, which is actually really true.

Logan:
There's lots of people talking about this, like impact of AI on science and humanity.

Logan:
Um, and there's very few, uh, if not only one, um, being deep mind research

Logan:
lab, that's like actually doing the science work.

Logan:
And I think it's this like great example of like deep mind and just being in

Logan:
the like culture and DNA, DNA of like Demis is a scientist,

Logan:
all of these folks around, um, around DeepMind are scientists and they like

Logan:
want to push the science and, and push what's possible in this,

Logan:
this future of discovery,

Logan:
um, using our models.

Logan:
And I was in London a couple of weeks ago meeting with Pushmid who leads our

Logan:
science team and hearing about sort of like the breadth of the science that's happening.

Logan:
Um, and how like Dolphin Gemma is like a great, like kind of like funny example,

Logan:
cause it's, it's not super applicable in a lot of cases, but it's interesting

Logan:
to think about, um, alpha fold, like if, if folks haven't, um.

Logan:
Watched the movie, the thinking game, it's about sort of the early days of,

Logan:
of, uh, Google deep mind.

Logan:
And, um, they're talking about like folding proteins and why this is such an interesting space.

Logan:
I'm not a scientist, but to hit on the point really quickly of why AlphaFold is so interesting: the historical context is that for humans to fold a single protein would take many people, millions of dollars, and something on the order of five years. The reason Demis won the Nobel Prize in chemistry for this is that DeepMind figured out, using reinforcement learning and other techniques, how to fold every protein in the known universe, millions of proteins, and then released them publicly and made them available to everyone. That dramatically accelerated the advancement of human medicine and a bunch of other domains and disciplines.

Logan:
And now with Isomorphic Labs, which is part of DeepMind, they're actually pursuing some of the breakthroughs they found and doing drug discovery and things like that. So overnight, you see hundreds of thousands of human years and hundreds of millions of dollars of research and development costs saved through a single innovation. And I think we're going to continue to see that acceleration of new stuff happening.

Logan:
A recent example of this is AlphaEarth, our geospatial model that came out, fusing the Google Earth Engine with AI and this understanding of the world. There's just so much cool science, and so much is possible when you layer the AI capability onto all these disciplines. So to answer the question, I think we're going to see this acceleration of scientific progress, and I think DeepMind is going to continue to be at the forefront of it, which is really exciting.

Logan:
And the cool thing, even for people who aren't in science, is that all of that innovation and those research breakthroughs feed back into the mainline Gemini model. We had a bunch of research work on doing proofs for math, and at face value that might not seem very interesting, but that research feeds back into the mainline Gemini model. It makes it better at reasoning, better able to understand really long and difficult problems, which then benefits every agent use case that exists, because the models are better at reasoning through all these difficult problem domains. So there's this really cool research-to-reality, science-to-practical-impact flywheel that happens at DeepMind.

Ejaaz:
As a former biologist, this warms my heart. It's amazing to see this get applied at such scale. Okay, we can't talk about Google AI without talking about search. This is your bread and butter, right? However, I've personally noticed a shift in my habits. I've used a computer for decades now, and I've always used Google Search to find things, Google Chrome, whatever it might be. But I've now started to cheat on this feature. I've started using LLMs directly to do all my searching for me, to get all my sources for me. And you've got to be thinking about this, Logan, right? Is this eating the search business? Is this aiding the search business? Or are we creating a whole different form factor here? What are your thoughts?

Logan:
There's an interesting form factor discussion. On one hand, the AI answer market definitely feels distinctly different from the search market, to a certain degree. We've seen lots of AI products reach hundreds of millions of users, and search continues to be a great business with billions of people using it, and all that stuff.

Logan:
There's also this interesting question, which is: what's the obligation of Google in this moment of this platform shift and all this innovation that's happening? As somebody who doesn't work on search, but who is a fan of all the work happening inside of Google and has empathy for the folks building these products, it is really interesting. My perspective has always been that search, as the front door to the internet, has a stewardship position that means it actually can't disrupt itself, for the right reasons, at the same pace that small players in the market can.

Logan:
And my assertion has always been that this is actually the best thing for the world, for the internet, and for this entire economy that Google has enabled by bringing people to websites: that you don't, on day one of the LLM revolution, suddenly have a fully LLM-powered search product that feels and looks completely different. Not only would that throw users who are still trying to figure out how to use this technology, how they should engage with it, and what it does and doesn't work well for, into a bad place from a user-journey perspective, but I think it would also have impacts on people who rely on Google from a business perspective.

Logan:
So I think you've seen this gradual transition, with lots of shots on goal and lots of experiments happening on the search side. And I think we're now getting to the place where they have confidence that they can do this in a way that's going to be super positive for the ecosystem and create lots of value for the people who go and use these products.

Logan:
The understanding of AI technology has increased, adoption has grown, the models have gotten better, hallucinations have gone down, and all this stuff. And I think there will also be some uniquely search things that only search can do. I've spent a bunch of time with folks on the search team, like Robby Stein, who leads all the AI work in search. There's all of this infrastructure that search has built, which matters even more in this age of AI, where the cost of generating content that looks somewhat plausible has basically gone to zero. It's very easy to do that. So great search is at a premium; it's more important than ever. There's going to be a thousand X or a million X or whatever X of growth in content on the internet.

Logan:
How do you actually get people to the most relevant content, from people who have authority and expertise and all that? It's a really difficult problem, and it's the problem of the decade. Search has been solving it for the last 20 years, and it's now more important than ever. So I've never been more excited for the search team, and I think they've never had a bigger challenge ahead of them as they figure out how to make these internet-scale systems they've built continue to scale up to solve this next generation of problems, while also becoming this frontier AI product experience where billions of people are experiencing AI for the first time in a different way than they have before.

Logan:
There are so many interesting use cases too. Image search is a great example: one of the fastest-growing ways people use search now is showing up with an image and asking questions about it. The way people traditionally used search has already changed; it's different than it was five years ago, or even two years ago. And I think we're going to continue to see that happen. I think the search product you see today will evolve to have things like multi-line text input fields as user questions change, and all that stuff. So there's so much cool stuff on the horizon for search that I'm really excited about.

Josh:
Yeah, as I'm hearing you describe all of these cool new things, particularly how they funnel into a single model, the science breakthroughs are unbelievable. And I think that's what gets me personally really excited, like Ejaaz: this is actually going to help people. This is going to make a difference in people's lives. Right now it's a productive thing, a fun thing, a creative thing; there are a lot of tools. But then there's also the science part, and a lot of this funnels down to one amazing model. I think it leaves us in a really exciting place to wrap up this conversation.

Josh:
So Logan, thank you so much for coming and sharing all of this: the news about the new model, and all of the updates and progress you're making everywhere else. I really enjoyed the conversation. You also have a podcast called Around the Prompt. Is there anything you want to leave listeners with, to go check that out, or the new AI Studio, or the new AI model? Let us know what you have going on.

Logan:
I love seeing feedback about AI Studio. So if you have things that don't work that you wish worked, even for both of you, please send them to me. I would love to make things better. For the new model as well: I think this is still the early days of what this model is going to be capable of, so if folks have feedback on edge cases or use cases that don't work well, please reach out to our team. Send us examples on X, Twitter, or the like. We'd love to help make some of those use cases come to life. And I appreciate both of you for all the thoughtful questions and for the conversation. This was a ton of fun. We've got to do it again sometime.

Josh:
Awesome. Yeah, we'd love to. Anytime, please come and join us; we really enjoyed the conversation. So thank you so much for watching. For the people who enjoyed it, please don't forget to like it, share it with your friends, and do all the good things, and we'll be back again for another episode soon. Thank you so much.

Josh:
I have a fun little bonus for those of you still listening all the way to the end, the real fans. When we were first going to record with Logan, we actually had no idea that he would break the exclusive news of Nano Banana on our show. It was super cool, so we wanted to restructure the episode to prioritize that at the front. We did record a separate intro where I said, hey, Google makes some really good stuff. In fact, you have an eighty-something percent chance of being the best model in the world by the end of this month. Can you explain to us why Google is so amazing at what it does? And this was the answer to that question. So here's a nice little nugget to take you out of the episode. Thanks for listening. I really hope you enjoyed it, and we'll see you in the next one.

Logan:
My general worldview of why Google is in such a good place for AI right now: there are many layers to this, depending on what vantage point you want to look from. On one hand, search is an incredible part of the story. People have historically looked at Google Search as this legacy Google product, but search is going through a transition. Actually, just today, as we're recording this, it was announced that AI Mode is rolling out to 180-plus countries, English supported right now and hopefully other languages in the future.

Logan:
It's a great example, AI Overviews, and AI Overviews sort of double-clicking into AI Mode, of a product that, for billions of people around the world, is the first AI product experience they actually touch. And I think there's something really interesting in how Google has been on this mission of deploying AI. Some naysayers on Twitter will say, you know, Google created the transformer and then did nothing with it, and that's actually very far from the truth. Search has been powered by the transformer, the architecture behind language models and Gemini, for the last seven years or so. The product experience maybe looks slightly different today than it did then, but Google has been an AI-first company for as long as I can remember, basically as long as AI has existed. And now we're seeing more and more of these product surfaces become frontier AI products as Google builds the infrastructure to make that the case.

Logan:
I think people also forget that it's not easy, logistically, to deploy AI to billions of people around the world. Google now has, I think, five or six billion-plus-user products. Even making a small AI product work today is hard, if anyone's played around with this stuff or tried vibe coding something; doing it at the billion-user scale is also very difficult. So I continue to get more and more bullish. And part of what allows us to do that billion-user-scale deployment is the whole infrastructure story. If you're watching on video, I don't know if you can see, but I have a couple of TPUs sitting behind me. That TPU advantage, our sort of equivalent to GPUs, is something I think is going to continue to play out. So there are so many things that I get excited about, and the future is looking very bright.
