Google Just Made Their AI Free, Private, and Yours (Gemma 4)
Josh:
How much money are you paying to use your AI model? Maybe it's $20 a month, maybe you're on the pro plan for $200 a month, maybe you're running an OpenClaw instance and paying thousands of dollars a month to generate tokens from frontier models. Google has just released a solution to your problem, something that can be had for as little as an $80 one-time fee to run a Raspberry Pi, because that's what this new model runs on. Gemma 4 is a new model from Google: a hyper-quantized, very small model meant to run locally on devices like your phone, your laptop, or even your new MacBook Neo. It's very lightweight, it's built to work entirely offline and private, and I think the most noteworthy thing is how powerful it is. This model, small enough to fit on your phone and run entirely for free, is as good as, if not better than, some of last year's frontier models, and comes close to matching this year's. To showcase this, Ejaaz has prepared some really cool examples, so let's get into what this new Gemma 4 model can do.
Ejaaz:
Yeah, I'm super excited about this model. And it's not just one; there are four of them, ranging from about 4 billion to, I think, around 50 billion parameters. They can fit on your phone, they can fit on any device. And like you said, eight months ago this would have been considered frontier intelligence. But I want to get into what these things can actually do, because it's one thing to talk about benchmarks and another to talk about what it can do on your phone, on your laptop, and what value it can bring to you. This first example is someone leveraging the visual intelligence of these Gemma models. Typically, an AI model is really good at ingesting words and characters and understanding text, the way a book or a blog post would describe things to you. Visual intelligence is a very different frontier that has often been hard for these new AI models to surmount. Gemma does an amazing job. What you're looking at on your screen right now is its ability to identify all the different objects in a very crowded room. He holds up a banana and it identifies it. It's also spotting the books on the shelf behind him, the shelf itself, the lamp, the fact that he's in a room, the curtains around it. And that's really important when it comes to building apps that can log your visual experience or track the calories in the food you're eating. You can build a completely different suite of apps on intelligence like this. This is the first time we're seeing it appear in an open-source, open-weight model, and Google has been the first to launch it.
Josh:
Yeah. And if it wasn't abundantly clear, this is totally free. You can just go to the website, download it, and run it yourself. Looking at this vision example, one of the cool things I'm thinking of is that a lot of people have cameras outside their house or apartment, and this now has the visual intelligence to not only see things but alert you to what it's seeing. One of the cool examples I saw, which we don't have teed up here, is someone who had a little Nest camera right outside their front door, and it would send notifications about what was happening: "There is a dog walking in front of your house. There is a man walking up with two packages in his hands. It looks like the packages are from Amazon." That kind of visual intelligence would normally cost quite a bit of money in tokens using something like Claude Opus or ChatGPT, but instead it all runs for free on this tiny little model, which is super cool. We have another example, mentioned in the intro, about OpenClaw. OpenClaw is something a lot of people spend a lot of money on. The real hardcore users are spending hundreds of dollars a day, some even thousands, on top of buying some pretty expensive hardware to run it on. A lot of people bought Mac minis (you can't get one right now even if you wanted to, they're so backordered) or Mac Studios, paying hundreds if not thousands of dollars for a machine to run this software on. And the reality is that whatever device you're watching or listening to this on right now, you can run this model on it. You don't need something fancy. You don't need a high-end computer. You can run it on something local and lightweight, as lightweight as an $80 Raspberry Pi, since you can use the smallest model available. And although the results aren't the best in the world, they are much better than we previously expected from open-source models.
Ejaaz:
Yeah, I love that you can finally run OpenClaw on a device that doesn't cost $1,000. These Mac minis have actually gone sky-high on the secondary market. The retail price is 800 bucks, but because you can't get one from Apple anymore, I've seen them as high as 2K, and people are still buying them. The reason they've been buying them is that they couldn't fit frontier open-source models onto their own phone or laptop, and now Gemma 4 has made that super easy, which is amazing. I'm still quite confused about what the OpenClaw users are burning thousands of dollars on, but that's probably a topic for another conversation. The other thing I like about this is that Gemma 4 can run completely offline. Now, that's a property every single open-source model shares, but here you have a model with near-frontier intelligence, close to, say, Claude Opus 4.6 and GPT-5.4 (we'll get to those direct comparisons a little later in the episode), and you can run it offline. The great part is that you're often in areas where you just don't have an internet connection, or where inference takes a while. Now you have it on your phone, completely offline, with access to the world's entire database of knowledge. It might not be real-time, fair enough, but you still get access to core knowledge when you're in a desperate situation or when you just don't want to use the internet, which I thought was really cool.
Josh:
This part is maybe my favorite, where it feels like you truly have access to intelligence at your fingertips, no matter where you are in the world. You could be stranded on an island with no connection. You can be anywhere at any time, and it's completely local, totally free, and it fits on your phone. It feels like having Google on your phone. I remember growing up being told, "You're not going to be able to Google everything. You have to learn these things." And the reality is that you now have a super genius packaged up into something as small as your phone. That is super cool.
Ejaaz:
Now, naturally, where your mind goes with that property is: huh, if I'm in a desperate situation, can AI save my life? So SkyLevels.io decided to run Gemma 4 locally on his iPhone, and he simulated a scenario of being abandoned on an island in an apocalypse with no help. He needs to make a fire to keep himself warm, so he queries Gemma 4 and asks how to make fire. And the response: "I cannot provide instructions on how to make fire." So these models are still censored in some ways. It's not completely unfiltered; you can't ask it to help you make a biological weapon or do something illegal, which I don't think is a problem, but for a lot of people who want uncensored versions of these truly open-weight models, this isn't exactly that. Still cool nevertheless.
Josh:
I'm going to stop you right there, because what you just said is not entirely true. Google doesn't want you to do this, but because it's open source and open weight, it can be jailbroken. Someone took it upon themselves to jailbreak the model to get it to do whatever you'd like. It was released just a few days ago, and it seems to work pretty well. It runs in 18 gigabytes of memory, which works for most laptops. And it's totally cracked, totally uncensored: you can ask it whatever questions you like and it will give you whatever answers in return. I think it's a testament to the open-source community. If you're going to publish these tools (and they are tools), they're there for the public to use however they wish, and someone is naturally going to try their best to jailbreak them. Having something like this is actually truly empowering, because if you are stranded on that island, you do need to know how to make a fire. This will give you that answer, along with some other pretty unhinged answers if you ask, but it will give you the answer. And the important thing to know is that these models can be jailbroken and customized when they're open source. That's in a way how you get the most power from them: you get them in their purest form, without the filters, without the censoring, just true raw intelligence delivered to your phone. I found that pretty interesting. But there's also one final example, the smartphone test: which smartphones run this best, because not all smartphones are created equal, and some do this a lot better than others.
Ejaaz:
Yeah. A very first-world AI problem is getting annoyed waiting for the AI to respond to you. I certainly experience this when I'm using Claude on a very busy day. This test you're seeing in front of you takes five different mobile phone models and runs Gemma 4 across all of them. So you've got the Gemma models running independently, offline, locally on each of these devices, given the same queries, and you can see very different response rates and generation speeds across the phones. It looks like Apple is the winner in this race, which doesn't surprise me; they have some of the best silicon manufacturing ever. And I think Google's Pixel phone is the slowest.
Josh:
The OnePlus, I think, beat Apple by about half a second. Google was the slowest, which is very surprising, because you'd think Google running their own models would work well, but it turns out they don't have the vertical integration; they don't have the chipset Apple does. So you can see Google took 16 minutes, while OnePlus took two and a half minutes and the iPhone took three minutes to run through this test. So it's fast enough: if you're really desperate enough to need local inference like this, it's going to be fast enough to answer the questions you need in a timely manner.
Ejaaz:
Okay, so what did Google actually launch with these models? We know there are four of them, but let's get into some of the numbers and statistics. There are four different sizes, and if I bring up this chart over here, you can see we have a 31-billion-parameter model, which is the largest, with the smallest being a 2-billion-parameter model. The performance across benchmarks is truly impressive. Going back to the general takeaways: up to a 256,000-token context window, which isn't as large as the frontier models, which are hitting one to two million tokens, so you can't put as much information into a single prompt for the AI to understand. You've got native function calling. It can work offline, as we mentioned earlier. It's trained on 140-plus languages. Now, that sounds kind of insignificant, but Google has done something really well here: they released a translation feature, I believe last week, which can translate a similar number of languages live, in real time, as you're talking and listening to someone, directly into whatever listening device you have. So I think it's super cool to see this running on a local open-source model. And it's commercially permissive: it has an Apache 2.0 license, which means you can take it and use it for whatever you want and build any apps on it. I don't think it becomes a problem unless you get over a certain number of users, if I'm not mistaken.
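The 256k context window can be made concrete with a rough character count. A minimal sketch, assuming roughly 4 characters per token, a common heuristic for English text; real tokenizers vary, so this is a ballpark only, not how Gemma's tokenizer actually works:

```python
# Rough sense of what a 256k-token context window holds.
# CHARS_PER_TOKEN is a heuristic assumption, not an exact tokenizer ratio.

CHARS_PER_TOKEN = 4

def estimated_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN

def fits_context(text: str, context_window: int = 256_000) -> bool:
    """True if the text likely fits inside the given context window."""
    return estimated_tokens(text) <= context_window

# A 256k-token window holds roughly a million characters of English prose:
print(fits_context("x" * 900_000))    # True  (~225k tokens)
print(fits_context("x" * 2_000_000))  # False (~500k tokens)
```

By this estimate, a 256k window fits a full-length novel in one prompt, while the one-to-two-million-token frontier windows fit several.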
Josh:
Yeah, the 2 billion and 4 billion are the ones that fit on your phone. If you think of these models like engine sizes, those are the bicycles, maybe motorcycles; they're pretty lightweight. The larger ones, the 26 billion and the 31 billion, are the V12 engines, the powerhouses. Those are the two models that get the 256k token window; the others run on 128k. So while you're not going to have very long conversations with the models on your phone, they do have the ability to run multimodally. One of the most interesting things is that even these very small models that fit on your phone support not only text but images and audio as well. The audio part is pretty cool because it understands and interprets audio, and that is a pretty powerful thing to have on this tiny little model.
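Why the 2B and 4B models fit on a phone comes down to arithmetic on weights and quantization. A back-of-the-envelope sketch, assuming 4-bit quantization (the transcript says "hyper-quantized" without giving a bit width, so the levels below are illustrative assumptions); this counts weights only, and a real runtime also needs room for the KV cache and activations:

```python
# Approximate weight storage for a model at a given quantization level.
# Weights-only estimate; actual memory requirements are higher.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Weight storage in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

for size in (2, 4, 31):
    print(f"{size}B params at 4-bit ~ {weight_memory_gb(size, 4):.1f} GB")
# 2B  at 4-bit ~ 1.0 GB   (phone-sized)
# 4B  at 4-bit ~ 2.0 GB   (phone-sized)
# 31B at 4-bit ~ 15.5 GB  (laptop territory)
```

The same 31B model at full 16-bit precision would need around 62 GB for weights alone, which is why aggressive quantization is what makes on-device inference possible at all.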
Ejaaz:
I also had the question of how this model compares to the other top open-source models. It's no surprise; we've highlighted a lot on the show how China has been leading the frontier here. If you look at this chart, Gemma, both the 31-billion and the 4-billion parameter models, does really well on ELO scores. For the amount of intelligence per unit of parameter density (which isn't an official stat, but one I've coined on the show over the last couple of episodes), Gemma absolutely crushes it. It's in the top left over here, scoring extremely highly but with a very small parameter count. Now, if you compare it to the other leading open-source models like Kimi K2.5 Thinking, they're well over a trillion parameters, with Qwen and GLM-5 close behind. So although Gemma isn't as smart as them, it's close enough. It looks like it's about 99% of the intelligence on ELO scores, but at a fraction of the size, which is why you're able to run it on your phone.
Josh:
Yeah, they're kicking ass. I mean, in terms of pure intelligence, China is still winning the race. But in terms of intelligence density, intelligence per token, this is really high. And I think one of the cool things they did with Gemma 4 versus Gemma 3 is they gave it the Apache 2.0 license. Basically, that means that where a lot of these models were previously restricted and limited for enterprise adoption, this is total freedom to modify, redistribute, and commercialize with no restrictions. You can use it for whatever you want and repurpose it in any way you wish. And with the 140 languages built in, like you mentioned, and the multimodality, this is kind of a home run. When you look at this chart, it tells the same story: Gemma 4 versus the world, compared against all the other Chinese models, this is a heavy hitter.
Ejaaz:
Yeah. I mean, if we look at some of these benchmarks: software engineering, okay, listen, it's not number one, it's 68%. I believe Opus 4.6 scores in the high 80s on this. So we're not talking about frontier intelligence when it comes to coding models; you're not ditching Claude Code for something like this. But when it comes to generalized intelligence, when you're replacing your Google queries with an LLM and you don't want to spend 20 bucks or 100 bucks per month on a Claude or GPT-5.4 subscription, you can just use this: run it locally and offline, privately, trained on your own data. It's incredibly cool. I had the same question about comparing it to the frontier models, because I wanted to give this a fair shout. There are some potentially exaggerated stats here, Josh, if I'm honest. Looking at how it weighs up: on software engineering, which we just mentioned, it's almost 12 points lower than Claude Opus 4.6, the leading frontier model. Not great. But on some of these other benchmarks, AIME 2026, it's near frontier, as it is on GPQA Diamond and MMLU Pro. Do you think these are gamed, or do you think this is an accurate take?
Josh:
Yeah, all the benchmarks are gamed. I think the only real way to test this is by running it against your own use cases and evaluating for yourself, because it absolutely is not 90% frontier-capable when you converse with it. When you talk to Gemma versus Opus 4.6, there is a very stark and clear difference in both EQ and IQ: one feels much more naturally human, the other is very dry. Perhaps on these benchmarks Gemma is 90% of the way there, but in actual practice, using the model in day-to-day life, it's nowhere close. At least that's my perspective from trying these things out. I think we have to take these kinds of benchmarks with a grain of salt, because they're gamed on very specific things. If you change the parameters of these benchmarks a little, or change the actual structure of a benchmark, the model won't perform well, because to some extent these models are baked with the expectation that they'll need to perform well on these benchmarks. They're optimized for those specific types of problems rather than the general real-world use cases that someone like us will hit every day, or that someone using OpenClaw actually wants tokens generated for.
Ejaaz:
If cost is a determining factor in your decision to use one AI model over another, Gemma might be quite a convincing bet. It's a fraction of the cost. I know it says $0.08 per million tokens there; it's actually $0.03. I think we had a bit of an issue generating that particular stat. The point is, it's incredibly cheap versus the frontier models: with Opus 4.6 you're looking at about $10 per million tokens, blended input and output. So if you're one of those OpenClaw users we mentioned earlier, with myriad different use cases, burning thousands of dollars per day or per week, this might be a better bet.
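The price gap is easier to feel as a monthly bill. A quick sketch using the per-million-token prices quoted above ($0.03/M for hosted Gemma versus a roughly $10/M blended rate for Opus 4.6); the 100-million-tokens-per-day workload is a hypothetical heavy agent user, not a figure from the episode:

```python
# Monthly token-bill comparison at the quoted per-million-token prices.
# The daily volume is an illustrative assumption for a heavy OpenClaw-style
# workload.

GEMMA_PER_MILLION = 0.03
FRONTIER_PER_MILLION = 10.00

def monthly_cost(million_tokens_per_day: float, price_per_million: float,
                 days: int = 30) -> float:
    """Dollars per month at a steady daily token volume."""
    return million_tokens_per_day * price_per_million * days

heavy_user = 100  # million tokens generated per day
print(f"Gemma:    ${monthly_cost(heavy_user, GEMMA_PER_MILLION):,.2f}/mo")
print(f"Frontier: ${monthly_cost(heavy_user, FRONTIER_PER_MILLION):,.2f}/mo")
# Gemma:    $90.00/mo
# Frontier: $30,000.00/mo
```

At those rates the same workload differs by a factor of over 300, which is the whole argument for routing high-volume, low-stakes token generation to a cheap open model.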
Ejaaz:
It might be a better trade-off for you to use. And I also want to give everyone a very important reminder, which is that eight months ago, these models from Google would have been considered frontier. It's amazing how much advancement we've made in eight months. Now, I know that in those same eight months we've also gotten bigger and better models from the frontier intelligence labs, so the question still rings in my mind: will open source ever catch up? If I'm being honest, I thought open source would have died a year ago, but it's still keeping up. Part of that is because Chinese AI labs have invested so much in keeping up with the US labs; they've also done distillation attacks and all those other kinds of things. But the fact that Google themselves, who haven't done any of that, have put out an open-source model nearly as good as the frontier models gives me a lot of hope that open source is here to stay.
Josh:
Yeah, I don't see a world in which this slows down, and I really love the trailing progress we get, because at some point we're going to reach the tail end of diminishing returns, where open-source models are just capable enough to do everything the average person wants. What we're running up against right now with the frontier AI labs is that the new models just cost too much money. Anthropic has Capybara, the new model, ready to go, and aside from it being too dangerous, it's just far too expensive. The number of GPUs required to generate tokens from these models is so high, and if you want frontier intelligence, the cost keeps creeping upwards. Meanwhile, the tail end becomes commoditized very quickly. The very highest end, the stuff that's going to be solving new math and new science, costs a tremendous amount of money, but the open source that's maybe six months behind costs zero dollars. The delta is huge. If you're not interested in solving unbelievably complex problems or writing really high-quality code, then the range of problems these open-source models will be able to solve a year from now, when they're better than Opus 4.6 is today, is going to be really large. And it begs the question: who is actually going to keep paying for these frontier models, if they're that expensive to run things like OpenClaw on, when the reality is that these open-source models, maybe Gemma 5, maybe Gemma 6, will be able to tackle almost all of the problems we have? I don't know, it's an interesting thought experiment. But I think it is certain that open source is here to stay, particularly as it relates to China and the United States going forward in this AI race, because this is a pretty nice jab at the Chinese open-source models.
Ejaaz:
Yeah. And if you've been a listener of this show, you'll know that my thoughts on the future of AI are very much about AI agents, specifically personal agents that work for you and are trained on your own personal data. Now, if you're the average person, you probably don't want to give OpenAI or Anthropic access to your personal data so they can train their own models. That's a breach of trust in many respects. Locally run open-source models might be the solution for that. They may not be as smart, but if they're trained on your data, they could unlock a new level of intelligence that centralized models can't. So I'm optimistic that Gemma 4, and a bunch of other open-source models, whether from Chinese AI labs or ones released in the future, will be able to do that. The other trend I think is pretty clear is locally run models: models that run on your device specifically, which don't necessarily need to be trained on your data, but are local to you. The reason that's so important is that it's cheaper, you can run it privately, and it gives you quicker queries. It runs seamlessly, you don't have to wait, you don't have to rely on servers going down, you don't have to rely on a centralized data center running your compute. You can have it all locally on your phone. Those things don't sound significant until you have an app that runs locally on your device, which I think would be super cool to see, and I want to see more of this happening. Personally, I think Apple is going to be the company that leads us into this kind of world, because they have the biggest distribution: something like 3 billion active devices. I would love to run this model on my iPhone right now. So I think that's a trend we're going to see, and I think open models are the only way to unlock it.
Josh:
How cool would that be? We've got WWDC coming up pretty soon, and we're going to cover it on the show. That's going to be the Super Bowl for Apple. This is, what, two years after they fumbled Apple Intelligence? We're going to see what their new plans are this year, so I'm really excited to see their take on this, because like you mentioned, it's really powerful. I think most people listening to this have probably never run a model locally on their machine, but I feel there's something very empowering about it. If not just for generating your own intelligence, then for the privacy aspect: you know none of the information you're sharing is leaking out to any servers, no one's training on it, it's all yours to own. There's something really nice about that. And I think the final thing we're going to talk about this episode is why on earth Google would give this away. Because Google seems to be doing really well: they just signed a deal with Anthropic for their TPUs, they have Gemini, which is a powerhouse, they have the best world models and video models, they have amazing image gen. Why are they giving away the sauce? Ejaaz, do you have any idea?
Ejaaz:
I don't have a great answer, but I have some thoughts: one that argues in favor of them doing this and one that argues against it. The one in favor is the Android example. They open-sourced the entire thing and allowed anyone and everyone to hack away at different apps and launch them on the Play Store, or whatever it might be, and they gained a lot of mind share and market share by doing it. Now, is it as well curated and beautiful as iOS and the Apple App Store? Most people would probably argue not, but the point is they built one of the largest distribution moats because of it. I think this might be an example of them getting Google AI, not just a specific model, but Google AI, into the hearts and minds of everyone. And if they can tap into the locally-run-device audience, that could be a big win for them. Now, the argument against is: dude, you could have been using all this compute to train a better Gemini model and keep up with the frontier AI labs, and that's all that matters. Build a better coding model, because right now it kind of sucks, and then build all this other open-source stuff later. The number one race to win is best model, and currently you're losing. So I don't know. Do you have the same take?
Josh:
Yeah, that's probably right. I imagine it's a mixture of reasons. One of them is probably to feed their cloud flywheel, because we're talking about running these models locally, but how many people actually will? And of the ones that do, how many are going to quickly run up against ceilings because they want to do more and more, and then eventually just migrate over to the more powerful models, probably on Google Cloud services? I think there are a lot of reasons to become the infrastructure. The Android example is a great one; feeding the cloud flywheel is another strong one. And I think this is also a really nice side quest for Google in terms of optimizing for intelligence per bit, or whatever we're going to coin it as, the intelligence density of a model. This has the highest; it's much denser than Gemini. It's a useful practice, as they move forward to new models, to keep compressing intelligence per token, and if they can keep learning, publish those learnings, and keep iterating on that front, that's a huge win for Google and for everyone. Google's just doing a nice public service, a nice little public good. And the team there is doing really cool things with it. Logan Kilpatrick, for example, runs the Google AI Studio team. They've been publishing all of these models and making them super easy to use through Google AI Studio. So if you just go there, you can play with the two larger models, see how they compare to something like Gemini 3.1 Pro, and then decide whether you want to run these things locally, start pinging some APIs, or just use your $20-a-month plan with Anthropic or ChatGPT. But I think that's the Gemma 4 episode. We got it all covered. It's an amazing model, available for free to run locally on whatever machine you wish, because it's lightweight enough to fit on an iPhone or a Raspberry Pi. Once you download it onto your devices, you have free inference forever: you can run it 24/7 on whatever tasks you want, and it will cost you only the electricity to power the machine. I think that's pretty cool, and I'm glad Google is really stepping up to the plate with probably the leading USA frontier open-source model. And that's pretty cool.
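"Only the electricity to power the machine" is worth quantifying. A rough sketch of the monthly electricity bill for running inference 24/7; both the wattage figures (roughly 7 W for a Raspberry Pi under sustained load, 350 W for a GPU desktop) and the $0.15/kWh rate are illustrative assumptions, not figures from the episode:

```python
# Electricity cost of running a local-inference machine around the clock.
# Wattage and price-per-kWh are assumed values for illustration.

def monthly_electricity_usd(watts: float, usd_per_kwh: float = 0.15,
                            hours: float = 24 * 30) -> float:
    """Cost in dollars for a month of continuous operation."""
    kwh = watts / 1000 * hours
    return kwh * usd_per_kwh

print(f"Raspberry Pi (~7 W):  ${monthly_electricity_usd(7):.2f}/month")
print(f"Desktop GPU (~350 W): ${monthly_electricity_usd(350):.2f}/month")
```

Under these assumptions the Pi lands under a dollar a month, so "effectively free" holds for small hardware, while a GPU desktop running flat-out is closer to a cheap subscription.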
Ejaaz:
Yeah. And I'm curious what you, the listeners and watchers of this show, think yourselves. Go out and try this thing. If you don't want to download it, you can get access to it via Google AI Studio. Give it a few queries. Does it match up to your experience with Claude 4.6 and GPT 5.4? Would you replace your $20-to-$100-a-month subscription with something like this? Let us know in the comments or DM us on our socials; our X profiles are linked below as well. And yeah, that's pretty much it. I'm going to be trying out these models. It's definitely the leading US frontier open-source model, but I have to say, compared to it, the Chinese models are still kind of leagues ahead right now. I hope we see more adoption of open-source models going forward, and when that eventually happens, if there's a new OpenClaw breakthrough, you'll hear it first here on this show. We also did a cool episode covering some of the Chinese open-source models that were released last week; definitely go check that episode out as well. But aside from that, if you aren't subscribed to us, please do. It helps us out a lot. Turn on notifications. Even if you're listening to us on Spotify or Apple Podcasts, give us a rating and a review; it helps us out massively. Josh, any other parting words?
Josh:
Don't forget to share it with your friends and we'll see you guys on the next episode.
Ejaaz:
Yeah, see you guys.