Revealing Elon’s Secret AI Trading Bot: Is It Worth It?
Josh:
Imagine this, you give eight of the world's most powerful AI models $10,000
Josh:
each and tell them, go trade real stocks.
Josh:
No paper trading, but real money with real risk. And two weeks later,
Josh:
most of them have lost a painful amount of cash, which I guess is kind of expected.
Josh:
The kind of drawdowns that would get a human portfolio manager totally fired.
Josh:
But now, they ran the same experiment again, except this time with much higher stakes.
Josh:
There's $320,000 at stake. And we've talked about Alpha Arena before in a previous
Josh:
episode, which I highly recommend checking out.
Josh:
But now we have the new results from the new season, season 1.5.
Josh:
And what was exciting is that there was a very clear and obvious winner,
Josh:
but that winner was a mystery.
Josh:
We don't actually know or we didn't know who the winner was up until recently.
Josh:
In fact, it won all four of the trading competitions in
Josh:
this new season while leaving the other top models like ChatGPT
Josh:
5.1 and Google Gemini 3.0 fighting for
Josh:
second place. So at the core of this is, one, who is
Josh:
this model? And two, how on earth did they
Josh:
do it? How are they outperforming everyone so much as to make 65 percent in
Josh:
two weeks in one of these competitions? So Ejaaz, I want to walk everyone through
Josh:
what just happened, what the model is, and what Alpha Arena is. So give
Josh:
us the lowdown on who this was that made so much money.
Ejaaz:
Yeah, well, we will get into all of that today.
Ejaaz:
So Alpha Arena is basically a competition or test to see how well AI models can trade.
Ejaaz:
And they do this in a few different ways, Josh. Number one, they give each model
Ejaaz:
$10,000, as you mentioned.
Ejaaz:
And then they allow them to trade a range of different financial instruments
Ejaaz:
over a period of two weeks.
Ejaaz:
So there's like a season, two weeks, and we see which AI models do the best.
Ejaaz:
And they get all your AI models in there. You've got ChatGPT,
Ejaaz:
you have got Gemini, you've got Anthropic's Claude, and you have Grok as well.
Ejaaz:
And so they've gone through about two seasons now, and the results have been
Ejaaz:
absolutely crazy. So they started off with season one.
Ejaaz:
And you can think of this as like the degen crypto season.
Ejaaz:
They gave seven models, $10,000 each, and allowed them to trade crypto assets
Ejaaz:
like Bitcoin, Ethereum, stuff like that.
Ejaaz:
And they did this with something called perpetuals, so they could trade with leverage.
Ejaaz:
It was the only instrument that they were allowed to trade.
Ejaaz:
And the results were, as you'd probably expect, a lot of these AI models lost a lot of money.
Ejaaz:
Some of them actually ended up making a decent chunk of money,
Ejaaz:
and they were primarily Chinese models.
Ejaaz:
They were Qwen and, I think, DeepSeek that ended up making money.
Ejaaz:
So there were a lot of takeaways there. As you mentioned, we've got a previous
Ejaaz:
episode where we spoke about this.
Ejaaz:
Definitely go give that a watch. There's a lot of alpha in that one.
Ejaaz:
And then that brings us to season 1.5, where the AI models, instead of being
Ejaaz:
given crypto to trade, were given the ability to trade U.S. stocks.
Ejaaz:
And we're talking about equities, which is something that a lot of us listening
Ejaaz:
to this show are very familiar with. And I think this is for a few reasons, Josh.
Ejaaz:
Primarily, crypto is very volatile, and we kind of want to figure out how the
Ejaaz:
majority of money that is traded in the financial markets can translate into
Ejaaz:
AI models trading that. So one thing that they kept the same is that they
Ejaaz:
gave each AI model $10,000.
Ejaaz:
But there were a number of differences with Season 1.5. Number one,
Ejaaz:
they were allowed to trade U.S. equities.
Ejaaz:
Number two, there were two new models that were introduced.
Ejaaz:
One was a model called Kimi K2, which is a really good open source Chinese model.
Ejaaz:
But the other was this thing called a mystery model.
Ejaaz:
I'm going to reveal which model this was in a second. But before I do,
Ejaaz:
do you have any guesses as to what model this might have been?
Josh:
Well, I cheated. I know the answer. But what I think is very exciting about
Josh:
this is that like, I think it's important to highlight these models made hundreds
Josh:
to even thousands of trades per model. Yes.
Josh:
And the question that I want answered, more than this mystery
Josh:
model one, is: is this real signal, or is this just, I mean, as Luke said earlier,
Josh:
a GPU-intensive scratch-off game?
Josh:
Is there any real signal? And I guess we'll talk about the reality
Josh:
of that and what this means for your portfolio if you ever want to manage it. But to me,
Josh:
I think that's the important thing to highlight. We probably should just spill
Josh:
the beans, Ejaaz. Do you want to just tell them? Who's the mystery model?
Ejaaz:
I can't keep it in any longer. It was an unofficial version of Grok,
Ejaaz:
aptly named Grok 4.2 or 4.20 for the memers out there.
Ejaaz:
And this was revealed by none other than the Grok man himself, Elon Musk.
Ejaaz:
And the reason why this mystery model was getting so much attention,
Ejaaz:
Josh, was because it ended up being the winner. It made more money than any other AI model.
Ejaaz:
And what was more impressive is there wasn't just one competition being run throughout season 1.5.
Ejaaz:
There were four at the same time. So these AI models were running across four
Ejaaz:
different competitions at the same time. That was $320,000
Ejaaz:
at stake at any one time, which is a crazy amount of money to stake on
Ejaaz:
an experiment. That's a lot of money that could have been lost here.
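The $320,000 figure lines up with the other numbers quoted in the episode, assuming eight models per competition (the count given in the intro), $10,000 per model, and four competitions running at once. A minimal Python check of that arithmetic follows; the constant and function names are illustrative only.

```python
# Assumed inputs, taken from figures quoted in the episode.
MODELS_PER_COMPETITION = 8      # "eight of the world's most powerful AI models"
STARTING_CAPITAL_USD = 10_000   # per model, per competition
COMPETITIONS = 4                # run at the same time in season 1.5


def total_capital_at_stake(models: int, capital: int, competitions: int) -> int:
    """Total starting capital deployed across all competitions."""
    return models * capital * competitions


total = total_capital_at_stake(
    MODELS_PER_COMPETITION, STARTING_CAPITAL_USD, COMPETITIONS
)
print(f"${total:,}")  # $320,000
```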
Ejaaz:
And Grok 4.20 ended up performing the best.
Ejaaz:
Josh, I want to go through a few different stats here, which kind of like shows
Ejaaz:
how amazing this particular model was.
Ejaaz:
So firstly, for some context, there were four different competitions that were
Ejaaz:
being run that these AI models were being tested on.
Ejaaz:
Competition number one was something called New Baseline. This is basically
Ejaaz:
the ability for these AI models, on top of
Ejaaz:
trading stocks, to get access to all the common news that you and I can read
Ejaaz:
online and in newspapers to kind of like figure out, okay, what kind of news
Ejaaz:
would affect my stock positions.
Ejaaz:
They would also get access to sentiment data to see how kind of like the markets
Ejaaz:
and retail traders would kind of react to certain bits of news.
Ejaaz:
They had access to a much wider spread of data in competition number one.
Ejaaz:
Competition number two was called Monk Mode. They kind of amended the investing
Ejaaz:
prompt here. And so kind of like they traded more conservatively.
Ejaaz:
Competition number three was called Situational Awareness, Josh.
Ejaaz:
So each model had an awareness of other models trading and where they ranked relative to them.
Ejaaz:
So there was this kind of like ecosystem of peer pressure being put on by each model.
Ejaaz:
And competition number four was just outright degeneracy: max
Ejaaz:
leverage. You could only trade with like 20 to 50x leverage, which is just,
Ejaaz:
I don't think it's 50x, but like 30x, just a crazy amount of risk,
Ejaaz:
to test whether a model would take that risk or whether it would trade more
Ejaaz:
conservatively. Josh, did you have any reactions to the results of this competition?
Josh:
The results that we're looking at right now I actually found most interesting. This
Josh:
is from the New Baseline competition. It's basically the full-info mode.
Josh:
And one of the big differences between this mode versus previous competitions
Josh:
that have been held is like you mentioned earlier, it has access to a lot of data.
Josh:
This is the first time an AI trading model has had access to real time information
Josh:
outside of just looking at a chart. So I think,
Josh:
in that sense, this is the closest competition to how a human quant fund would actually operate.
Josh:
So if you're looking for high signal in terms of which AI can actually make
Josh:
you real money in the real world, this is the one.
Josh:
And what we're seeing here is that the Grok 4.20 model, the memetic mystery
Josh:
model, outperformed OpenAI's ChatGPT 5.1,
Josh:
the clear second place, by a fairly large margin.
Josh:
And those are the only two that actually made a profit. Everybody else lost money
Josh:
in the real-world competition, which to me signals a few things. One of them being,
Josh:
well, perhaps one is really good at
Josh:
understanding real-world information. Perhaps it understands company fundamentals
Josh:
better. Perhaps it just has access to better
Josh:
real-world information, like Grok with its access to X through xAI.
Josh:
So there are a lot of things to speculate about here, but for me, the New Baseline
Josh:
chart that we're looking at right now was the high-signal one. I'm like, oh my
Josh:
god, wait, this has the same type of information flows that I'm now getting. So
Josh:
now we're even, we're on the same playing field.
Ejaaz:
Okay, I actually had a different answer to that, which is I was more impressed,
Ejaaz:
Josh, by the situational awareness competition.
Ejaaz:
So this was a competition where each model had access to data and news,
Ejaaz:
but they also had awareness of who they were competing against.
Ejaaz:
So Grok 4.20, the winner, knew that GPT-5.1 was in second place.
Ejaaz:
And so he was always keeping an eye on GPT-5.1, being like, oh,
Ejaaz:
what trades is GPT-5.1 making? Why did they make that trade? Oh, that's interesting.
Ejaaz:
And then he would look at Gemini and be like, oh, what trades is Gemini making?
Ejaaz:
So he would have this awareness of his competitors, which you didn't have in
Ejaaz:
season one, where they were just kind of like trading in silos, right?
Ejaaz:
And why this competition was so interesting, Josh, is this was technically where
Ejaaz:
Grok 4.20 made the most money.
Ejaaz:
In fact, if you look at the top of this leaderboard right here,
Ejaaz:
the account value at the end of season 1.5 was $16,656,
Ejaaz:
which is technically a 60% plus return in two weeks on $10,000 worth of capital.
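To pin down the "60% plus" figure, here is a minimal Python sketch of the simple-return calculation, using the $10,000 starting capital and the $16,656 ending account value quoted above. The function name is illustrative, and this is only the plain percentage gain, not an annualized or risk-adjusted number.

```python
def simple_return(start_value: float, end_value: float) -> float:
    """Fractional gain or loss relative to the starting value."""
    return (end_value - start_value) / start_value


r = simple_return(10_000, 16_656)
print(f"{r:.2%}")  # 66.56%
```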
Josh:
I need it to take my money immediately.
Ejaaz:
Isn't that insane? Like, if you had to pick a competition where you
Ejaaz:
would have given an AI model money, just going from this data,
Ejaaz:
and I'm not saying you should do that, you would be most bullish on situational awareness.
Ejaaz:
And I'm going to make some inferences here that I haven't tested yet,
Ejaaz:
but it seems to imply that
Ejaaz:
this kind of competitive nature where the models were kind of aware and exposed
Ejaaz:
to their competitors' trades and thinking, and we're going to get to the model
Ejaaz:
chat thinking in a second, seems
Ejaaz:
to have given them a better trading advantage, at least in some cases.
Josh:
Yeah, so like you mentioned, one of my favorite parts, and I think we share this
Josh:
as one of our favorite parts about this competition in particular,
Josh:
is that you can actually see all of the trades.
Josh:
One thing about these private quant funds is you don't know what the hell is going on.
Josh:
But with these models, you can see exactly what they're thinking every time
Josh:
they think and make a decision.
Josh:
So maybe you guys can go through a few of them and see kind of what the model
Josh:
is thinking, how they're processing this real world data.
Josh:
And if there's any tips for us to learn from processing this real world data,
Josh:
because clearly they're a much better trader than I am.
Ejaaz:
Yeah. So I have a few examples pulled up here on the right side of the screen.
Ejaaz:
It's under model chat. By the way, any of you listening to this can go onto
Ejaaz:
this website and see for yourself and scroll through their hundreds and hundreds of posts.
Ejaaz:
But it basically gives us an insight into how each model thinks about a trade
Ejaaz:
that they currently either have open or they're thinking about opening or closing
Ejaaz:
or whatever that might be, right?
Ejaaz:
So it's like being in the mind of an actual investor and figuring out how they make their decisions.
Ejaaz:
An example here at the top of the screen is Gemini 3 Pro.
Ejaaz:
He goes, I'm betting on a breakout in NVIDIA, seeing a strong setup as it holds
Ejaaz:
support and leading the market with a target of $189 and a stop just below $180.
Ejaaz:
So what he's referring to there is kind of a typical quant style of trading
Ejaaz:
where it's kind of like he's looking at technicals, he's evaluating kind of
Ejaaz:
graphs, momentum of the stock price.
Ejaaz:
It's very price evaluated type of trading, right, Josh? But if you look just
Ejaaz:
below it, you've got GPT 5.1, which actually came in second at the end of this
Ejaaz:
competition, who goes, my analysis indicates continued strength in AI names
Ejaaz:
like NVIDIA and Microsoft.
Ejaaz:
So I'm holding existing long positions through the weekend and potential macro event risk.
Ejaaz:
Now, the point I want to make about this particular model is it's less price
Ejaaz:
specific and it's more focused on just kind of general themes,
Ejaaz:
news and data that it's seeing outside of price.
Ejaaz:
And that really goes to demonstrate that some of these models are very kind
Ejaaz:
of price and quantitative focused, whereas other models are kind of more thesis
Ejaaz:
driven over a shorter period of time.
Ejaaz:
And it kind of gives rise to these types of personalities, right, Josh?
Josh:
Yeah, well, now we have to answer the uncomfortable question, which is:
Josh:
is this evidence that Grok is some kind of money-printing god?
Josh:
Or is this just really well-produced content that happens to involve real money?
Josh:
And that kind of comes down to understanding the AI, understanding the personalities,
Josh:
understanding how each model considers these trades and how they place themselves
Josh:
in different positions.
Josh:
So I kind of want to go through one by one, all of the models and kind of what
Josh:
their personalities are like.
Josh:
We see with DeepSeek a lot that it behaves, and we mentioned on a previous episode
Josh:
as well, it behaves like a very disciplined quant fund.
Josh:
And DeepSeek, for those that don't know, it's an open source Chinese model.
Josh:
It is very systematic, very mathematical, very comfortable with leverage,
Josh:
but able to hedge and adjust mid-trade based on its decisions and new information.
Josh:
So that's DeepSeek, and Qwen even is kind of similar to this.
Josh:
If you remember from the last episode, Ejaaz, Qwen was my early favorite.
Josh:
I had hoped that Qwen was going to win.
Josh:
Unfortunately, that's not the case at all in season 1.5. Qwen has gotten crushed
Josh:
right there with DeepSeek.
Josh:
I can kind of imagine it as more similar to me, maybe that's why I resonated
Josh:
with it, where it has one big thesis and then it sizes aggressively around that thesis.
Josh:
So if you remember, Qwen would only buy Bitcoin or Ether in the last one and
Josh:
it wouldn't buy any other altcoins.
Josh:
It just had a thesis that these major coins were going up, nothing else was.
Josh:
Claude is interesting. It's very
Josh:
reflective of how the actual Claude model works when you engage with it.
Josh:
It's very patient and it's thoughtful, but it occasionally sizes up too much
Josh:
and then it gets crushed by leverage.
Josh:
So, as we go through these, Ejaaz, I also noticed you assigned a masculine
Josh:
personality to Gemini. You said "he" when you were talking about Google Gemini,
Josh:
and that's kind of because it's daddy, right? Like, Gemini's been
Josh:
the big boy on top. But in
Josh:
this trading competition, I don't know if it is. I was going through the trades,
Josh:
and it very much panic flip-flops from short to long after losing. And
Josh:
in a way, Gemini was most reflective of retail behavior, and I'm not
Josh:
sure what we could tie that to. But Gemini was very reactionary, where if it lost
Josh:
money, it would flip its position, and if it gained money, it would kind of hedge quickly.
Josh:
So that was interesting. And then we have GPT-5, which has very sophisticated reasoning.
Josh:
But in season one, it over-traded and over-leveraged and got absolutely wiped
Josh:
out. And it was very timid in the way that it went about this.
Josh:
So that's kind of how you can think about these.
Josh:
The final one, which is the secret model, Grok 4.2. If we know anything about
Josh:
Grok, we know that it is a very high risk taker, but a calculated risk taker.
Josh:
And that's probably what put it at the top there. So that's kind of how I would
Josh:
consider all of these models.
Josh:
They're a little different, and if you've used these
Josh:
yourself, you can kind of understand the thinking that goes into the trades.
Ejaaz:
Yeah, I want to dig into a few
Ejaaz:
things around the personality, or rather the trading styles, here, Josh, because
Ejaaz:
it may not be as explicit as we kind of lay it out. So Grok 4.20 was
Ejaaz:
the winner, right, by far, and it made money. It was at the top across all of the
Ejaaz:
competitions, all four competitions. That's great. But did you look at the results of Grok 4,
Ejaaz:
its predecessor?
Josh:
It was absolutely crushed.
Ejaaz:
It was the worst performing model in this entire competition,
Ejaaz:
which is crazy because in season one, where it was trading crypto,
Ejaaz:
it came in at second or third.
Ejaaz:
And for about 75% of the competition, Josh, it was number one.
Ejaaz:
So it had some kind of an advantage trading kind of very riskily, right?
Ejaaz:
And that might be because of the nature of the instruments that it was trading.
Ejaaz:
Crypto is very volatile and it was kind of going blase.
Ejaaz:
So when it was like 20x bullish Bitcoin, it benefited a lot when Bitcoin price
Ejaaz:
went up, but obviously it like suffered when it went down.
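To make the leverage point concrete, here is a rough Python sketch of how a 20x leveraged long position amplifies price moves. It is a simplification under stated assumptions: it ignores fees, funding payments, and exchange liquidation rules, and simply caps the loss at the full margin posted. The function name is hypothetical.

```python
def leveraged_equity_change(price_change: float, leverage: float) -> float:
    """Approximate fractional change in account equity for a leveraged long.

    Simplification: ignores fees, funding payments, and early liquidation;
    the loss is capped at -100% of the margin posted.
    """
    return max(price_change * leverage, -1.0)


for move in (0.02, -0.02, -0.05):
    change = leveraged_equity_change(move, leverage=20)
    print(f"price {move:+.0%} -> equity {change:+.0%}")
```

Under these assumptions, a 2% move in either direction becomes roughly a 40% swing in equity, and a 5% move against the position wipes out the margin entirely.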
Ejaaz:
It's interesting to see the difference between these two models in 1.5, right?
Ejaaz:
Grok 4.20, the winner, seems to be a kind of more mature version of Grok 4.
Ejaaz:
It seems to be thinking more about its trades.
Ejaaz:
It has more kind of like risk percentiles and boundaries in place,
Ejaaz:
whereas Grok 4 seems to be its kind of usual degenerate self.
Ejaaz:
And I don't know how much of that is reliant on the fact that it's trading stocks,
Ejaaz:
which is generally a less volatile market versus Grok 4.20 being a more thesis
Ejaaz:
driven, sensible trader, as you kind of described.
Ejaaz:
The other one that we have to call out because it's the elephant in the room
Ejaaz:
here, GPT-5 came in at second in season 1.5.
Josh:
Right? 5.1. 5.1.
Ejaaz:
Sorry, 5.1, right? In the previous season, season one, it was the second worst
Ejaaz:
performing. No, sorry, it was the worst performing.
Josh:
It was horrible.
Ejaaz:
It was GPT-5.
Josh:
It was an abomination.
Ejaaz:
And Gemini. So whatever OpenAI has cooked up in the .1, congrats.
Ejaaz:
Because you must have trained on some kind of financial data, or you've
Ejaaz:
kind of implemented a risk trading strategy that made
Ejaaz:
it a lot more sensible, because it made some really great trades this season.
Ejaaz:
So those are just two different jumps from season one to 1.5 that I had to call out.
Josh:
Yeah, it makes me excited to see these
Josh:
significant improvements with incremental models, because we
Josh:
normally talk about 5 to 5.1 being pretty marginal, like
Josh:
there's nothing really noteworthy or exciting. And yet the results, in this
Josh:
small sample size at least, are pretty reassuring that, hey, there is something
Josh:
new going on under the hood. And maybe this is an appropriate time to address,
Josh:
I guess, the limitations, the kind of bear case of this, starting with the
Josh:
sample size. We do have to say, I mean, this is two weeks, Ejaaz. This is not a long time. They
Josh:
placed some trades. Some people maybe got lucky. Some models maybe did not.
Josh:
Is there any real signal here?
Josh:
I'm curious about your take. Do you think this is reflective of future performance?
Josh:
Like, what here is actually valuable versus what here is actually kind of lucky?
Ejaaz:
I don't think we have enough information to make that call, at least for me.
Ejaaz:
I'll speak for myself personally.
Ejaaz:
The real test is, you know, I asked myself before we recorded this episode,
Ejaaz:
would I give my money to Grok 4.2, the winner that won across all categories?
Ejaaz:
And the simple answer is, like, no. I don't know if it's going to
Ejaaz:
repeat that over week three, week four, week five. It was only two weeks, to your
Ejaaz:
point, right? So I want to see this experiment kind of
Ejaaz:
repeated like a million times before I'm like, okay, that's cool. Even then, it's
Ejaaz:
still kind of risky, right? It's like, I can justify giving my money
Ejaaz:
to a human that I can kind of relate to, that I can call up and speak to,
Ejaaz:
less so when it comes to an AI model, right? But maybe that's my thing;
Ejaaz:
it needs to kind of evolve.
Ejaaz:
The other way I'm thinking about this is there's just a lot of unknowns around this, Josh, right?
Ejaaz:
Like, I can see its thinking, I can see how the model completes its trades, but
Ejaaz:
I don't really know what's going on under the hood. Is this just kind of like a
Ejaaz:
pattern matching thing?
Ejaaz:
Does it inherit the risks that a lot of humans have already taken,
Ejaaz:
because it's trained on the same kind of corpus of trading data that we have
Ejaaz:
kind of been evaluated on? Or is it kind of net better?
Ejaaz:
Do you feel the same or?
Josh:
Yeah, it's probably, I mean, it's not the new gold standard of AI benchmarks.
Josh:
But it is a standard that I think is interesting. Because this is a benchmark
Josh:
that happens in the real world with real dynamic data that cannot be gamed.
Josh:
So in that case, I love it.
Josh:
But I saw one writer, they called it Schrodinger's Benchmark,
Josh:
because it's simultaneously serious and degenerate at the same time.
Josh:
And it's like it's entertainment with real money that happens to produce some
Josh:
legitimate insights about AI behavior, but it's not really indicative of future
Josh:
returns at this small of a sample size, at least.
Josh:
And that's kind of how I feel about it. There is one breakthrough that we
Josh:
mentioned earlier that does provide real value, which is the transparency.
Josh:
Every trade being on chain and every reasoning step being logged is actually really
Josh:
helpful for understanding how these models think and how you can consider thinking.
Josh:
So for example, you could say, show me every decision Grok 4.20 made on Tesla after
Josh:
the Fed announcement or something like that. And it'll walk you through a chain of thought
Josh:
and, if anything, make you into a better investor.
Josh:
Would I trust the model with my own money? No.
Josh:
Maybe a little bit, maybe with a small sample size. How
Josh:
much is it? That's a
Josh:
good question. I'd give it a couple thousand dollars to play around with and see
Josh:
what happens. I think that would be interesting and fun, and it's
Josh:
low enough stakes, but I would trust it enough to not lose it. Like, I'd say I
Josh:
would probably trust Grok more with my money than I would the average day trader
Josh:
off the street, which, granted, they don't have a very good reputation. But I
Josh:
think there is some sort of an edge there that doesn't exist in the average person.
Josh:
And if you assume that these models are going to continue to get better and
Josh:
better, well, you have to assume that they're going to form some sort of an
Josh:
edge, but I don't know how much.
Josh:
It's an interesting question, because as a quant trading fund,
Josh:
or as just a trader in general, if your job is to make money off
Josh:
of trading, what are you doing about this information? Are you leaning into AI?
Josh:
Are you trying to get these models to help you with your information flows and make decisions?
Josh:
Are you using them to help you actually transact trades? Or are you just kind
Josh:
of looking the other way and saying, oh, this is just a dumb experiment to benchmark
Josh:
models, there's no actual signal here? And the answer is probably somewhere in the middle, right?
Ejaaz:
Yeah, I mean, well, my initial reaction to that is,
Ejaaz:
Okay, quant funds already use algorithms. It would make a lot of sense if they
Ejaaz:
started using AI algorithms, right?
Ejaaz:
If you could get a smarter algorithm to trade for your fund, absolutely, right?
Ejaaz:
So it's a no-brainer to me that these hedge funds, quant funds are going to
Ejaaz:
be using AI, probably already using AI.
Ejaaz:
Where I have maybe a hot take is that the transparency is just a nice to have.
Ejaaz:
It is in no way going to win out with the best of models.
Ejaaz:
Why? Because if you have an AI model that is like better than all the other
Ejaaz:
AI models at trading, why would you make that public?
Ejaaz:
Right. So I'm kind of torn on this thing,
Ejaaz:
because I think the transparency is a really good thing in kind of like bringing
Ejaaz:
up the floor of trading credibility for people that get access to this type of information.
Ejaaz:
Like I have loved reading through these kind of like trade logs here,
Ejaaz:
seeing how each model thinks and being like, okay, yeah, wow.
Ejaaz:
I actually didn't think about that myself when I was buying that stock.
Ejaaz:
Right. And these are like stocks that I've seen that I, that I can buy,
Ejaaz:
right. The Amazon trade, the NVIDIA trade, I'm just like, oh, okay.
Ejaaz:
I didn't think about that, right, yesterday whenever they made this trade.
Ejaaz:
If I am a hedge fund, I'm like, yeah, if I've fine-tuned a model that is like
Ejaaz:
beating all these models, I don't really want to expose that really.
Ejaaz:
So it's kind of like a push and pull.
Ejaaz:
The other thought I had, Josh, is, and maybe this is kind of like kind of semi-adjacent
Ejaaz:
to what we're discussing here.
Ejaaz:
I couldn't get the thought out of my head that if you could get Grok in X
Ejaaz:
trading some kind of money for you, or guaranteeing you like a 5% to 10% annual
Ejaaz:
return, that is something that, if framed correctly,
Ejaaz:
I would put some money into, right?
Ejaaz:
Maybe not over two weeks, but
Ejaaz:
maybe over an adjusted kind of yearly period would be super cool to see.
Josh:
Yeah, it's such a fun question to ask:
Josh:
what happens when this kind of system runs for two years, but with,
Josh:
like, let's say a large pension management fund that just wants a manager
Josh:
that doesn't take fees and does a pretty good job.
Josh:
Like, is there going to be enough trust in these systems to reliably place money at scale with them?
Josh:
And you have to assume, given the signal this early on, that the answer will be yes.
Josh:
The question is, how much of a yes will it be?
Josh:
What percentage of management will be AI as it gets better over time?
Josh:
And the sample size sucks. I wish it was more than two weeks. I wish it was two years.
Josh:
In two years from now, think about the progress we're going to see and what
Josh:
type of impact that's going to have on trading models.
Josh:
So this is, it's interesting. It's fascinating.
Josh:
In fact, I'm really curious to actually run this experiment for ourselves.
Josh:
I'd love to try to come up with a little trading model that runs these things
Josh:
and test it out because it's fun and there is some sort of an edge there.
Ejaaz:
I would say, okay, if I were to summarize my lesson from this entire competition
Ejaaz:
or experiment so far, Josh,
Ejaaz:
it is I'm not convinced to give AI models money to trade, but I'm convinced
Ejaaz:
to use AI models to help me trade.
Ejaaz:
So kind of like a human and AI model kind of work together and kind of become
Ejaaz:
a better trader overall, I think is the main takeaway for me here. Do you share the same?
Josh:
It's funny. I mean, this is how agents work today, right?
Josh:
Like if you go on ChatGPT and you say, go book me a reservation,
Josh:
it'll take you to the finish line.
Josh:
And then you as the human provide the final filter and approve or deny.
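The propose-then-approve workflow described here can be sketched in a few lines of Python. This is a hypothetical illustration, not how Alpha Arena or any real agent product works: the `TradeProposal` type, the `review_proposals` function, the sample proposals, and the 100-share approval rule are all made up for the example, and nothing here places real orders.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TradeProposal:
    ticker: str
    side: str       # "buy" or "sell"
    quantity: int
    thesis: str     # the model's stated reasoning


def review_proposals(
    proposals: List[TradeProposal],
    approve: Callable[[TradeProposal], bool],
) -> List[TradeProposal]:
    """Return only the proposals the human reviewer approved."""
    return [p for p in proposals if approve(p)]


# Hypothetical proposals; in practice these would come from a model.
proposals = [
    TradeProposal("NVDA", "buy", 5, "Holding support and leading the market."),
    TradeProposal("TSLA", "buy", 500, "Momentum play with an oversized position."),
]

# Stand-in for a human decision: reject anything over 100 shares.
approved = review_proposals(proposals, approve=lambda p: p.quantity <= 100)
print([p.ticker for p in approved])  # ['NVDA']
```

The point of the design is that the model only ever produces proposals with a thesis attached; the approval callback is the single place where a person (or a rule the person wrote) decides what goes through.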
Josh:
And I think that's probably the happy middle ground while
Josh:
we still don't really trust these models too much: give me
Josh:
the thesis, give me the trade, I will either approve
Josh:
or deny, and that's how the money gets managed. So
Josh:
it's cool. This is a great experiment. I love that we got season 1.5.
Josh:
I mean, it's fascinating. Even more fascinating is that we
Josh:
have an early look at Grok 4.2, which by all
Josh:
accounts is the best trading model in the world. Where will
Josh:
it rank in the other benchmarks? We will see. We will be covering it as soon as
Josh:
it comes out. But I guess that's really it for this episode on season
Josh:
1.5. The question I want to leave everyone with is, I mean, would you trust
Josh:
an AI with part of your portfolio? Like, how much money would you actually
Josh:
give to an AI, currently Grok 4.2, which just made
Josh:
60% in two weeks in one of these trading competitions? Is that enough for you to risk your money?
Josh:
Or is it still just this dumb AI system that you don't really trust?
Ejaaz:
Well, if you're interested in this experiment, Josh and I were actually discussing
Ejaaz:
potentially giving you guys a tutorial on how to use an AI to trade money
Ejaaz:
for you, kind of like this experiment, an n-of-1 experiment, but our own.
Ejaaz:
But we want to get a little more signal from you guys. Let us know in the comments
Ejaaz:
whether this is something that you'd be interested in seeing.
Ejaaz:
And I have, Josh, I have a requirement for the listeners
Ejaaz:
if we do want to put the tutorial out. Our last video that we did on AI trading
Ejaaz:
reached 100,000 views and 3,000 likes.
Ejaaz:
So I'm not going to ask for the 100,000 views, but I will ask for the likes.
Ejaaz:
If this video can get more than 3,000, if it gets 3,000 likes,
Ejaaz:
we will definitely put out that tutorial by the end of the year.
Ejaaz:
And we have a lot of thoughts around this, about how we're going to do it.
Ejaaz:
We're super excited to do it. So help us get there.
Ejaaz:
It is another week of really exciting news. Josh, I don't know if you saw the
Ejaaz:
rumors. Did you see the rumors about OpenAI?
Josh:
Tell me, fill me in.
Ejaaz:
About OpenAI releasing a potential new groundbreaking model?
Josh:
As a matter of fact, Polymarket is showing that OpenAI is very favored to
Josh:
release the best model of the year.
Josh:
And last I checked, Gemini is the best model of the year. So that implies we're
Josh:
getting something big in the next few weeks.
Ejaaz:
I think we will. And like you said, Polymarket is kind of revealing
Ejaaz:
its hand, so maybe there's some inside information coming out here.
Ejaaz:
So stay tuned to Limitless. Put the notifications on,
Ejaaz:
guys, and also subscribe if you want to get the latest videos.
Ejaaz:
We put out the best content out there.
Ejaaz:
It's unchallenged right now. Josh and I are sitting here unchallenged.
Ejaaz:
You have to like and subscribe if you want to get our content on your feed.
Ejaaz:
Thank you so, so much for listening. Again, let us know what you thought of
Ejaaz:
this episode in the comments. Get that like number up and we will see you on the next one.
