Meta’s New AI Can Predict Your Emotions Better Than You
Josh:
[0:03] Imagine if you could predict how someone's brain lights up just by knowing what movie they're watching. Well, Meta's new AI model is kind of doing exactly that. So in this episode, we're going to dive into how Meta's AI team just won a big brain-modeling challenge by building a model that can literally take a movie's video, audio, and text and then predict how your brain would respond if you watched it. It's not mind reading, but it's close, and it's a fascinating glimpse into how AI is learning
Josh:
[0:28] to understand our brains from the outside looking in. So this is really interesting. I read the report, I went through the paper. Ejaaz, it's pretty cool. Walk us through it. What happened here?
Ejaaz:
[0:39] Yeah. So this was the equivalent of a science competition, but for the best tech companies in the world, and Meta just won it. The invention they made is, as you said, kind of like this brain-scan AI model. But what's cool about it is it can predict whether you are going to like a movie or a video, or whether you're going to hate it. And it's not just the entire video; it's different scenes of the video, maybe a certain actor or a certain kind of illusion, whatever that might be. And the reason why this is so cool is, just imagine a world where you could get kind of personalized content.
Josh:
[1:21] Imagine like a
Ejaaz:
[1:21] Personalized Netflix that you watch every day, Josh, but instead of being bored by the same plot lines or the same twists, you are just constantly surprised. And if your friend came over and watched the exact same show, they would see something completely different, right? What I thought was astounding about this is not just the effects it has on content, but also how the model works. So it's a 1-billion-parameter model. And if anyone has ever listened to this show before, you know that that is minuscule compared to the models we typically talk about; we typically talk about trillion-parameter models, so 1 billion is tiny by comparison. But what is cool about this is it's multimodal. So we're not just talking about an LLM here that ingests words and characters and spits words back out at you. It's ingesting images, it's playing through video, it's ingesting audio, and it's compiling all of that in a really tiny model and figuring out what you're going to think of things, how you're going to react to them. I just think that's awesome.
Josh:
[2:26] Yeah. Okay. So let's explain what's happening here. They won a global competition called Algonauts, and there were 260 teams competing. For people who aren't familiar,
Ejaaz:
[2:37] Like I was. Algonauts?
Josh:
[2:38] Apparently, there is an Olympics for brain modeling, which I didn't know, and that's what this Algonauts competition is like. And the challenge was to predict the real fMRI brain responses from what someone is seeing, hearing, and reading on screen. So the winning model, it was called TRIBE, short for Trimodal Brain Encoder. We talk about multimodality. We'll get into this in a sec.
Josh:
[2:58] It topped the leaderboards with a correlation score of 0.2146, which sounds small, but in brain science, that's gold. So basically, what it's doing is like guessing how your friend will react to a movie scene. When will they laugh? When will they tear up? When will they get confused? Except instead of emotions, we're talking about thousands of tiny little patterns in your brain's activity.
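[Editor's note: in encoding-model challenges like Algonauts, a score like this is typically the Pearson correlation between the model's predicted fMRI time series and the measured one, averaged across brain parcels. Below is a toy sketch of that computation with made-up data; it is not Meta's evaluation code, just an illustration of why a "small" correlation is respectable for noisy brain signals.]

```python
import numpy as np

def pearson_r(pred, actual):
    """Pearson correlation between two 1-D time series."""
    pred = (pred - pred.mean()) / pred.std()
    actual = (actual - actual.mean()) / actual.std()
    return float(np.mean(pred * actual))

# Toy data: predicted vs. measured activity for 1000 brain parcels
# over 300 fMRI time points. Real challenge data comes from many
# hours of subjects watching movies inside a scanner.
rng = np.random.default_rng(0)
actual = rng.standard_normal((1000, 300))
# Predictions that share some variance with the truth, plus noise.
pred = 0.25 * actual + rng.standard_normal((1000, 300))

scores = [pearson_r(pred[i], actual[i]) for i in range(len(actual))]
print(f"mean correlation across parcels: {np.mean(scores):.4f}")
# Prints roughly 0.24, the same ballpark as the leaderboard's 0.2146.
```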
Josh:
[3:20] So quickly getting into how this works. TRIBE takes in three data streams: video, audio, and text. The video part uses Meta's model called V-JEPA 2. The model names are always so bad, but basically this fancy model that Meta makes is used to understand visual details. It recognizes faces, it recognizes movement in the scenes, colors, and the different types of scenes, and it kind of understands what implications each of those traits has on your brain stimulation. Then there's a second part to this, which is audio. It uses a model called Wav2Vec 2.0, another super weird name, but it doesn't matter. Basically, what it does is interpret tone, pitch, and sound patterns, things like music swelling, or explosions, or any sound effects, and analyze the direct impact of that audio stimulus on your brain. And then the third one is text, and that uses a model that we're very familiar with, called Llama 3.2. That's their open-source model, actually, so anybody can go and use it. What it does is process the dialogue and the captions, take the implications of what that means, and work out how that would impact your brain. So you could kind of think of it like three detectives, each investigating from a different angle. We have the sights and sounds, and then we have the script, and they're pulling all the clues together before making a call on what they think you will feel.
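[Editor's note: to make the three-detectives picture concrete, here is a minimal PyTorch sketch of a trimodal encoder: three per-modality feature streams, stand-ins for V-JEPA 2, Wav2Vec 2.0, and Llama 3.2 embeddings, are projected, fused, and mapped to predicted fMRI activity. The class name, dimensions, and simple fusion head are illustrative assumptions, not Meta's published TRIBE code.]

```python
import torch
import torch.nn as nn

class TrimodalBrainEncoder(nn.Module):
    """Toy TRIBE-like model: fuse video/audio/text features into a
    prediction of fMRI activity across a set of brain parcels."""

    def __init__(self, d_video=1024, d_audio=768, d_text=2048,
                 d_model=512, n_parcels=1000):
        super().__init__()
        # One projection per modality; inputs stand in for features
        # from V-JEPA 2, Wav2Vec 2.0, and Llama 3.2 respectively.
        self.proj_video = nn.Linear(d_video, d_model)
        self.proj_audio = nn.Linear(d_audio, d_model)
        self.proj_text = nn.Linear(d_text, d_model)
        # A small transformer mixes information across time windows.
        layer = nn.TransformerEncoderLayer(
            d_model=3 * d_model, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(3 * d_model, n_parcels)

    def forward(self, video, audio, text):
        # Each input: (batch, time_windows, feature_dim).
        fused = torch.cat([self.proj_video(video),
                           self.proj_audio(audio),
                           self.proj_text(text)], dim=-1)
        return self.head(self.fusion(fused))

# Toy forward pass: one movie clip as 10 time windows of features.
model = TrimodalBrainEncoder()
fmri_pred = model(torch.randn(1, 10, 1024),   # "video" features
                  torch.randn(1, 10, 768),    # "audio" features
                  torch.randn(1, 10, 2048))   # "text" features
print(fmri_pred.shape)  # torch.Size([1, 10, 1000])
```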
Josh:
[4:39] And they even trained it to handle missing data. So if there's no transcript, for example, it can still predict brain activity just by using sound and vision.
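[Editor's note: one standard way to train a model to tolerate missing inputs is "modality dropout": randomly blanking out entire input streams during training so the model learns to predict from whatever remains. The sketch below is an assumed illustration of that idea, not TRIBE's actual training recipe.]

```python
import torch

def modality_dropout(video, audio, text, p_drop=0.2, training=True):
    """Randomly zero out whole modalities during training so the
    model learns to predict from whatever streams remain."""
    if not training:
        return video, audio, text
    streams = [video, audio, text]
    keep = [torch.rand(1).item() > p_drop for _ in streams]
    if not any(keep):  # never drop every modality at once
        keep[torch.randint(len(streams), (1,)).item()] = True
    return tuple(s if k else torch.zeros_like(s)
                 for s, k in zip(streams, keep))

# Toy usage: each stream independently survives ~80% of the time.
v, a, t = (torch.randn(1, 10, 1024), torch.randn(1, 10, 768),
           torch.randn(1, 10, 2048))
v2, a2, t2 = modality_dropout(v, a, t)
```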
Josh:
[4:46] It can also transcribe the voice in real time to recreate the transcript and feed that into the model. So it's this really impressive trimodal thing that they managed to pull off. And, I mean, of course Meta wins; I would imagine that of the 260 teams, they're the most well-capitalized by a couple orders of magnitude. But it's still really impressive what they managed to pull off. So what does this mean? To me, this is really cool. I am a big fan of collapsing the distance between the brain and the actual input and output of a computer. And if we can start to, not read it, because this isn't mind reading, but if we can start to anticipate it, if we can start to understand the impulses and transcribe that into data, that to me seems super interesting. We're going to talk about a wrist device that they announced a little earlier, but the first time I really saw this was with the Apple Watch, where just the action of moving your fingers will actually trigger an input on your device. And it feels like it's reading your mind, even though it's not really; it's just taking a good guess. And I think when it comes to how we interact with computers and how we engage with content, the fact that a model can predict what type of content is going to most likely stimulate us...
Josh:
[6:02] It seems like something that is really exciting, in the sense that it can create the most compelling content in the world, and also really frightening, in the sense that, well, if you have TikTok on this and it understands all the impulses from your brain and exactly how to optimize for you, you can get some pretty amazingly addictive content. So to me, this is exciting for two reasons: you can get great content, and you can engage with computers even faster because they can anticipate your actions. But it also means, hey, we can design some pretty enticing, addictive experiences, because we actually know exactly how your brain works, down to the audio wave. So that's kind of my take on the implications of this. Did you have any further ones?
Ejaaz:
[6:45] Yeah, I think of this as a double-edged sword. If I were to guess where the pinnacle
Ejaaz:
[6:52] of all this technology is gonna end up, it's gonna be a really slick brain-computer interface. What I mean by that is, we're trying to replicate human intelligence, and we can't do that just with words. We need vision, we need audio, we need feeling, all of these kinds of things. So we're developing all these different kinds of devices. We've got robotics, we've got different kinds of AI hardware, robotaxis, different screens, cell phones, VR goggles, everything to try and emulate and simulate human intelligence. So with this new development, it's really optimistic in the sense that it's helping us get to that stage I just described. But the flip side is that this could be really bad for us as well. Do you remember a few years ago, Josh, there was this documentary on Netflix called The Social Dilemma? I think it was called The Social Dilemma.
Josh:
[7:42] Oh, yeah. Yeah, I do, actually. Yeah. It went viral for one singular reason.
Ejaaz:
[7:48] It unpacked and revealed how Facebook's and Instagram's algorithms worked. And it basically described that they knew everything about you. They knew what you were going to buy before you even bought it. They knew what you were going to like before you liked it. They knew the friends you were going to make before you'd even met them. That's insane. And I think that this is that on steroids, right? Because it's now going to apply to content that doesn't even exist. I mean, think about the applications here. In the optimistic sense, you could be a movie director and say, I don't know whether this script is going to be cool. Will people like this scene? Maybe, maybe not. But let me just run it through this brain scanner, or this brain simulator rather, and test it on maybe a hundred subjects, and see whether it's appealing to the demographic or audience that we're pitching to. That's really optimistic. But on the other side, if I am a Meta shareholder, and again, I'm LARPing as a Meta shareholder here, I'm thinking: well, I want retention to go up, I want the user base to go up, and the best way to do that is to keep people's eyes on the screens of our apps. So if we can perfectly tailor content, and if we can be the producers of content as well, we don't need to rely on users. Heck, we'll just use our AI models to create it ourselves and write the scripts. We then own the entire stack and the users and the retention, and we keep making
Ejaaz:
[9:11] money. That's my doom thesis. But yeah, that was my take.
Josh:
[9:13] That checks out. Yeah. And we're getting interesting takes from other people too, because there's the neuroscience camp, which finds this interesting because, I mean, it helps map which brain areas respond to language, music, and visuals, and how they integrate over time. So for people who are in the brain world, this is fascinating, and AI has just unlocked a new field to look into. And then for AI researchers,
Josh:
[9:32] Well, it shows the way foundation models process information isn't random. It actually lines up with the way human brains do it, at least at some higher level, which implies that maybe we are not so different from an LLM as we thought. Maybe we are actually just predicting next words. If we could have an AI model predict our feelings pretty consistently, then you could have an interesting conversation about AGI too: what really is the difference between a model's token prediction and how our brains predict? So that's just this weird edge case. And then there's another really cool area that I was thinking of, which is education. When you're learning things, when you're being taught things, certain lessons and certain ideas create more of a cognitive load on your brain than others. And if you're aware of exactly the amount of load you can deliver to a brain, then you can actually optimize lessons, optimize education, even optimize work, for the maximum your brain is comfortable handling. And it can detect the impulses and the stimulus as you go and figure out the best way for you to optimize yourself. So let's say you're doing a math lesson, and it has a bunch of problems you need to solve: it knows exactly how far it can push you before you reach your breaking point. And to me, that's interesting. It
Ejaaz:
[10:56] Knows not to feed me educational TikToks when I'm coming back from a night out, basically, which I think is a major unlock. Yeah, the education thing is actually great; I didn't think about that. I kind of immediately thought of Chinese TikTok. You know how they say Chinese TikTok is super educational, informative, and productive for its audience, versus Western TikTok, where it's all just kind of slop dances and entertainment? I kind of think about that immediately. I also think about how this will apply to news and media, right? Over the last decade, we've seen this rise of clickbaity titles, where dying media corporations are trying to figure out ways to keep users and get people to join their subscriptions and stuff like that. I wonder whether that takes another step up in terms of forming narratives, and what that means for truth-seeking platforms, like maybe X, or at least that's how they advertise themselves, versus traditional media as well. But I was also thinking, Josh, this isn't the first hardware push that Meta's made, right? Didn't they come up with a neural wristband literally a month ago that can act as some kind of interface? So you can make certain gestures and control an interface?
Josh:
[12:12] Yeah. So they have this hardware device, a wrist device, that will actually anticipate and understand the movements of your fingers. And it's pretty fascinating. They refer to this as the first high-bandwidth, generic, non-invasive neuromotor interface, which is a long way of saying it can understand the muscle activity
Josh:
[12:33] That happens in your hands, without an implant. And this is kind of the first time that we've seen this at this level of precision. So in the past, even with the Apple Watch example that I described earlier, you can tap your fingers and it recognizes the muscles flexing in your arm. This will actually recognize gestures, recognize discrete inputs, recognize handwriting. And what we're seeing on screen now is a person writing by hand, and it's automatically dictating the words into text. It's good up to 21 words per minute, with an additional 16% boost when it's personalized to your handwriting. So why does this matter? Well, it's not the first interface of its kind,
Josh:
[13:12] But the wristband is easy to wear, reliable, and fast, and it feels like a leap forward in how we engage with human interfaces. So, I mean, in the past we've talked about the Apple Vision Pro and the increased adoption of spatial interactions with computing, where now we have virtual worlds, we have virtual reality, we're talking about goggles. If you're able to take away the need for a screen and two fingers, and can just place these sensors on your wrist, it's much less invasive, much more natural, and also much richer in terms of your movements. When you're interacting with a screen, you're tapping a single point, or maybe multiple points for multi-touch, but with the hand you have all of the free range of motion, you have the angles, you have gravity, you have gyroscopes. You can really personalize
Josh:
[13:58] how you interact with these machines. And this seems to be the trend with Meta, where they're just kind of trying to understand more of how the human brain works, how the human body works, and how we can merge that with this next generation of wearable computers.
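[Editor's note: under the hood, wristbands like this classify short windows of surface-EMG (muscle) signals from electrodes around the wrist into gestures or characters. The sketch below is an assumed, simplified decoder; the channel count, window size, and architecture are illustrative, not Meta's published model.]

```python
import torch
import torch.nn as nn

class EMGDecoder(nn.Module):
    """Toy sEMG decoder: classify a short window of multi-channel
    wrist-EMG samples into one of a few gestures or characters."""

    def __init__(self, n_channels=16, n_classes=26):
        super().__init__()
        self.net = nn.Sequential(
            # 1-D convolutions over time pick up muscle-activation bursts.
            nn.Conv1d(n_channels, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the time axis
            nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, emg):  # emg: (batch, channels, samples)
        return self.net(emg)

# Toy inference: one 200 ms window sampled at 2 kHz = 400 samples.
decoder = EMGDecoder()
window = torch.randn(1, 16, 400)
pred = decoder(window).argmax(dim=-1)
print(f"predicted character class: {pred.item()}")
```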
Ejaaz:
[14:12] And it's not just a trend with Meta, right? We're seeing these types of developments from a number of different companies. Originally, we had Elon Musk with Neuralink, which is basically putting a chip inside your brain, so that whatever you think and look at, you can access in a simple, computer-like way. And then this week, we got this announcement that Sam Altman, via OpenAI, is funding a competitor: $250 million into this company called Merge Labs, which is basically building a Neuralink competitor. So again, another chip in the brain. And then I think about how Meta's making this move with their VR glasses, and also with their Ray-Ban glasses, so augmented reality versus VR, plus this wristband and now this brain simulator. And I can't help but think that there is this trend towards, I don't know whether it's
Josh:
[15:07] Better to call it AI healthcare,
Ejaaz:
[15:09] Or whether it's AI bioengineering. I kind of think bioengineering makes more sense, because it's like a means to an end to create this new kind of metaverse-type world, if I dare use that word. But I think it's super cool. And actually, even in OpenAI's announcement last
Ejaaz:
[15:26] week of GPT-5, they spent like 20 minutes talking about healthcare. I've got a clip up here where they interviewed this lady who has gone through a number of different illnesses, and she used ChatGPT to help diagnose her and offer her different paths towards cures and treatments. And that was just from word context, right? But imagine if you had an array of devices or chips that could just read you in real time, the way a computer can: access your memory, see the experiences you've been through, feel the trauma through your neurons, do a full-body scan like you see in these futuristic movies. That would be pretty insane, right? To just have this curated cure come to you. So I think we're headed towards this kind of world. AI has done really well with words, basically, in chatbots, and it's starting to appear in social consumer apps as well. But the next major leap, aside from math and coding, which we've spoken about before on the show, is going to be general science. And I think that's super exciting.
Josh:
[16:29] Yeah, that seems super cool. And there's an important distinction between a lot of these things. So actually, if you wouldn't mind going back to the last post, with Sam Altman announcing the raise: $250 million into Merge Labs, which we're assuming is the Neuralink competitor, the brain-machine interface competition to Neuralink. And what we've seen from the few others who are trying this is actually very different from what we've described earlier today with Meta, the Algonauts competition, and their TRIBE model. So the TRIBE model, the way that it works, is read-only. It will read your brain's responses, it will understand how different sensory inputs affect them, and then it can predict the results:
Josh:
[17:05] How you're going to react to certain things. With Neuralink, it has the write function as well. So not only will it read your thoughts, or anticipate your thoughts, but it will actually allow you to do things with it: it can control parts of your body without your own inputs actually being responsible for causing it. So with Meta, it is anticipatory, but it is not actually reading your brain. With Neuralink, and with this new company from OpenAI and Sam Altman, Merge, it is actually reading directly from your brain and then directly allowing you to create input and output. So you can imagine what we've talked about today with TRIBE as being the very early version of this trend that you're describing, Ejaaz, which is moving more towards human-based compute. And then the next step up is actually reading and writing from the brain. And I totally agree with your take; this is where it's going. The healthcare space in general, the neural space in general, has moved very slowly as
Josh:
[18:06] Of the last couple of decades, there really hasn't been much progress. It hasn't been that good. It feels so dumb that in order to do things like solving cancer, we just radiate our bodies and destroy everything else with it. It's so imprecise, it seems very sloppy, it's not good, and we haven't had progress in this space for a very long time. And that's not the fault of anyone, really; I would imagine these are difficult challenges. But we have this new technology that can enable us to solve these things and to bring these computers closer to our biological bodies.
Josh:
[18:37] And I think, I fully agree with you, this trend is here to stay. This is an important one. It's cool to see people like Sam and OpenAI really stressing how important it is to them. And I mean, every time I hear Sam in an interview, he's talking about how the thing he's most personally excited about is net-new knowledge on the frontier of bioengineering and healthcare and solving human problems. Because I think that's where we really get this huge level-up, or a perceived level-up. We talk about this a lot, where we feel like we're moving very quickly, but the world around us takes these big leaping points to catch up. And hopefully that's going to be one of those things, where one of these days you'll go to a doctor and they'll put a little thing on your head and a little thing in your arm, and it'll be able to detect all these sensory inputs
Josh:
[19:21] and understand what's wrong with you and be able to help you in a much more precise way. So this directionally seems like, yeah, the right trend.
Ejaaz:
[19:27] Yeah, I actually hope we take it a step further and, you know, you don't even need to go to the doctor in the first place. You just have a set of personalized AI health hardware devices, or whatever you want to call them. Maybe it's just a single chip that could read you 24/7, anticipate diseases or illnesses before they happen, and have your prescription mailed to your door, right? I think about the old-school way, 10 years ago, where you have to sit in line in the waiting room at a hospital or a local GP's office, and maybe wait hours, maybe you're delayed, maybe you have to go away and come back another day for a follow-up. Instead, we head towards this kind of real-time AI healthcare. And I don't know about you, Josh, but I am loving all the competition, all the arguments between these billionaires online, because net-net, it results in a great experience for the consumer, I think. And if these guys are in a fierce competition, spending tens to hundreds of billions of dollars every quarter now, not even every year, on big leaps like this, I am all for it.
Josh:
[20:35] That's the amazing thing, right? All of the spending we're talking about, all of these problems that these people are having to solve, all of the drama that is happening on a day-to-day basis: it's all to benefit the end user. It's all to benefit us. And for the low cost, in most cases, of $20 a month, we get access to all of this chaotic, fighting intelligence. The smartest people in the world have been working on this single thing, and we get access to it for $20 a month. And every week it gets a little bit better and a little bit more powerful, and we get a few more use cases. And the real winner is just the armchair experts over here, sitting down, watching it all unfold, and paying a monthly subscription to access it and benefit our lives. And on the point of healthcare coming to you: that makes a lot more sense than actually going to a doctor, right? Because Apple has been trying this, to varying degrees of success, with the Watch, where they can do EKGs now. And Whoop recently did this too: they detect your blood pressure now, and they can detect your temperature. So we've seen this up to an extent, but I think what Meta is working on, what we just showed a little bit earlier with the actual neural sensory impulse thing, is this whole next level of devices. And it's funny to see it coming out of Meta's lab, and maybe they are going to be the hardware company that pushes us there. Because we have OpenAI, and they have their device
Josh:
[21:56] Plus a suite of devices that we're not really sure what they are; they've been very vague about that. And then we have Apple, who has a watch, a phone, a laptop; I mean, they have the Vision Pro, and they're probably going to go with glasses. And then we have Meta, who has the glasses already and has this wrist technology. And
Josh:
[22:11] I think what Meta does well that other labs don't, or other companies don't, is they actually just share the research as they go. So I'm sure Apple probably did something similar to this; I'm sure maybe OpenAI or Google or someone who has big hardware chops has done this. But Meta just shares the research, and they get everyone really excited about it. And it's part of the open-source ethos that they started with. They're like, hey, we're just going to share our work. Tell us what you think. Tell us where we can improve. And they're just kind of doing this thing in public.
Ejaaz:
[22:36] I just realized, Josh, you reminded me: we completely forgot to mention Google, who have been like the main company putting out all the amazing AI medical stuff. They were the first to release an AI model that could predict and generate protein structures, for things like antibodies and cures. Exactly. They've been investing hundreds of millions of dollars into this kind of AI science sector. So, you know, shout-out Google. We're actually interviewing one of the heads of AI at Google, Logan, so we are so excited for that interview; keep an eye out for that. But yeah, you know, I've got nothing else to say except: here's to living till the ripe old age of 150 and being of sound mind and body. Knock on wood. 150.
Josh:
[23:19] I feel like those are rookie numbers. But I guess, how long do you think you'll live? Is 150 just a random number? Do you think it's 150?
Ejaaz:
[23:26] Depends on the quality of life. Like, what are we talking about here? Am I, like, an old man in a rocking chair? Or how long am I in my 30s? How long
Ejaaz:
[23:34] am I in my healthy 30s? I don't know.
Josh:
[23:37] Bryan Johnson's been there for a long time now, and he's patient number one, so maybe
Ejaaz:
[23:41] Maybe he hasn't even been there; he reversed back. He reversed
Josh:
[23:44] Himself back, yeah. So it's possible. It's going to be really exciting. There's a world in which, if you can keep yourself healthy for even the next decade, there is a high probability that you will continue to feel healthy for much longer than you would otherwise. And I guess, can we maybe end on predictions? What company are we most excited about to release AI computer-human interface hardware? Do you think there's a winner that you see early on?
Ejaaz:
[24:14] I don't think there's a clear winner now, because there's no real consumer device or experience that we can see at scale right now. The company that's closest to it, I would say, is Neuralink, Elon Musk's company, which has been trialing these chips in the brain with a number of different patients right now. And it's crazy: people who have had a completely disabled body, or certain parts of their bodies disabled, can now control computers and are tweeting from their X accounts, right? And I'm liking their posts and thinking, wait, you can't move your fingers? That's insane, right? But I don't think there's a clear winner. If I were to go with the dark horse, maybe not-so-dark horse: I don't think it's Meta. I think Meta is going to use all this amazing technology to prime their social media. And that's me being a doomer.
Ejaaz:
[25:04] But I think it's going to be Google. I think it's going to be Google.
Josh:
[25:07] Okay, that's a good bet.
Ejaaz:
[25:08] I'd love to throw shade on Google, because I'm like, ah, they're a big corp, they can't innovate. I remember their first AI model being completely inaccurate, and so it kind of left a bad taste in my mouth. But I think they're killing it with the Gemini model, with all this investment in science. I'd say Google.
Josh:
[25:24] Yeah, okay. That sounds right. Yeah, Neuralink is not even in the same conversation, because they're kind of like the device to end all devices. They are the final frontier of devices. There is this intermediary layer until we get there, where you do have the glasses, you have the wrist strap, you have the things on your body, before it's just literally your brain. So I agree: Neuralink on the brain thing. And Sam Altman, best of luck with your new company; you've got a pretty steep hill to climb to catch up with them. But in terms of that intermediary hardware, before we get the device that ends all devices, I kind of agree with you on Google. I think I have the same take. I want it to be Apple, because they have the best design and, I think, the best ecosystem. It's just not going to be the case.
Josh:
[26:10] Meta has never really delivered a hardware product at scale that I've enjoyed. So it would take a zero-to-one moment for them to start releasing products that are actually great, which they're clearly trying to do. But Google, man, Google's just dialed. You mentioned that they did
Josh:
[26:24] have this problem earlier with the diversity thing, when it was creating bad images and stuff. But Larry and Sergey have come back, they are working on the company, and they have dialed it in. And we have Demis, how do you pronounce his name, Demis? Yes, Demis. He is unbelievable at leading the DeepMind team in terms of building these scientific models. And I think that research, combined with the hardware production capabilities that we see with the Android line and their laptops and everything, I don't know, could be a home run. Maybe we're Team Google for today; we'll see where we stand next week. But Team Google. And I think that's a wrap on today's episode. AI is not reading your mind, but it is guessing very precisely where your mind is going to be, and with a lot of accuracy. So it's a really cool progression in where we're headed, towards this kind of convergence between computers and humans. And it's just another week in the crazy world of AI. So thank you for joining us again on another episode. We hope you enjoyed this one, and yeah, there will be lots more to come. If you enjoyed the episode, don't forget to like, subscribe, and share it with a friend; that helps a lot. Oh, we flipped the OpenAI podcast, officially.
Ejaaz:
[27:32] Way yes on spotify.
Josh:
[27:34] So on the Spotify tech charts, Limitless is now above the OpenAI podcast. To everyone who helped make this happen: thank you, we are very much indebted, we very much appreciate it. That is epic. Keep it coming, keep it coming; we're going to the top. So thank you, everyone, for watching, for sharing, for everything, and we will see you guys in the next
Ejaaz:
[27:50] One. Thank you. Peace.