Paul, Weiss Waking Up With AI

Subliminal Learning in AI

This week on “Paul, Weiss Waking Up With AI,” Anna Gressel explores groundbreaking research on “subliminal learning” in AI, revealing how LLMs can inherit hidden behavioral traits—including harmful biases—from seemingly benign data. 

Episode Transcript

Anna Gressel: Hello, everyone, and welcome back to “Paul, Weiss Waking Up With AI.” I’m Anna Gressel. Unfortunately, we don’t have Katherine here with us today, so it is another solo episode. I miss her too. But we are going to be discussing a really, super interesting paper that just came out on what the researchers are terming “subliminal learning” in AI. And in that paper, researchers demonstrated that AI models might be picking up on hidden traits from each other in a manner that is not at all visible or apparent on the surface to humans. Super important paper. I know Katherine is probably devastated to miss this discussion, it’s right up her alley. But for people who are looking for it, I want to give you the title of the paper. It’s definitely worth checking out. You can also look at the blog post, which is a lot shorter. It’s called “Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data.” I mean, that title says a lot about what this is, but we’re going to dive deeper into it. And it really has some interesting and concerning implications we’ll talk about towards the end of the episode. But I’ll start by giving you an overview of what the research was really about.

But let’s start with an even more foundational point: what do we mean by subliminal learning? I know all the folks in our listener base who watch a lot of sci-fi are going to be like, “obviously I know what subliminal messaging is, I watched all the movies with it.” But just to level set, subliminal learning or subliminal messaging, in pop culture, is a term that refers to the process of receiving information that’s presented essentially below the threshold of human conscious awareness. And that might include stimuli that are perceived, just not consciously perceived. And so they can influence behavior without people knowing. And that is, you know, the premise of a lot of different movies out there, and it appears in a lot of different pop culture references. I mean, maybe not all of us, but I remember a lot of pop culture references about playing records backwards or a cassette backwards and secret messages being embedded in there. And people talk about this sometimes in advertising. So that’s subliminal messaging. And in the human context, it could be brief flashes of text or subtle sounds and words. And this is different from conscious learning, where we as humans know we’re being presented with information and we actively process it. We understand that we’re doing something that’s part of a learning process. So that’s the human side.