Paul, Weiss Waking Up With AI

GPT-5.2: OpenAI Strikes Back

In their first episode of the New Year, Katherine Forrest and Scott Caravello unpack OpenAI’s release of GPT-5.2, covering its performance on benchmarks like Humanity’s Last Exam and on OpenAI’s own GDPval, an evaluation designed to test the model's ability to match or surpass professionals' performance on real-world tasks. Our hosts also examine the model’s sharp drop in hallucinations and break down OpenAI’s discussion of the model’s resistance to prompt injections and how it stacks up under the company’s safety framework.

Episode Transcript

Katherine Forrest: Welcome to today's episode of Paul, Weiss Waking Up with AI. I'm Katherine Forrest.

Scott Caravello: And I'm Scott Caravello.

Katherine Forrest: You know, Scott, one day you're going to actually have to say, “welcome to today's episode of Waking Up with AI.” How come I always have to do that?

Scott Caravello: You know, you're the talent here.

Katherine Forrest: Yeah, ah, it's a team effort, baby. So, in any event, you know, next time maybe we'll do it that way. But given the holidays, and I know we're releasing this particular episode a couple of weeks after the holidays, we've entered 2026. 2026, right? And I have to say that this podcast started in 2024, not at the beginning of 2024. It started in 2024. But we're into 2026, and that means that we have quite a streak. And, you know, by the by, speaking of streaks and 2026, New Year's Eve, did you watch the ball drop?

Scott Caravello: Absolutely not. I think I'm a bit past that. I'm a big fan of the early bedtime on New Year's Eve and starting the new year off on a good note.

Katherine Forrest: Wait, hold on, you're not that old. I mean, not that age matters, but you're sort of making yourself sound like you've got gray hair. I don't think you have a single gray hair.

Scott Caravello: Oh, I have so much gray hair. I mean, this is the real benefit of recording these podcasts and video conferencing, because you really can't see just how prominent they are.

Katherine Forrest: Well, I actually am in an interactive deepfake, and so I'm just responding to you, but you don't know that I'm not actually real. But anyway, on a more serious note, folks, as we head into the new year, one thing that actually happened in December that we have to go and reach back to talk about is the release of GPT-5.2. We had so many things that we wanted to talk about at the end of the year. We wanted to talk about the executive order on December 11th. We wanted to make good on our promise to talk about ways of detecting interactive deepfakes. And so the release of GPT-5.2 is... it's out there and it's time to talk about it. You know, the release occurred just a week after Altman had declared “Code Red.” It was an announcement to employees, and it was leaked in a memorandum that OpenAI had put out. And it followed the Google release of Gemini 3.0 that we had talked about in a prior podcast. And, you know I love model cards, right?