Paul, Weiss Waking Up With AI

Jailbreaking: Safety Issues for AI

Katherine and Anna provide a primer on jailbreaking in the generative AI context, a subject that's top of mind for security researchers and malicious actors alike.

Episode Transcript

Katherine Forrest: Hey, good morning, everyone and welcome to another episode of “Waking Up With AI,” a Paul, Weiss podcast. I'm Katherine Forrest.

Anna Gressel: And I'm Anna Gressel.

Katherine Forrest: And I, Anna, have my moose mug here. And it's too bad that the audience can't see it because it's really a cute moose mug.

Anna Gressel: It is. I only have a hotel coffee cup, so much less exciting.

Katherine Forrest: All right, well, you and I both do a lot of speaking engagements. In fact, you were doing some of that yesterday. And I often get asked, what's the hottest issue in AI right now? So, as we reach different points in the year, I try to put it in terms of, what was the hot issue for 2024? What was a hot issue for 2023? And actually, the second half of 2023 was MLLMs and the beginning of 2024 was AI agents. And now I say to people 2025 is the year of AI safety. It's all about safety. But it's actually sort of like a lot of things starting before that. It's almost like we have to go on a fiscal year calendar and do it in third quarter of 2024. But AI safety, I think, is the big issue.

Anna Gressel: I agree, I think safety is getting a huge amount of attention, and appropriately so. It's a key issue not only for model, tool, and system developers and deployers, but really for everyone in society.

Katherine Forrest: Yeah, and let's do a little bit of scene setting because we're going to be talking today about AI safety and in particular about something called jailbreaking. But LLMs are increasingly capable. They have really impressive capabilities, many of which are discovered as the models are used. Those are called emergent capabilities, and they can do a lot of different things now with very sophisticated math, very sophisticated chemistry. They've just taken a huge sort of step-change leap in coding very recently. So, they're well beyond doing a fourth grader's math problem.

Anna Gressel: Yeah, and I think it's worth noting that very recently, and actually this is yesterday when we recorded it, but of course the episode will be aired in about a week, OpenAI came out with a model it calls o1, not even really with the GPT name anymore, and it's extremely capable.