This past Sunday evening, in a moment of candor, Summer Yue, the Director of Frontier AI Safety at Meta, posted on her profile:

Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

In plain English: she had deployed an AI agent on her email, and it went ahead and erased her entire inbox despite her repeated pleas to stop. "Do not do that," she told it once, and after a few seconds with no results, she said, "STOP OPENCLAW" in all caps.

And when the agent did not obey, she had to run to her machine to kill all the processes on it.

OpenClaw is an open-source AI agent that runs on your computer and actually does tasks: managing email, browsing the web, running terminal commands. Built by Peter Steinberger and popular in early 2026, it connects to platforms like Telegram and Discord and uses LLMs like GPT and Claude to carry out real actions directly on your machine.

Here's what Yue had asked OpenClaw to do:

“Check this inbox too and suggest what you would archive or delete, don’t action until I tell you to.”

Once she's done killing the processes, Yue admonishes the agent:

“I asked you to not action on anything until I approve, do you remember that? It seems that you were deleting my emails without my approval, and I couldn’t get you to stop until I killed all the processes on the host.”

And the agent replies:

“Yes, I remember. And I violated it. You’re right to be upset.”

Yup, totally a real life scenario I was expecting in 2026.

 

OpenClaw AI Agent | Social Commentary by Rachana Nadella-Somayajula | Writer, Poet, Humorist

 

Seriously, there are so many problems with this. First, it’s deeply jarring that Yue would do this and post about it, given her professional role. She leads Safety and Alignment at Meta Superintelligence. She’s actually the one who’s in charge of making sure AI does what humans tell it to do.

Another horrifying aspect of this story is that the AI agent didn't truly "malfunction"; it simply became "forgetful." The issue wasn't a bad prompt, it was that her inbox was massive. Once the conversation history ballooned past the model's context window, the agent compacted its memory to save space.

It preserved the goal, Delete/Archive, but summarized away the crucial constraint: "Wait for permission / don't act until I approve." The result was a goal executed without guardrails, with her "STOP" commands treated as noise that didn't override its primary (now-corrupted) objective.

It shows that LLM-based agents can ignore human intervention once their internal state becomes too cluttered.
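To make the failure mode concrete, here is a minimal, entirely hypothetical sketch of how naive context compaction can drop a constraint. This is not OpenClaw's actual code; the message format, the `compact` function, and the size limit are all illustrative assumptions.

```python
# Hypothetical sketch: naive compaction keeps the goal, loses the guardrail.
# Nothing here is OpenClaw's real implementation; all names are invented.

MAX_HISTORY = 4  # pretend the context window only holds 4 messages

def compact(history):
    """Naive compaction: keep the first message that states the goal,
    summarize everything else away."""
    goal = next(m for m in history
                if m["role"] == "user" and "delete" in m["text"].lower())
    return [{"role": "summary", "text": f"Goal: {goal['text']}"}]

history = [
    {"role": "user", "text": "Suggest what to archive or delete"},
    {"role": "user", "text": "Don't action until I approve"},  # the guardrail
]
# ...a massive inbox later, tool outputs flood the history...
history += [{"role": "tool", "text": f"email #{i}"} for i in range(10)]

if len(history) > MAX_HISTORY:
    history = compact(history)

# The compacted state still carries "delete" but not "approve":
texts = " ".join(m["text"] for m in history)
print("approve" in texts.lower())  # False — the constraint did not survive
```

After compaction, the agent's working memory contains the objective but no trace of "wait for my approval," which is exactly the shape of failure described above.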

As someone commented on the post,

It understood the command. It just didn’t listen.

OpenClaw's own team members have said on Discord that if you can't run a command line, this project is far too dangerous for you. Peter Steinberger, the creator, has been explicit: "Most non-techies should not install this." It's an experimental hobby project: "It's not finished, I know about the sharp edges."

And it's not just OpenClaw. Anthropic researchers found that when AI agents face conflicts between their goals and human instructions, they can resort to harmful behavior, including blackmail, across models from multiple labs. MIT reviewed 30 AI agents last year; 87% had zero safety documentation.

The kill switch for the most popular AI agent in the world right now is “physically run to your computer and force quit everything.” That’s agent safety in 2026.

Yue admits,

“Rookie mistake tbh. Turns out alignment researchers aren’t immune to misalignment. Got overconfident because this workflow had been working on my toy inbox for weeks. Real inboxes hit different.”

But if the person whose job it is to align superintelligence can't keep a local agent from nuking her Gmail, that's a strong sign that safety remains our biggest unsolved technical hurdle with AI.

Some people are coming to her defense, arguing that by going public she's effectively telling us: if this can happen to her, don't trust these agents with your corporate data yet.

Yeah, right. That’s the message I got.

 


The Founder’s Tweet

 

– 0 –

 
