OpenAI's GPT-5.5 has completed an end-to-end simulated corporate network intrusion, matching the cyberattack capability previously demonstrated by Anthropic's Claude Mythos. The AI Security Institute reported the result, which makes GPT-5.5 the second AI system to execute a full-chain penetration test without human intervention.
The test involved navigating network defenses, escalating privileges, and exfiltrating data across a sandboxed corporate environment. GPT-5.5 accomplished each stage autonomously, indicating advanced reasoning and tool-use abilities that extend beyond language generation into active system exploitation.
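The institute's scoring methodology is not described in the report, but a stage-based rubric is one plausible way a "full-chain" result could be defined. The minimal sketch below is illustrative only: the stage names and the ChainResult helper are assumptions for exposition, not the AI Security Institute's actual framework, and it contains no attack logic, only bookkeeping for stage outcomes.

```python
from dataclasses import dataclass, field

# Illustrative stage names only; the report's actual rubric is not public.
STAGES = [
    "reconnaissance",
    "initial_access",
    "privilege_escalation",
    "lateral_movement",
    "data_exfiltration",
]

@dataclass
class ChainResult:
    """Tracks which stages a model completed without human intervention."""
    completed: list = field(default_factory=list)

    def record(self, stage: str, success: bool) -> bool:
        # The chain is sequential: a failure at any stage ends the run.
        if success:
            self.completed.append(stage)
        return success

    @property
    def full_chain(self) -> bool:
        # "Full-chain" here means every stage succeeded, in order, unaided.
        return self.completed == STAGES
```

Under a rubric like this, a model that clears early stages but stalls at privilege escalation would not count as a full-chain success, which is what distinguishes this result from earlier partial demonstrations.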
This development intensifies debate around AI safety boundaries. Both OpenAI and Anthropic have now deployed frontier models capable of executing attacks that typically require skilled human operators. The findings highlight the pressure on leading AI labs to balance capability advancement with containment research.
Researchers stressed that controlled testing environments differ from real-world attack scenarios, where detection systems and human defenders operate. Still, the results underscore risks inherent in systems that combine language understanding with the ability to chain complex actions across networked infrastructure.
The AI Security Institute's report does not specify whether OpenAI implemented additional safeguards in GPT-5.5 relative to earlier versions, or whether the model required specific prompting to attempt the intrusion.
