OpenAI's GPT-5.5 Matches Claude Mythos in Cyberattack Capabilities: AI Security Institute

📄Full Article· Automatically extracted by trafilatura4185 words

In brief - GPT-5.5 can autonomously execute sophisticated cyberattacks, completing a 32-step corporate network simulation and cracking a 12-hour security puzzle in just 10 minutes. - Offensive AI cyber capability is rapidly improving across developers, with AISI warning further advances could arrive in quick succession. - Researchers found a jailbreak that bypassed GPT-5.5's safety guardrails entirely, raising alarms. A U.K. government agency has found that OpenAI's newest artificial intelligence model can autonomously carry out complex cyberattacks—and that it cracked a reverse-engineering challenge in just over 10 minutes that took a human security expert roughly 12 hours. The AI Security Institute (AISI), a research body within Britain's Department of Science, Innovation and Technology, published findings Thursday showing that GPT-5.5 is among the strongest models it has evaluated for offensive cyber capabilities, putting it roughly on par with Anthropic’s vaunted Claude Mythos. The report found GPT-5.5 is the second model to complete AISI's most demanding test—a 32-step simulated corporate network attack called "The Last Ones"—doing so autonomously in two out of 10 attempts. The first model to achieve the milestone was Anthropic's Claude Mythos Preview, which completed the simulation in three of 10 tries. The corporate network simulation, built with the cybersecurity firm SpecterOps, requires an agent to chain together reconnaissance, credential theft, lateral movement across multiple Active Directory forests, a supply-chain pivot through a CI/CD pipeline, and ultimately the exfiltration of a protected internal database—steps that AISI estimates would take a human expert around 20 hours. Perhaps the most striking result involved a fiendishly difficult reverse-engineering puzzle. GPT-5.5 solved the challenge—which required reconstructing a custom virtual machine's instruction set, writing a disassembler from scratch, and recovering a cryptographic password through constraint solving—in 10 minutes and 22 seconds, at a cost of $1.73 in API usage. A human expert, using professional tools, required approximately 12 hours. On AISI's battery of advanced cybersecurity tasks, GPT-5.5 achieved an average pass rate of 71.4% on the most difficult "Expert" tier, edging out Mythos Preview at 68.6% percent and significantly surpassing GPT-5.4 at 52.4%. The findings carry pointed implications for the broader trajectory of AI development. AISI concluded that GPT-5.5's performance suggests rapid improvement in cyber capabilities may be part of a general trend rather than an isolated breakthrough—and warned that if offensive cyber skill is emerging as a byproduct of wider improvements in reasoning, coding, and autonomous task completion, then further advances could arrive in quick succession. The report also flagged significant concerns about the model's safety guardrails. Researchers identified a universal jailbreak that elicited harmful content across all malicious cyber queries tested, including in multi-turn agentic settings. The attack took six hours of expert red-teaming to develop. OpenAI subsequently updated its safeguard stack, though a configuration issue prevented AISI from verifying whether the final version was effective. AISI cautioned that its capability evaluations were conducted in a controlled research environment and do not necessarily reflect what is accessible to an ordinary user, noting that public deployments include additional safeguards and access controls. The report lands against a worrying backdrop for British cybersecurity. The U.K. government's annual Cyber Security Breaches Survey, also published Thursday, found that 43% of businesses suffered a cyber breach or attack in the past 12 months. In response, the government announced £90 million in new funding to boost cyber resilience, and said it is moving forward with the Cyber Security and Resilience Bill to protect essential services. Officials also published guidance urging organizations to prepare for a potential surge in newly discovered software vulnerabilities as AI accelerates the pace at which security flaws can be found and weaponized.

Data Status✓ Full text extractedRead Original (Decrypt)

🔍Historical Similar Events· Keyword + Asset Matching6 items

2026-04-23

OpenAI Launches GPT-5.5 to Challenge Anthropic’s Claude Opus 4.7

Similarity 150%關鍵字 gpt/claude/openai

2026-04-23

OpenAI CEO Sam Altman slams Anthropic: Fear-mongering Claude Mythos is just to monopolize AI

Similarity 150%關鍵字 mythos/claude/openai

2026-04-30

OpenAI Rolls Out Advanced Account Security for ChatGPT Users

2026-04-29

The Protocol: Mythos forces crypto industry to rethink security practices