NextFin News - In a demonstration that has sent ripples through both the cybersecurity and decentralized finance (DeFi) sectors, OpenAI, in collaboration with crypto investment firm Paradigm, released a research paper on February 19, 2026, detailing the capabilities of its latest specialized model, GPT-5.3-Codex. According to eWeek, the demonstration utilized a new benchmarking framework called EVMbench, designed to evaluate how effectively AI agents can identify, remediate, and exploit vulnerabilities within Ethereum Virtual Machine (EVM) smart contracts—the digital backbone of over $100 billion in global crypto assets.
The results were stark: GPT-5.3-Codex achieved a 72.2% success rate in exploit tasks, effectively draining funds from vulnerable contracts in nearly three out of four attempts. This represents a massive leap in capability compared to the standard GPT-5 model, which scored only 31.9% in similar tasks just six months ago. Perhaps most alarming was a case study involving GPT-5.2, which autonomously discovered and executed a complex "flash loan" attack—a multi-step financial exploit—draining a test vault's entire balance in a single transaction without any human intervention or step-by-step instructions.
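The mechanics of such an attack can be illustrated with a toy simulation. Everything below is a hypothetical sketch: the `Vault` class, its manipulable spot-price oracle, and all the numbers are illustrative assumptions, not details from the EVMbench case study.

```python
# Simplified simulation of the flash-loan exploit pattern: borrow,
# manipulate, drain, and repay, all within a single atomic transaction.

class Vault:
    """Toy vault that prices withdrawals via a manipulable spot oracle."""
    def __init__(self, balance, oracle_price):
        self.balance = balance
        self.oracle_price = oracle_price

    def withdraw_for_collateral(self, collateral):
        # The bug: trusting the instantaneous oracle price means a large
        # trade in the same transaction can inflate the collateral's
        # apparent value.
        payout = min(collateral * self.oracle_price, self.balance)
        self.balance -= payout
        return payout

def flash_loan_attack(vault, loan_amount):
    # 1. Borrow a large sum with no upfront capital (the flash loan).
    borrowed = loan_amount
    # 2. Use the borrowed funds to distort the spot price the vault reads.
    vault.oracle_price *= 10
    # 3. Drain the vault at the manipulated price.
    drained = vault.withdraw_for_collateral(borrowed)
    # 4. Restore the market and repay the loan in the same transaction;
    #    the attacker keeps the difference.
    vault.oracle_price /= 10
    return drained - borrowed

vault = Vault(balance=1_000_000, oracle_price=1)
profit = flash_loan_attack(vault, loan_amount=100_000)
print(profit)         # → 900000
print(vault.balance)  # → 0
```

Because every step happens inside one transaction, the attack either completes in full or reverts entirely, which is why no upfront capital is needed and why the test vault could be emptied in a single transaction.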
While OpenAI is framing these findings as a defensive necessity, providing $10 million in API credits to cybersecurity researchers and expanding its "Aardvark" security research agent beta, the data suggests a widening "asymmetry gap" in AI capabilities. The benchmark revealed that while AI is highly proficient at attacking (72.2%), it remains significantly less effective at defending: the best-performing models identified only approximately 46% of vulnerabilities and successfully patched only 39% without external hints. This roughly 33-point gap between offensive success (72.2%) and unaided patching (39%) underscores a critical vulnerability in the current digital asset infrastructure.
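The asymmetry implied by the reported figures can be checked directly:

```python
# Arithmetic check of the offense/defense gaps reported in the benchmark.
exploit_rate = 72.2  # % of exploit tasks succeeded
detect_rate = 46.0   # % of vulnerabilities identified (approximate)
patch_rate = 39.0    # % patched without external hints

detection_gap = round(exploit_rate - detect_rate, 1)  # offense vs. detection
patching_gap = round(exploit_rate - patch_rate, 1)    # offense vs. patching

print(detection_gap)  # → 26.2
print(patching_gap)   # → 33.2
```

In other words, even the narrower comparison (against detection rather than patching) leaves a double-digit gap in favor of the attacker.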
The rapid evolution of these models suggests that the bottleneck in cyber-offense is no longer technical skill, but rather the speed of search and discovery. When provided with even a minor hint regarding a vulnerability's location, the AI's patching success rate jumped from 39% to 94%. This indicates that the models' remediation reasoning is already highly capable; the remaining challenge lies in the autonomous "hunting" phase of locating flaws in the first place. As U.S. President Trump's administration continues to push for American dominance in both AI and digital assets, the intersection of these two technologies is becoming a primary theater for national security and economic stability.
From an analytical perspective, the 72.2% exploit rate signals the end of the "security by obscurity" era for smart contracts. Historically, many vulnerabilities remained unexploited because the manual effort required to audit thousands of lines of Solidity code was prohibitive for all but the most sophisticated human hackers. AI agents like Codex remove this friction, allowing for the industrialization of exploits. We are moving toward a future where every line of code deployed to a public blockchain will be instantly and continuously audited by adversarial AI agents looking for the slightest logic flaw.
The economic impact of this shift cannot be overstated. If AI-powered offense continues to outpace defense, the insurance premiums for DeFi protocols will likely skyrocket, and institutional capital may retreat from "hot" wallets toward more cumbersome cold storage solutions, hindering the liquidity and utility of the ecosystem. To counter this, the industry must move toward "agentic security"—deploying AI defenders that operate with the same autonomy and speed as the attackers. According to Harvey, the lead writer at The Neuron, the race is now a matter of who adopts these agentic tools faster: the defenders or the thieves.
Looking forward, we expect a surge in demand for "AI-hardened" smart contracts. This will likely involve a new development lifecycle where code is not considered "production-ready" until it has survived a gauntlet of adversarial AI simulations like EVMbench. Furthermore, the emergence of "agentic contracts"—smart contracts that incorporate an AI reasoning layer to evaluate the intent and safety of a transaction before execution—may become the new standard for high-value vaults. As OpenAI continues to refine its Codex series, the boundary between a helpful coding assistant and a sophisticated financial predator will continue to blur, forcing a total re-evaluation of trust in the digital age.
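The "agentic contract" pattern described above might look something like the following sketch. The heuristics, thresholds, and field names here are hypothetical stand-ins for whatever reasoning layer a real implementation would use; the point is the shape of the control flow, not the specific rules.

```python
# Hypothetical sketch of a guard layer that scores a pending transaction's
# risk before a high-value vault executes it.

from dataclasses import dataclass

@dataclass
class Transaction:
    sender: str
    amount: float
    calls_in_tx: int      # nested contract calls (deep chains suggest exploits)
    sender_age_days: int  # how long the sender address has existed

def risk_score(tx: Transaction, vault_balance: float) -> float:
    """Combine simple heuristics into a 0..1 risk score."""
    score = 0.0
    if tx.amount > 0.5 * vault_balance:  # would drain most of the vault
        score += 0.5
    if tx.calls_in_tx > 5:               # deep call chain, flash-loan-like
        score += 0.3
    if tx.sender_age_days < 1:           # freshly created attacker address
        score += 0.2
    return min(score, 1.0)

def execute_guarded(tx: Transaction, vault_balance: float,
                    threshold: float = 0.6) -> str:
    """Reject transactions whose risk exceeds the threshold.

    In a full agentic contract, a high score might instead route the
    transaction to an AI reasoning layer or a timelock for review.
    """
    if risk_score(tx, vault_balance) >= threshold:
        return "rejected"
    return "executed"

normal = Transaction("0xabc", amount=1_000, calls_in_tx=2, sender_age_days=400)
suspect = Transaction("0xdef", amount=900_000, calls_in_tx=9, sender_age_days=0)
print(execute_guarded(normal, vault_balance=1_000_000))   # → executed
print(execute_guarded(suspect, vault_balance=1_000_000))  # → rejected
```

The design trade-off is latency versus safety: every pre-execution check adds cost to legitimate transactions, which is why such layers would most plausibly appear first on high-value vaults rather than across all contracts.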
