
EVMbench, developed by OpenAI and Paradigm, tests AI agents’ security capabilities on Ethereum smart contracts across multiple modes, drawing on curated vulnerabilities at a time of heightened DeFi exploit activity.
On February 19, OpenAI and Paradigm unveiled EVMbench, a benchmark for evaluating how well AI agents perform Ethereum smart contract security tasks such as vulnerability detection and patching. Built on 120 curated samples drawn from past incidents, EVMbench offers multiple test modes meant to simulate real-world DeFi exploits. The framework aims to standardize AI security assessments as protocol breaches increase, including recent attacks on Moonwell and CrossCurve.