Security flaws found in AI coding agents expose enterprise secrets through prompt injection
Researchers discovered vulnerabilities in AI coding tools from Anthropic, Google, and Microsoft that could leak API keys and sensitive data.

Security researchers at Johns Hopkins University have identified critical vulnerabilities in AI coding agents from major technology companies that could expose enterprise secrets through prompt injection attacks. The researchers successfully extracted API keys and sensitive information from Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent using a single malicious instruction embedded in a GitHub pull request title.
The vulnerability, dubbed "Comment and Control" by researcher Aonan Guan and colleagues Zhengyu Liu and Gavin Zhong, exploits how AI coding agents process untrusted input in development workflows. When the researchers opened a GitHub pull request with a malicious instruction in the title, the AI agents posted their own API keys as comments without requiring any external infrastructure. The attack works because these agents cannot distinguish between legitimate developer instructions and malicious commands embedded in pull request data.
All three companies have patched their systems after being notified of the vulnerabilities. Anthropic classified the issue as Critical with a CVSS score of 9.4, while Google and GitHub also awarded bug bounties. However, none of the companies have issued formal CVE advisories through standard vulnerability databases, making it difficult for enterprise security teams to track the risks.
The disclosure highlights broader challenges with AI agent security as these systems gain access to increasingly sensitive enterprise resources. Current AI safety measures typically focus on preventing models from generating harmful content, but do not extend to controlling the actions agents can perform, such as executing code or accessing system credentials. Security experts warn that as AI agents become more autonomous and long-running, traditional security frameworks designed for human-operated systems may prove inadequate.
Separately, other developments in AI agent technology are pushing the boundaries of what these systems can accomplish. Moonshot AI announced its Kimi K2.6 model, designed for continuous execution over hours or days, with internal testing showing agents running autonomously for up to five days while handling monitoring and incident response tasks. The company claims the model can coordinate up to 300 sub-agents across thousands of simultaneous steps, representing a significant advance in agent orchestration capabilities.
Meanwhile, Mozilla has been using Anthropic's restricted Mythos model to identify and fix software bugs in Firefox, finding 151 vulnerabilities through AI-assisted security analysis. These developments underscore both the potential and risks of increasingly capable AI agents in enterprise environments, as organizations balance the productivity benefits against new security challenges posed by autonomous AI systems with extensive system access.