Penetration tests of AI-based systems are revealing a greater percentage of high-risk flaws than those discovered in legacy systems.
Security consultancy Cobalt’s annual State of Pentesting Report reveals that 32% of all AI and large language model (LLM) findings are rated as high risk — nearly 2.5 times the rate (13%) of severe flaws found in enterprise security tests more generally.
LLM vulnerabilities also have the lowest resolution rate of all app types pen-tested, with just 38% of high-risk issues fixed, according to data collected during pen tests conducted by Cobalt.
Furthermore, one in five organizations surveyed by Cobalt reported experiencing an LLM security incident in the past year, with a further 18% “unsure” and 19% preferring not to answer.
Third-party security experts quizzed by CSO say Cobalt’s findings align with what they’ve seen on the ground.
“AI systems are being rolled out quickly, but often without the same mature security controls, testing discipline, and governance applied to conventional enterprise software,” says Benny Lakunishok, CEO and co-founder of Zero Networks. “That naturally increases the share of serious findings.”
William Wright, CEO of penetration testing firm Closed Door Security, argues that the main issue comes from vibe coders writing systems.
“AI only does what it’s told, for the most part, and systems that get deployed are usually cobbled together by people without the technical knowledge,” Wright adds. “The same people then are expected to fix the issue, so it’s a vicious circle.”
David Girvin, AI security researcher at Sumo Logic, agrees.
“LLM-driven systems are showing a higher percentage of high-risk findings because we’ve essentially taken a probabilistic engine, plugged it directly into business workflows, and hoped it behaves,” he says. “That’s not a security strategy.”
Emerging attack surfaces, larger blast radius
The top concern is prompt injection, now ranked by OWASP as the No. 1 risk for LLM applications, with reports on bug bounty platform HackerOne surging more than six-fold (540%) year over year.
“While the headline issue is prompt injection, the broader concern here is whether attackers can use the model as an entry point to bypass guardrails, leak data, manipulate decisions, or trigger unintended behavior across integrated workflows,” says Taegh Sokhey, staff project manager for AI security at HackerOne.
Experts say there are several main reasons AI systems tend to generate a higher percentage of high-risk vulnerabilities:
- AI systems introduce newer attack surfaces many organizations are still learning to defend. These risk vectors include prompt injection, insecure plug-ins, data leakage, model supply-chain risk, unsafe agent behavior, excessive permissions, and over-trusted integrations with internal systems.
- The blast radius for AI system flaws can be much larger when something goes wrong. Many LLM deployments are connected to internal knowledge bases, workflows, code repositories, customer data, or privileged tools. That means a single weakness can expose multiple systems.
- AI system vulnerability remediation ownership is often fragmented. “AI initiatives typically span engineering, security, legal, procurement, and business teams,” according to Zero Networks’ Lakunishok. “That slows fixes and helps explain why remediation rates are lower than for traditional applications.”
No remediation playbook
Adrian Furtuna, founder and CEO at Pentest-Tools.com, underscores that Cobalt’s finding of low remediation rates for LLMs and AIs is more telling than the high-risk rate.
“A 38% fix rate for high-risk LLM findings is low even by the standards of application security, where remediation has always lagged discovery,” Furtuna says. “What that gap reflects is that development teams don’t yet have established patterns for fixing LLM vulnerabilities the way they do for, say, SQL injection or XXE [XML External Entity injection].”
When a developer sees a traditional system injection issue, they know the remediation playbook, but there is no established procedure for resolving flaws in AI-based systems.
“When they see a prompt injection chain or an insecure tool call boundary, they often don’t [have a playbook], and that uncertainty stalls action even when the severity rating is clear,” Furtuna notes.
Architecture and maturity factors also play a role in AI systems throwing up a greater percentage of high-risk vulnerabilities. Moreover, LLM integrations concentrate trust in ways that traditional application components avoid. As a result, the attack surface broadens, and trust boundaries are often implicit rather than explicitly enforced, magnifying the impact of any flaws, Furtuna says.
“A model that has access to internal tools, retrieval pipelines, and external APIs represents a large-radius blast zone if its input handling is weak,” he adds. “Prompt injection in that context isn’t a nuisance — it’s a path to data exfiltration, privilege escalation, or supply chain manipulation, depending on what the model can reach.”
Secure development practices for LLM integrations are still forming, an immaturity or knowledge gap that shows up directly in pen test findings.
“The OWASP LLM Top 10 is relatively recent,” Furtuna explains. “Most developers building on top of foundation models are doing so without the equivalent of decades of institutional knowledge about input validation, output handling, and authorization boundary design that exists for web applications.”
LLMs collapse trust boundaries — lacking the predictable input/output flows of regular legacy apps — a problem compounded by the wide-ranging permissions routinely granted to AI systems.
“Most organizations try to secure agents and LLM systems at the identity layer, give the model a role and hope guardrails hold,” says Sumo Logic’s Girvin. “But if an attacker can steer the model — prompt injection, social engineering, etc. — they inherit its permissions. That’s why the impact spikes.”
HackerOne’s Sokhey adds: “AI applications are producing a disproportionate number of high-risk issues because they create an entirely new layer of attack surface, one that is non-deterministic, rapidly changing, and often connected to sensitive data, internal systems, and autonomous actions.”
Countermeasures
Experts advise CISOs to stop skipping security hardening in a rush to implement AI and instead treat AI systems as production systems rather than experiments.
“That means threat modeling before deployment, red teaming and adversarial testing throughout the lifecycle, least-privilege access for models and agents, strong identity controls, segmentation around sensitive data, continuous monitoring, and rapid containment mechanisms when abnormal behaviour is detected,” says Zero Networks’ Lakunishok.
Pentest-Tools.com’s Furtuna argues that established best practices can be applied to the new architecture of LLMs provided they are deliberately designed into the systems from the get-go rather than bolted on as an afterthought.
“Strict tool call schemas, explicit output validation before downstream actions execute, human approval gates on high-consequence operations, and minimal privilege for model-accessible integrations all limit what a successfully exploited prompt injection can actually reach,” Furtuna says.