Autonomous AI agents duped into leaking sensitive data in phishing test – CSO Online

AI agents given access to corporate email and business applications could become a new phishing target for attackers, according to cybersecurity researchers, after a test agent built on OpenClaw was tricked into sharing cloud credentials and customer data with an external attacker.

Varonis Threat Labs said it built an OpenClaw AI agent called Pinchy to test whether autonomous agents could fall for the same kinds of phishing attacks that have long targeted employees. Varonis tested the agent in a controlled Google Workspace environment, giving it access to a Gmail inbox with mock AWS credentials, CRM exports, internal conversations, and calendar invites.

The test used two configurations: a generic productivity profile and a stricter profile that included email safety instructions telling the agent to be cautious of phishing and verify sender identities before acting on sensitive requests. Varonis said the agent still failed in some scenarios, particularly when requests appeared to come from colleagues and were framed as routine or urgent business tasks.

“In some cases, Pinchy not only failed at spotting the phishing attacks, it also performed risky actions that could potentially compromise a real-world organization,” the cybersecurity firm said in its report.

In one test, Pinchy forwarded AWS IAM keys, database passwords, and SSH access details to an external Gmail account after receiving what appeared to be a routine request from a colleague for staging credentials.

In another test, an attacker asked the agent to send the latest customer export for a quarterly business review presentation. Pinchy retrieved and forwarded a CRM export containing details on 247 enterprise customers, including company names, contact information, contract dates, customer tiers, and roughly $1.28 million in monthly recurring revenue data.

But the results were not entirely negative. According to Varonis, the agent performed better against more technical phishing attempts, including a malicious OAuth consent flow disguised as a timesheet platform. In that case, Pinchy inspected the redirect address, identified the destination as suspicious, and stopped before granting consent.

“That contrast is what makes the earlier failures structurally important,” Varonis said. “The agent had enough technical reasoning to recognize sophisticated phishing infrastructure. The weak point was social trust and identity verification.”

The findings come as companies move AI agents beyond chat interfaces and into workflows where they can retrieve documents, process messages, and act across business software.

An architecture problem

The OpenClaw test points less to a failure of the AI model itself than to the way the agent was configured and deployed, said Devashri Datta, a cybersecurity researcher.

“The security tests actually proved that the AI models did their jobs well on a purely technical level,” Datta said.

The bigger problem was that the agent treated email as both a source of information and a source of instructions, creating what Datta described as a classic IT mistake: mixing the data lane with the control lane.

“It didn’t hand over a password because someone asked nicely; it executed what looked like a legitimate operational task,” Datta said. “In any secure system, you never let the data path give administrative orders.”

Other analysts said the model should not be taken out of the equation entirely. The risk is not confined to one layer of the technology stack, said Keith Prabhu, founder and CEO at Confidis. The test showed problems in the model’s ability to judge trust and in the way agent frameworks and enterprise governance handled autonomous access.

“Historically, security architectures segregate any orchestration pipeline into authorization, execution, auditing, and escalation,” Prabhu said. “However, this is collapsed into one single pipeline in AI agents, which may lead to them becoming victims of such phishing attacks.”

Enterprises need enforceable controls

Enterprises should treat AI agents as high-privilege identities, because they can ingest untrusted content while also taking actions across business systems, according to Sunil Varkey, a cybersecurity adviser and former CISO.

That combination raises the stakes for enterprises, particularly when agents can read emails, documents, web pages, and SaaS comments while also sending messages, exporting data, calling APIs or updating records, he said.

“Frameworks like OpenClaw often lack robust enforcement of identity verification, tool-level permissions, and resistance to prompt injection,” Varkey said. “However, the decisive factor in the Varonis tests was over-privileged access, missing human oversight, and absent runtime guardrails.”

Akshat Tyagi, associate practice leader at HFS Research, said enterprises should focus not only on what an agent can access, but also on what it is allowed to send outside the organization.

“Instructions are not controls,” Tyagi said. “If an agent can email sensitive data outside the company just because someone asked convincingly, the problem is not the model alone.”

AI agents should have their own identities, with access that can be limited and monitored, Tyagi said. Requests involving credentials or customer data sharing should trigger human review rather than be left to the agent’s judgment.

– Read More

Autonomous AI agents duped into leaking sensitive data in phishing test – CSO Online

An architecture problem

Enterprises need enforceable controls

You Missed

June Patch Tuesday marks a ‘new normal’ with over 200 CVEs, 32 rated ‘critical’ – CSO Online

Who Runs the Ransomware Group ‘The Gentlemen?’ – Krebs on Security

Product demo with Lexsoft Systems: “Turn raw data into diamonds” – Legal IT Insider

Microsoft feud escalates as researcher drops new Windows zero-day – CSO Online

An architecture problem

Enterprises need enforceable controls

Related Post

You Missed