OpenAI said it plans to acquire AI testing startup Promptfoo, a move aimed at strengthening security checks for AI agents as enterprises move toward deploying autonomous systems in business workflows.

Promptfoo’s tools allow developers to test LLM applications against adversarial prompts, including prompt injection and jailbreak attempts, and to evaluate whether models follow safety and reliability guidelines.

In a statement, OpenAI said Promptfoo’s technology will be integrated into OpenAI Frontier, its platform for building and operating AI coworkers.

OpenAI added that the Promptfoo team has built tools used by more than 25% of Fortune 500 companies, including an open-source command line interface and library designed to evaluate and red-team large language model applications. OpenAI plans to continue developing the open-source project while expanding enterprise capabilities within its Frontier platform.

Analysts say the acquisition reflects a broader inflection point in AI agent deployment, with enterprises shifting their focus from raw model capabilities to secure and governed AI systems.

Industry research reflects these concerns. IDC’s 2025 Asia/Pacific Security Study showed that organizations cite AI-enhanced phishing and impersonation attacks such as deepfakes and voice cloning, AI-powered ransomware, and LLM prompt injection or model manipulation among their top concerns.

Additional risks include automated malware creation using AI, AI-driven business logic attacks and disinformation campaigns, as well as model poisoning during training, said Sakshi Grover, senior research manager for IDC Asia Pacific Cybersecurity Services.

“These reflect that enterprises view AI not only as a productivity tool but also as an expanding attack surface,” Grover said. “In this context, the ability to systematically test AI systems for vulnerabilities such as prompt injection, data leakage, and unsafe model behavior becomes essential.”

AI testing becomes baseline

LLMs introduce new types of vulnerabilities that traditional application testing tools were not designed to detect. Companies moving generative AI projects from pilot stages into production are increasingly forced to consider evaluation and red-teaming tools as a core part of their AI development pipelines.

“Red-teaming, governance, and evaluation tools are becoming the new table stakes,” said Neil Shah, VP for research at Counterpoint Research. “Security must be multi-layered, integrated first at the development stage to simulate vulnerabilities, and second during real-time monitoring and prompt execution.”

Many organizations are now adopting testing practices for AI that mirror traditional application security processes, according to Keith Prabhu, founder and CEO of Confidis.

“This ‘shift-left’ approach is used extensively today for application security testing,” Prabhu said. “This tried and tested approach has helped improve the security of the final output. It is logical that AI models and tools will also follow a similar ‘shift-left’ approach to testing.”

System integrators and managed security service providers are also increasingly incorporating AI testing tools into their service offerings, particularly as organizations begin deploying AI-assisted security operations centers.

“In autonomous SOC environments, where AI systems may triage alerts, generate responses, or trigger playbooks, continuous evaluation of model behavior is essential to prevent misuse or operational disruption,” Grover said. “Enterprises are increasingly embedding AI evaluation platforms into DevSecOps workflows so that models, prompts, and agent behaviors can be tested continuously before and after deployment.”

Read More