Gartner has warned that the increasing volume of data generated by AI threatens the future reliability of large language models (LLMs).

So much so that it predicts 50% of organizations will adopt a zero-trust posture for data governance by 2028, driven by the proliferation of unverified AI-generated data.

According to a 2026 survey of CIOs and technology executives, 84% expect their companies to increase funding for generative AI. As organizations accelerate both their adoption of and investment in AI initiatives, the volume of AI-generated data will continue to grow. This means that future generations of LLMs will increasingly be trained on the outputs of previous models, raising the risk of “model collapse,” where AI tools’ responses may no longer accurately reflect reality.

“Organizations can no longer implicitly trust data or assume it was human generated. As AI-generated data becomes pervasive and indistinguishable from human-created data, a zero-trust posture establishing authentication and verification measures, is essential to safeguard business and financial outcomes,” said Wan Fui Chan, executive vice president at Gartner, in a statement.

Chan also pointed out that “regulatory requirements for verifying ‘AI-free’ data are expected to intensify in certain regions.”

“However, these requirements may differ significantly across geographies, with some jurisdictions seeking to enforce stricter controls on AI-generated content, while others may adopt a more flexible approach,” Chan said in the release.

LLMs are typically trained using data extracted from the web, as well as a variety of other sources, including books, code repositories, and research articles. Some of these sources already contain AI-generated content, and if the current trend continues, almost all of them will eventually be filled with AI-generated data.

“In this evolving regulatory environment,” Chan continued, “all organizations will need the ability to identify and tag AI-generated data. Success will depend on having the right tools and a workforce skilled in information and knowledge management, as well as metadata management solutions that are essential for data cataloging.”

As a result, Gartner says, proactive metadata management practices will become a key differentiator, allowing organizations to analyze, alert on, and automate decision-making across all their data assets.
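
To make the idea of tagging data and treating it with zero trust more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `Origin`, `Record`, `verified_by`, and `trusted_for_training` are hypothetical and do not come from Gartner or any particular product. The assumption is simply that every record carries an origin tag, and that anything untagged is treated as unverified rather than as human-made.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    UNVERIFIED = "unverified"      # zero-trust default: nothing is assumed
    HUMAN = "human"                # tagged as human-created
    AI_GENERATED = "ai_generated"  # tagged as AI output


@dataclass(frozen=True)
class Record:
    text: str
    origin: Origin = Origin.UNVERIFIED  # untagged data is never trusted
    verified_by: Optional[str] = None   # hypothetical: who or what checked the origin


def trusted_for_training(records: List[Record]) -> List[Record]:
    """Keep only records tagged as human-created AND backed by a verifier."""
    return [
        r for r in records
        if r.origin is Origin.HUMAN and r.verified_by is not None
    ]


# Illustrative records; the texts and verifier names are made up.
records = [
    Record("Quarterly report written by the finance team", Origin.HUMAN, "editor-review"),
    Record("Summary produced by a chatbot", Origin.AI_GENERATED, "classifier-v1"),
    Record("Scraped forum post"),  # no tag supplied, so it stays UNVERIFIED
]

trusted = trusted_for_training(records)
print([r.text for r in trusted])
```

In this sketch the design choice that matters is the default: a record with no tag falls into `UNVERIFIED` and is excluded, which mirrors the "can no longer implicitly trust data" stance described above. How origin is actually verified in practice is left open here.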
