AI GOVERNANCE

Deloitte's AI Blunder: A Wake-Up Call for Enterprise Governance

A recent incident involving Deloitte Australia highlights the critical need for robust AI governance as enterprises rapidly integrate artificial intelligence.

5 min read · 1,163 words · Oct 8, 2025
Summary

Deloitte Australia faced scrutiny after AI-generated inaccuracies appeared in a government report, leading to a partial contract refund. This incident underscores a broader challenge for organizations as they accelerate AI adoption without fully mature governance frameworks. Experts emphasize the importance of human oversight, transparent AI usage, and modernizing vendor contracts to explicitly address AI involvement and accountability. The case serves as a critical reminder that comprehensive governance is essential to mitigate risks associated with generative AI and ensure reliability in high-stakes environments.

The evolving landscape of AI adoption requires stringent governance. Credit: Shutterstock

A recent incident involving Deloitte Australia has highlighted significant challenges in managing artificial intelligence within large organizations. The consulting firm was compelled to issue a partial refund on an AU$440,000 (US$290,000) government contract after a report it produced for Australia’s Department of Employment and Workplace Relations (DEWR) contained AI-generated falsehoods. This situation exposes critical vulnerabilities that analysts suggest are increasingly common as enterprises quickly integrate AI technologies.

Deloitte utilized OpenAI’s GPT-4o to assist in compiling a 237-page independent review. However, the firm failed to identify fabricated academic citations and non-existent court references before delivering the final document. The use of AI was also not disclosed until after the inaccuracies were brought to light.

The Australian government confirmed that Deloitte acknowledged the incorrect footnotes and references, agreeing to repay the final installment of its contract. A DEWR spokesperson stated that the exact refund amount would be made public once the contract notice on AusTender is updated. The department noted that a corrected version of the report and statement of assurance has since been released.

Charlie Dai, a vice president and principal analyst at Forrester, described the situation as “symptomatic of broader challenges.” He noted that “rapid adoption often outpaces controls and makes similar incidents likely across regulated and high-stakes domains.” Despite the errors, the DEWR spokesperson affirmed that “the substance of the independent review is retained, and there are no changes to the recommendations.”

The Unveiling of AI Fabrications

The discovery of the fabricated content came from an unexpected source: Dr. Christopher Rudge, a University of Sydney researcher specializing in health and welfare law. While reviewing the report, Rudge immediately recognized that some of the cited authors were colleagues he knew personally and that they had not written the works attributed to them, which made the AI-generated errors quick to detect.

Dr. Rudge explained that the cited works seemed “almost too perfectly tailored and too bespoke for the text,” which served as an initial warning sign. His domain expertise proved invaluable in catching inaccuracies that automated checks might have missed. This incident underscores the importance of human oversight, particularly from subject matter experts, even in AI-assisted processes.
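
Automated screening can complement, though not replace, that kind of expert review. As a purely illustrative sketch (not part of Deloitte’s or DEWR’s process), a first-pass check might query a public bibliographic registry such as Crossref and flag references with weak matches for human follow-up; the relevance threshold below is an assumed placeholder, and the sample reference is hypothetical.

```python
# Illustrative first-pass citation screen against the public Crossref registry.
# Assumptions: references are pasted as free text, and the 60-point relevance
# threshold is an arbitrary placeholder chosen for demonstration.
import requests

CROSSREF_API = "https://api.crossref.org/works"

def screen_citation(reference_text: str, min_score: float = 60.0) -> dict:
    """Return the closest Crossref match for a reference, flagging weak matches."""
    resp = requests.get(
        CROSSREF_API,
        params={"query.bibliographic": reference_text, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return {"reference": reference_text, "needs_review": True, "reason": "no match found"}
    top = items[0]
    return {
        "reference": reference_text,
        "best_match_title": (top.get("title") or ["<untitled>"])[0],
        "doi": top.get("DOI"),
        "relevance_score": top.get("score"),
        # A weak match only flags the citation for expert review; it does not prove fabrication.
        "needs_review": (top.get("score") or 0.0) < min_score,
    }

if __name__ == "__main__":
    # Hypothetical reference string, used here for demonstration only.
    suspect = "J. Citizen (2022), 'Automated decision-making and welfare law', Sydney Law Review"
    print(screen_citation(suspect))
```

In practice, a low-scoring match is simply a cue for a subject-matter expert to check the source, which is exactly the kind of human vetting Rudge describes.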

Sam Higgins, another vice president and principal analyst at Forrester, echoed these concerns, calling the event “a timely reminder that the enterprise adoption of generative AI is outpacing the maturity of governance frameworks designed to manage its risks.” He added that “the presence of fabricated citations and misquoted legal judgments raises serious questions about diligence, transparency, and accountability in consultant-delivered work.” This highlights a growing gap between technological capability and established quality control mechanisms.

The revised report, published recently, now includes a disclosure that was absent in the original. This disclosure acknowledges Deloitte’s use of an AI tool to address “traceability and documentation gaps.” However, this post-hoc disclosure has drawn criticism. Forrester’s Dai noted that firms often categorize AI use as “internal tooling,” which can inadvertently create gaps in transparency and trust with clients. Higgins further argued that Deloitte’s delayed disclosure sets “a poor precedent for responsible AI use in government engagements,” especially considering existing government guidance on AI transparency in decision-making.

Reimagining Accountability and Contracts for the AI Era

The Deloitte case brings to the forefront a crucial discussion about shared responsibility for quality control and the need for modernizing vendor contracts in an AI-driven landscape. Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research, suggested that both the vendor and the client share accountability. He questioned why checks were not performed before the report went public, arguing that “accountability works both ways.” This perspective emphasizes that clients cannot simply delegate all responsibility for AI-generated content.

The disclosure failures and breakdowns in quality control evident in this incident highlight fundamental deficiencies in how organizations currently structure contracts with vendors employing AI tools. Dr. Rudge advocates for institutionalizing subject-matter expert review as a mandatory final quality assurance step for AI-assisted projects. Despite the promise of significant cost and time savings that AI offers, he believes that human vetting remains the gold standard.

Rudge suggested that integrating expert review could still be economically viable: even with the added cost of a subject-matter expert proofreading the output, overall production costs for knowledge-intensive work could remain substantially lower than for fully human-authored deliverables. This balance allows organizations to harness AI’s efficiency while preserving the integrity and accuracy of deliverables, with professionals continuing to serve as the arbiters of truth and quality.

Gogia pointed out that many existing agreements operate under the assumption of purely human authorship, despite automation underpinning much contemporary work. This discrepancy leads to confusion when issues arise, as both parties then “scramble to decide who carries the blame — the consultant, the model, or the client reviewer who signed it off.” The evolving nature of work necessitates a re-evaluation of contractual terms to clearly define roles and responsibilities when AI is involved.

Tech leaders are now advised to proactively inquire about AI involvement in projects. Dai recommends seeking explicit details on validation steps, error-handling processes, human review protocols, source verification, and accountability for factual accuracy before accepting any deliverables. Higgins added further questions for consideration, such as identifying the specific generative AI tools used, their application within the deliverable, safeguards against hallucinations, human-in-the-loop validation processes, and methods for tracking content provenance. These questions are designed to foster greater transparency and accountability in AI-driven projects.
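
One way a procurement or technology team might operationalize those questions is to encode them as a simple acceptance checklist attached to deliverable sign-off. The sketch below is an assumed structure for illustration only, not a published standard or any firm’s actual template.

```python
# Illustrative AI-deliverable acceptance checklist; field names paraphrase the
# analysts' questions and are an assumption, not a formal standard.
from dataclasses import dataclass, fields

@dataclass
class AIDeliverableDisclosure:
    ai_tools_used: str             # e.g. model name and version
    where_ai_was_applied: str      # sections or tasks the tool contributed to
    hallucination_safeguards: str  # controls against fabricated content
    human_review_protocol: str     # who reviewed what, and when
    source_verification: str       # how citations and quotes were checked
    provenance_tracking: str       # how AI-generated content is traced
    accountability_owner: str      # who signs off on factual accuracy

def unanswered_items(disclosure: AIDeliverableDisclosure) -> list[str]:
    """List checklist fields the vendor has left blank, blocking acceptance."""
    return [f.name for f in fields(disclosure) if not getattr(disclosure, f.name).strip()]
```

A blank field then becomes a concrete reason to withhold acceptance until the vendor answers, rather than a vague concern raised after delivery.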

Establishing Comprehensive Governance Frameworks for AI

Beyond vendor management and contractual obligations, analysts emphasize the critical need for organizations to develop and implement comprehensive governance frameworks for artificial intelligence. AI should be viewed not merely as a productivity tool but as a systemic risk requiring formal policies and cross-functional oversight. This shift in perspective is crucial for mitigating potential reputational and compliance issues.

Dai advises CIOs and procurement teams to incorporate specific clauses into contracts. These clauses should mandate AI disclosure, establish clear quality assurance standards, define liability for AI-related errors, and grant audit rights. Furthermore, organizations should strive for alignment with established frameworks like the NIST AI Risk Management Framework or ISO/IEC 42001 for robust risk management. Such frameworks provide a structured approach to identifying, assessing, and mitigating AI risks.

Higgins also advocates for provisions that require upfront disclosure of AI usage, mandate human review at critical junctures, clearly define liability for any AI-induced errors, and include audit rights to ensure compliance. These measures collectively build a more resilient and responsible approach to AI integration. Treating AI as a systemic risk necessitates a proactive and structured governance strategy.

Gogia envisions an emerging model where joint review boards, composed of both client and vendor representatives, collaboratively examine AI-produced content before its official endorsement. He suggests that “that is what maturity looks like — not the absence of AI, but the presence of evidence.” This collaborative governance approach fosters transparency and ensures that AI-generated content meets predefined quality standards. Ultimately, governance in the AI age will thrive on cooperation and shared understanding, rather than confrontation over responsibility.