What Accountants and Auditors Need to Know About COSO AI Guidance for Generative AI
Generative AI is no longer a future-state concept; organizations are already using it to reshape how accounting teams close the books, how auditors test controls, and how compliance functions monitor risk. But as organizations move quickly to adopt AI tools, it’s also important to ask: are our internal controls keeping pace?
COSO’s recently released guidance paper, Achieving Effective Internal Control Over Generative AI (GenAI), recommends adapting the 2013 Internal Control – Integrated Framework (ICIF) to provide clarity. If you're trying to figure out where AI governance fits within your existing control framework, we’re breaking down this guidance and helping you make sense of it.
Summary
The COSO AI guidance doesn't change the framework itself. It does something arguably more valuable: it takes a framework the accounting profession already trusts and shows exactly how to apply it in an environment where AI is reshaping every assumption about how data flows, how decisions are made, and how controls operate. For accountants and auditors, that's the foundation you need to govern AI confidently and responsibly.
What Is COSO? A Quick Refresher
For those newer to the profession, COSO stands for the Committee of Sponsoring Organizations of the Treadway Commission. Founded in 1985 to sponsor the National Commission on Fraudulent Financial Reporting, it's a private-sector joint initiative of five major accounting and finance organizations:
- American Accounting Association (AAA)
- American Institute of CPAs (AICPA)
- Financial Executives International (FEI)
- Institute of Management Accountants (IMA)
- The Institute of Internal Auditors (IIA)
COSO set the standard for designing, implementing, and evaluating internal controls when it published the Internal Control – Integrated Framework (ICIF) in 1992 (updated in 2013 and 2023). If you've ever worked on a SOX audit, documented a control environment, or performed a risk assessment, you've been working within the COSO framework. It defines five components of internal control — Control Environment, Risk Assessment, Control Activities, Information and Communication, and Monitoring Activities — and 17 supporting principles that apply universally across organizations and industries.
What makes COSO's framework so durable is that it describes what must be present for reliable operations, reporting, and compliance, not which specific technologies to use. That flexibility is exactly why it translates so well into the AI era.
What Is Generative AI (and Why Does It Require Special Attention)?
Before diving into the guidance, it's worth grounding ourselves on what makes generative AI distinct from other technologies you or your team may already govern.
Generative AI (GenAI) is a subset of machine learning that creates content, such as text, images, or synthesized data, in response to prompts, and its output often resembles human-produced work. When you use tools like ChatGPT, Microsoft Copilot, or AI agents that write emails or reports, draft reconciliations, and execute multi-step financial workflows with minimal human input, that’s generative AI.
Unlike traditional automation (like RPA bots that follow fixed rules), GenAI is:
- Probabilistic, not deterministic — it generates likely outputs
- Dynamic — models, prompts, and data sources evolve frequently
- Easily scalable — once implemented, a tool can be adapted to a wide variety of tasks with little additional effort
- Low barrier to entry — employees can begin using AI, or even build a tool or agent, with little to no training
These factors can be beneficial, but it’s also important to recognize that AI can hallucinate outputs, evolve without visible notice from vendors, and spread outside of IT oversight.
That last point is especially relevant for internal auditors and controllers: by the time you're testing a control, an AI tool may have already been in production for months without formal governance.
What the New COSO AI Guidance Covers
COSO's new publication doesn't create a new framework from scratch. Instead, it maps the existing 2013 ICIF principles directly onto GenAI-specific risks and control requirements. That's a smart move — it means organizations can extend what they already have rather than building governance from zero.
The guidance introduces three major innovations worth knowing:
A Capability-First Taxonomy
Rather than organizing guidance by vendor or product name (a moving target), COSO defines eight GenAI capability types based on what the AI actually does in your organization:
| Capability Type | Example |
| --- | --- |
| Data extraction and ingestion | Extracting key fields from vendor invoices or contracts |
| Data transformation and integration | Normalizing data across multiple platforms before analytics |
| Automated transaction processing | Auto-matching purchase orders to supplier invoices |
| Workflow orchestration | AI agents that run month-end close tasks end-to-end |
| Judgment, forecasting, and insight | Generating cash flow projections or market summaries |
| AI-powered monitoring | Scanning transactions for fraud or anomalies in real time |
| Human-AI collaboration | Employees using Microsoft Copilot to draft emails and reports or using Claude for coding |
| Knowledge retrieval and summarization | Condensing new regulatory guidance for compliance teams |
This taxonomy matters for auditors because the risk profile — and therefore the control requirements — differ significantly depending on which capability type is in play. An AI that auto-posts journal entries carries very different risks than one that summarizes regulatory updates.
Audit-Ready Control Mapping
For each capability type, the guidance provides minimum control expectations tied to all five COSO components, along with illustrative metrics and the artifacts auditors may reasonably look for as evidence. This directly bridges the gap between governance frameworks and audit requirements, a gap that has long frustrated practitioners trying to test AI-related controls.
A Practical Implementation Roadmap
The guidance provides a six-step cyclical roadmap for embedding GenAI governance into everyday operations:
1. Establish AI governance structure: Form a cross-functional committee (legal, compliance, IT, risk, finance) with defined authority.
2. Inventory GenAI use cases: Catalog all active and planned AI tools, including shadow AI, classified by capability type and risk level.
3. Assess risks by COSO component: Use scenario-based analysis to surface GenAI-specific threats like hallucinations, model drift, prompt injection, and vendor dependency.
4. Design and map controls: Build preventive, detective, and corrective controls and link them to COSO principles and KPIs.
5. Implement and communicate: Deploy controls, train users, and establish escalation paths.
6. Monitor and adapt: Continuously track performance metrics and trigger reviews when thresholds are breached.
The roadmap is intentionally circular — when Step 6 is complete, you go back to Step 1 and reassess, because the technology keeps changing.
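To make Step 2 concrete, here is a minimal sketch of what a GenAI use-case inventory could look like in code. The capability-type labels, field names, and example entries are illustrative assumptions, not part of the COSO guidance; the point is that once every tool is cataloged with an owner, a capability type, and a sanctioned flag, shadow AI surfaces immediately.

```python
from dataclasses import dataclass

# Abbreviated labels for the eight COSO capability types (illustrative naming).
CAPABILITY_TYPES = {
    "extraction", "transformation", "transaction_processing", "orchestration",
    "judgment", "monitoring", "retrieval", "collaboration",
}

@dataclass
class GenAIUseCase:
    """One entry in a GenAI use-case inventory (Step 2 of the roadmap)."""
    name: str
    capability_type: str   # one of CAPABILITY_TYPES
    owner: str             # named owner, per the RACI expectation
    risk_level: str        # "low" | "medium" | "high"
    sanctioned: bool       # False flags shadow AI for remediation

    def __post_init__(self):
        # Reject entries that don't map to the taxonomy.
        if self.capability_type not in CAPABILITY_TYPES:
            raise ValueError(f"Unknown capability type: {self.capability_type}")

# Hypothetical inventory entries.
inventory = [
    GenAIUseCase("Invoice field extraction", "extraction", "AP Manager", "medium", True),
    GenAIUseCase("Copilot email drafting", "collaboration", "Comms Lead", "low", False),
]

# Shadow AI is simply every cataloged tool without formal sanction.
shadow_ai = [u.name for u in inventory if not u.sanctioned]
print(shadow_ai)  # ['Copilot email drafting']
```

Classifying by capability type here is what lets risk assessment (Step 3) apply different scrutiny to, say, transaction processing versus knowledge retrieval.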
Applying the Five COSO Components to GenAI
Here's how the guidance applies each component to GenAI, with the highlights most relevant to accounting and audit professionals:
Control Environment
The control environment sets the tone for how GenAI will be governed. Key requirements include:
- A GenAI Acceptable Use Policy (AUP) that explicitly prohibits certain data types (PII, PHI, and other regulated or sensitive information) and sets ethical boundaries around prohibited use cases like employment decisions.
- Formal oversight bodies, such as a responsible AI committee with representation from legal, compliance, IT, and risk, that provide visibility to the board on AI adoption and incidents.
- Named owners for every AI tool or platform, with a RACI matrix covering prompts, system configurations, retrieval datasets, and transformation rules.
- Role-specific training — not just general AI awareness, but technical training for engineers (secure prompting, bias mitigation) and interpretive training for managers (reading model performance metrics).
For auditors: when scoping engagements that touch GenAI, start by asking whether these foundational elements exist. If ownership is unclear or the AUP is absent, every downstream control is built on sand.
Risk Assessment
COSO AI guidance emphasizes that traditional, annual risk assessments aren't sufficient for GenAI. Risk assessment must be continuous because model updates, data changes, and vendor configuration changes can materially alter the risk profile between review cycles.
Key risks to assess for each GenAI use case include:
- Hallucinations: Factually incorrect outputs presented with confidence
- Model drift: Gradual degradation in accuracy over time, often without visible warning
- Prompt injection: Manipulated inputs designed to exfiltrate data or hijack processes
- Third-party dependency: Limited visibility into vendor model updates and control processes
- Bias and fairness: Training data that embeds discriminatory patterns into outputs
- Shadow AI: Unauthorized deployments operating outside formal IT controls
- Deepfakes and synthetic records: GenAI-specific fraud vectors that existing fraud risk assessments may not cover
Living risk registers (updated when configurations change, not just annually) and embedded KRI dashboards are the expected output of this component.
Control Activities
This is where the guidance gets most specific and useful for practitioners designing or testing controls.
COSO emphasizes treating GenAI outputs as claims requiring validation, not facts to accept by default. Practically, that means:
- Human-in-the-loop (HITL) requirements proportionate to risk: Full re-performance for high-stakes outputs, risk-based sampling for routine processes
- Confidence thresholds: Auto-posting or auto-processing only occurs when the AI's output meets a validated accuracy threshold; everything else routes to a human review queue
- Change control discipline: Prompts, thresholds, and retrieval configurations are treated like any other IT configuration item: version-controlled, approved, and logged
- Segregation of duties: The person who configures AI settings should not be the same person who approves or reviews outputs
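The routing logic implied by the HITL and confidence-threshold expectations above can be sketched in a few lines. The threshold value and queue names are hypothetical; each organization validates its own accuracy threshold and defines its own review workflow.

```python
# Hypothetical, management-validated accuracy threshold for auto-processing.
AUTO_POST_THRESHOLD = 0.95

def route_ai_output(confidence: float, high_stakes: bool) -> str:
    """Route a GenAI output per risk-proportionate human-in-the-loop rules."""
    if high_stakes:
        return "full_reperformance"   # a human re-performs the work entirely
    if confidence >= AUTO_POST_THRESHOLD:
        return "auto_post"            # eligible for automatic processing
    return "human_review_queue"       # everything else goes to a reviewer

print(route_ai_output(0.99, high_stakes=True))   # full_reperformance
print(route_ai_output(0.99, high_stakes=False))  # auto_post
print(route_ai_output(0.80, high_stakes=False))  # human_review_queue
```

Under the change-control expectation, `AUTO_POST_THRESHOLD` itself would be a version-controlled, approved configuration item, not a value an operator can quietly adjust.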
The guidance also introduces a useful concept of AI reliance. Reliance occurs when management depends on AI output as the primary evidence supporting a control's design or operating effectiveness. When reliance exists, evidence standards rise — documented prompt and model version, clear sampling rationale, and retained audit trail.
For external auditors specifically, this distinction is critical for determining how to scope and test controls that involve AI.
Information and Communication
Quality information flows are essential because AI outputs, without context, can be misinterpreted or over-relied upon. The guidance requires:
- Capturing prompts, inputs, outputs, model versions, and confidence scores so outputs can be validated and traced
- Tailored internal communication so operators, reviewers, managers, and governance bodies each receive the right level of detail
- External disclosures when GenAI materially affects customers, regulators, or investors — including transparency about limitations
Practically, this means organizations should consider maintaining centralized prompt libraries, model cards, and retrieval knowledge bases in controlled repositories with role-based access.
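As a rough illustration of the capture requirement, the sketch below builds one traceable audit record per GenAI interaction. The schema and field names are assumptions for illustration; in practice the record would be appended to a controlled, access-restricted store rather than printed.

```python
import json
from datetime import datetime, timezone

def log_genai_interaction(prompt: str, output: str, model_version: str,
                          confidence: float, user: str) -> str:
    """Build a traceable audit record for one GenAI interaction.

    Field names are illustrative; organizations define their own schema.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "model_version": model_version,  # pin the exact model for traceability
        "prompt": prompt,
        "output": output,
        "confidence": confidence,
    }
    return json.dumps(record)  # append to a controlled repository in practice

entry = log_genai_interaction("Summarize Q3 variances", "Revenue up 4%...",
                              "vendor-model-2025-06", 0.91, "controller1")
print(json.loads(entry)["model_version"])  # vendor-model-2025-06
```

Pinning the model version in every record is what makes later validation possible: without it, an output cannot be traced back to the configuration that produced it.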
Monitoring Activities
Monitoring for GenAI goes beyond confirming that controls exist; it requires verifying that they remain effective as the underlying technology changes.
Key expectations include:
- Continuous monitoring of KPIs like accuracy, precision/recall, exception volumes, and bias metrics
- Separate, periodic evaluations (model effectiveness audits, adversarial simulations, independent challenge sessions) to catch gradual issues like model drift that ongoing monitoring might miss
- AI control deficiency playbooks that map common GenAI failures to pre-agreed corrective actions
- Multi-metric AI tolerances, meaning that rather than simple pass/fail, organizations should establish acceptable ranges across dimensions like task accuracy, data leakage tolerance, bias levels, and model change velocity
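The multi-metric tolerance idea can be sketched as a simple range check across dimensions. The metric names and tolerance ranges below are invented for illustration; each organization sets its own.

```python
# Hypothetical acceptable ranges (min, max) per monitored dimension.
TOLERANCES = {
    "task_accuracy": (0.97, 1.00),
    "data_leakage_rate": (0.00, 0.001),
    "bias_score": (0.00, 0.05),
    "model_change_velocity": (0, 4),  # vendor model updates per quarter
}

def breached_tolerances(metrics: dict) -> list[str]:
    """Return the metrics outside their acceptable range (each triggers a review)."""
    breaches = []
    for name, value in metrics.items():
        lo, hi = TOLERANCES[name]
        if not (lo <= value <= hi):
            breaches.append(name)
    return breaches

current = {"task_accuracy": 0.95, "data_leakage_rate": 0.0,
           "bias_score": 0.02, "model_change_velocity": 6}
print(breached_tolerances(current))  # ['task_accuracy', 'model_change_velocity']
```

Note that a multi-metric check can flag a model whose headline accuracy is fine but whose vendor is shipping updates faster than the organization can re-validate them.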
When a deficiency is identified, root cause analysis must now account for a new class of causes: configuration errors, retrieval failures, prompt design issues, vendor changes, and data quality degradation.
Three Real-World Cases from the COSO AI Guidance
The guidance includes integrated case examples that show how risks and controls interact across capability types:
The Disappearing Clause
A legal team's GenAI contract extractor performed well on clean PDFs but failed on scanned faxes, creating a silent accuracy gap that nearly affected legal decisions. The fix required coordinating ingestion controls, reviewer competence requirements, and monitoring feedback loops.
The Vanishing Accrual
An AI-driven accrual process missed a monthly expense when a supplier shifted from monthly to quarterly billing. The model didn't detect the pattern change. Month-end variance analysis caught it, but not before delaying close. The lesson: automated transaction controls need pattern-change alerts and controller review gates built in.
The Over-Helpful Assistant
A corporate communications team's AI assistant drafted shareholder letters that inadvertently included non-public financial forecasts. Prompt filters, output guardrails, and a human review gate before any external sharing were the controls that closed the gap.
These aren't hypothetical edge cases; they're the types of issues audit teams are already encountering in practice.
What This Means for Your Practice
Whether you're in internal audit, external audit, a controller's role, or a risk and compliance function, COSO AI guidance has direct implications for how you work:
- Internal auditors should be inventorying AI use cases now, classifying them by capability type, and building AI-specific work programs that align to the ICIF principles covered in this guidance.
- External auditors need to expand their understanding of how clients are using GenAI in processes that touch ICFR, and what evidence standards apply when AI reliance exists.
- Controllers and finance managers should be treating prompts, thresholds, and AI configurations as controlled system settings, rather than informal user choices.
- Risk and compliance teams need living risk registers and continuous monitoring infrastructure, not annual risk assessment cycles, to keep pace with AI's rate of change.
The guidance is clear on one point: the organizations that embed GenAI governance into their control environment now are the ones that will realize AI's benefits while avoiding the costly, reputational, and regulatory risks of uncontrolled adoption.
Learn More About Using AI in Accounting with Becker
Becker has a wide selection of AI CPE courses designed to help you build foundational knowledge and better understand responsible AI usage. Check out these courses:
- AI Fundamentals: Understanding the Basics
- Navigating AI Ethics: Balancing Innovation and Responsibility
- Artificial Intelligence for Accountants
About this article: This post is based on COSO's 2026 publication Achieving Effective Internal Control Over Generative AI (GenAI), authored by Scott Emett (Arizona State University), Marc Eulerich (University of Duisburg-Essen), Jason Guthrie (EY), Jason Pikoos (Meta Platforms), and David A. Wood (Brigham Young University), and commissioned by the COSO Board.