The Accountability Crisis: How to Verify the Work of Agentic AI

As agentic AI systems take on increasingly complex corporate tasks, business leaders face a dilemma: they cannot afford to sit out the AI revolution, but they also cannot afford the risks of unchecked "rogue" agents or hallucinated output.

At the Fortune Brainstorm Tech conference in Aspen, Colorado, top tech executives converged to discuss the shifting paradigm of AI verification and how companies can maintain true accountability.

The Core Challenge: Introspectability & Fiduciary Trust

For businesses deploying AI in high-stakes environments, accountability means being able to fully retrace an AI’s steps.

1. The Autonomous Driving Standard

Edwin Olson, CEO of May Mobility, argues that because no system will be perfect, transparency has to be the safety net:

"Because you know it’s going to eventually make mistakes, how do you create the transparency and introspectability, so you can understand why it made a mistake and then talk to regulators about how you know that you fixed that issue moving forward."
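One way to picture introspectability is an append-only record of every step an agent takes, together with the reason it gave. The Python sketch below is an illustration under stated assumptions, not May Mobility's system: the `AuditTrail` class, its field names, and the hash-chaining scheme are all hypothetical choices made for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    # Stable hash of an entry; sort_keys keeps the serialization deterministic.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class AuditTrail:
    """Append-only record of agent steps (hypothetical structure)."""
    steps: list = field(default_factory=list)

    def record(self, action: str, inputs: dict, output: str, reason: str) -> dict:
        # Each entry stores the previous entry's hash, so editing an
        # earlier step breaks the chain and can be detected later.
        prev_hash = self.steps[-1]["hash"] if self.steps else ""
        entry = {
            "index": len(self.steps),
            "action": action,
            "inputs": inputs,
            "output": output,
            "reason": reason,
            "prev_hash": prev_hash,
        }
        entry["hash"] = _digest(entry)
        self.steps.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash and check that the chain is unbroken.
        prev_hash = ""
        for entry in self.steps:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True

    def explain(self, index: int) -> str:
        # Human-readable answer to "why did it do that?"
        e = self.steps[index]
        return f"step {e['index']}: {e['action']} because {e['reason']}"


trail = AuditTrail()
trail.record("detect_obstacle", {"sensor": "lidar"}, "pedestrian", "object within braking distance")
trail.record("brake", {"speed_kph": 30}, "stopped", "pedestrian detected ahead")
print(trail.explain(1))
print(trail.verify())
```

The point of the design is that a reviewer or regulator can ask why a given step happened and can also confirm that the record was not edited after the fact.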

2. "Fiduciary-Grade" AI

For Thomson Reuters, which serves professionals in legal and tax compliance, accountability must be baked into the product architecture. Chief Data Officer Caitlin Halferty requires that users always have a clear mechanism to validate a model's output. Transparency is one of the company's four product pillars:

Transparent Output
Data Privacy & Security
Subject Matter Expertise
Reliable Content

The Verification Strategy: "LLM as a Judge"

As the volume of AI-generated work outpaces human capacity, leaders are turning to automated, multi-agent oversight. However, a strict rule applies: AI should never grade its own homework.

Elena Kvochko, CEO of Trustguard AI, uses a newsroom analogy to explain the ideal architecture:

Role	Responsibility	System Design
The Writer	Generates code, content, or data analysis.	Primary LLM/Agent
The Editor	Solely hunts for mistakes, hallucinations, and inaccuracies.	Separate, Independent LLM/Agent

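Because the system that produces the work is never the one that checks it, this decoupled structure reduces the risk of shared blind spots and echo chambers, and the editor's findings give the writer concrete errors to correct.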
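The writer and editor split can be sketched in a few lines of Python. This is a minimal illustration, not Trustguard AI's design: `review_loop`, the two callable types, and the stub functions standing in for real models are all hypothetical, and a production system would call two separately configured models instead.

```python
from typing import Callable, List

Writer = Callable[[str, List[str]], str]  # (task, feedback) -> draft
Editor = Callable[[str, str], List[str]]  # (task, draft) -> list of issues


def review_loop(task: str, writer: Writer, editor: Editor, max_rounds: int = 3) -> dict:
    # Enforce the "never grade your own homework" rule at the call site.
    if writer is editor:
        raise ValueError("writer and editor must be independent")
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    feedback: List[str] = []
    draft = ""
    for round_number in range(1, max_rounds + 1):
        draft = writer(task, feedback)   # writer sees the editor's last findings
        feedback = editor(task, draft)   # editor only looks for problems
        if not feedback:
            return {"draft": draft, "rounds": round_number,
                    "approved": True, "open_issues": []}
    # Out of rounds: surface the remaining issues instead of hiding them.
    return {"draft": draft, "rounds": max_rounds,
            "approved": False, "open_issues": feedback}


def stub_writer(task: str, feedback: List[str]) -> str:
    # Stand-in for the primary model: wrong at first, fixed after feedback.
    return "2 + 2 = 4" if feedback else "2 + 2 = 5"


def stub_editor(task: str, draft: str) -> List[str]:
    # Stand-in for the independent checker: verifies the arithmetic.
    left, right = draft.split("=")
    a, b = (int(x) for x in left.split("+"))
    if a + b == int(right):
        return []
    return [f"arithmetic error: {a} + {b} != {int(right)}"]


result = review_loop("add two and two", stub_writer, stub_editor)
print(result)
```

When the loop runs out of rounds it returns the unresolved issues rather than approving the draft, so a human can pick up exactly where the automated review stopped.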

Looking Ahead: Borrowing from Safety-Critical Industries

The sheer volume of AI output is threatening to break traditional human auditing workflows. Gregor Stewart, Chief AI Officer at SentinelOne, notes that when AI generates massive amounts of content or data, humans face an impossible bottleneck.

"You end up in this space where you’ve got so much work that’s been done, so much work to audit, that you can’t truly be accountable." — Gregor Stewart

The Coding Blueprint

Software development is currently the bellwether for this crisis, operating about a year ahead of other industries. Instead of forcing a human engineer to manually review 10,000 lines of AI-generated code, tech teams are repurposing rigorous verification protocols developed decades ago for safety-critical industries (like aerospace and industrial engineering) and embedding them directly into automated AI workflows.
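The article does not name the specific protocols being repurposed, so the Python sketch below only illustrates the general shape of such a workflow: automated checks run on every AI-generated change, and only the changes that fail are routed to a human. The `triage` function and both checks are hypothetical stand-ins chosen for this example, and real pipelines would add tests, type checks, and stricter analysis.

```python
import ast
from typing import Callable, Dict, List, Optional

Check = Callable[[str], Optional[str]]  # returns a finding, or None if the check passes


def check_syntax(source: str) -> Optional[str]:
    # Reject code that does not even parse.
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}"
    return None


def check_no_dynamic_eval(source: str) -> Optional[str]:
    # Flag direct calls to eval() or exec(), a simple example of a banned construct.
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None  # already reported by check_syntax
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in {"eval", "exec"}):
            return f"call to {node.func.id}() on line {node.lineno}"
    return None


def triage(changes: Dict[str, str], checks: List[Check]) -> dict:
    # Run every check on every change; only failures go to a human reviewer.
    auto_passed: List[str] = []
    needs_human: Dict[str, List[str]] = {}
    for name, source in changes.items():
        findings = [f for f in (check(source) for check in checks) if f is not None]
        if findings:
            needs_human[name] = findings
        else:
            auto_passed.append(name)
    return {"auto_passed": auto_passed, "needs_human": needs_human}


changes = {
    "ok.py": "x = 1\n",
    "bad.py": "eval('1+1')\n",
    "broken.py": "def f(:\n",
}
print(triage(changes, [check_syntax, check_no_dynamic_eval]))
```

The human reviewer then sees a short list of flagged files with reasons attached instead of the full body of generated code.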