When AI Stops Assisting and Starts Operating - Issue #6
There is a shift happening in software development that most professional service firms have not yet fully absorbed.
In its most advanced form, AI is no longer just helping write code. It is running the entire cycle. A system takes a problem, generates a solution, tests that solution against defined evaluation criteria, corrects itself, and produces a working result. No human writes the code. No human reviews the code.
The human defines what success looks like. The system does the rest.
Dan Shapiro defines this model through “levels” of AI coding (1), where the highest level is a fully autonomous environment. Some refer to it as a “dark factory.” Not because it is hidden, but because it can run without human intervention.
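The cycle described above can be sketched in a few lines. This is a purely illustrative skeleton, not any real system: the `generate` and `evaluate` functions, the problem and criteria shapes, and the round limit are all invented placeholders for whatever model and test harness a given environment actually uses.

```python
# Illustrative sketch of an autonomous generate-evaluate-correct loop.
# Every function and data shape here is a hypothetical stand-in.

def generate(problem, feedback):
    # Stand-in "model": a real system would produce a candidate solution,
    # adjusted by any feedback from the previous round.
    return problem["target"] if feedback else 0

def evaluate(candidate, criteria):
    # Return the names of failed criteria; an empty list means success.
    return [name for name, check in criteria.items() if not check(candidate)]

def run_autonomous_cycle(problem, criteria, max_rounds=5):
    """Generate a candidate, test it against the defined success criteria,
    and feed failures back in until the result passes or rounds run out."""
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)   # produce a solution
        failures = evaluate(candidate, criteria)  # test against criteria
        if not failures:                          # all criteria met: done
            return candidate
        feedback = failures                       # self-correction input
    return None                                   # no passing result
```

The human's only inputs are the problem and the criteria; the loop itself runs without anyone reviewing the intermediate work.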
That structure matters more than the technology. Because once you understand the structure, you can see where it goes next.
Now imagine that same model applied inside a law firm. Take a personal injury case involving thousands of pages of medical records.
Today, that work flows through multiple human layers. A nurse paralegal reads the records and flags inconsistencies. A lawyer reviews those findings, interprets them, and begins shaping an argument. Drafts are written, reviewed, revised, and challenged internally before they are ever presented externally.
That entire workflow exists because humans are limited. We miss things. We interpret differently. We require redundancy to achieve confidence.
Now separate the work into three distinct layers:
- First, defining what matters.
- Second, producing the work.
- Third, evaluating whether the result is correct and useful.
Applying the model to the law firm, the lawyer defines what matters. Not by reading records, but by specifying what should be found. For example, identifying deviations from standard of care under specific conditions, detecting timeline inconsistencies beyond a defined threshold, or surfacing contradictions between clinical notes and billing records.
That becomes structured intent. The system then performs the production. It reads every record, extracts facts, builds timelines, flags inconsistencies, and drafts arguments. It does in minutes what previously required days of human effort.
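What “structured intent” might look like in practice is a declarative specification the lawyer writes once and the system executes against every record. The example below is hypothetical: every field name and value is invented for illustration, not drawn from any actual product or firm workflow.

```python
# Hypothetical example of a lawyer's review criteria expressed as
# structured intent rather than prose instructions. All names invented.
review_spec = {
    "find": [
        {"issue": "standard_of_care_deviation",
         "conditions": ["post-operative monitoring", "medication dosing"]},
        {"issue": "timeline_inconsistency",
         "threshold_hours": 24},  # flag gaps or conflicts beyond one day
        {"issue": "notes_vs_billing_contradiction"},
    ],
    "report": {"include_source_pages": True, "rank_by": "severity"},
}
```

The point of the form is that it is testable: each entry defines something the production system must look for and the evaluation system can later check was actually found.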
But the real shift happens in the third layer. A separate evaluation system tests the output. It challenges conclusions, looks for missed issues, checks internal consistency, and simulates how opposing counsel would attack the argument. It is not trying to produce the work. It is trying to break it.
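One way to picture that evaluation layer, again as pure illustration: a battery of independent challenges run against the draft, where the output is the list of attacks the draft failed to survive. The checks and the draft structure below are invented for this sketch.

```python
# Illustrative evaluation layer: it does not produce the work, it attacks it.
# The checks and draft fields are hypothetical.

def challenge_consistency(draft):
    # Fail if the draft cites an event outside its own stated timeline.
    return all(e in draft["timeline"] for e in draft["cited_events"])

def challenge_coverage(draft):
    # Fail if a flagged issue was dropped without being addressed.
    return set(draft["flagged_issues"]) <= set(draft["addressed_issues"])

def evaluate_draft(draft, challenges):
    """Return the names of the challenges the draft failed."""
    return [c.__name__ for c in challenges if not c(draft)]

draft = {
    "timeline": ["admission", "surgery", "discharge"],
    "cited_events": ["surgery", "discharge"],
    "flagged_issues": ["dosing_error", "late_monitoring"],
    "addressed_issues": ["dosing_error"],  # one flagged issue left unaddressed
}

failures = evaluate_draft(draft, [challenge_consistency, challenge_coverage])
```

A surviving objection here is not a verdict; it is a prompt for the human judgment the next section describes.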
This replaces much of what we currently call review.
At that point, several familiar roles begin to change. First-pass document review becomes largely automated. Pattern recognition at scale becomes system-driven. Initial drafting becomes machine-generated. Multi-layer internal review cycles compress significantly.
These functions do not disappear because they are unimportant. They disappear because they existed to compensate for human limitations.
What remains, and becomes more important, is judgment.
The system can surface dozens of potential issues in a case. Only a few will actually matter. Deciding which ones carry legal weight is still human.
Framing remains human. The lawyer determines what story is being told and what gets emphasized.
Responsibility remains human. The court does not accept that an AI system produced an argument. Counsel stands behind it.
Strategy remains human. The system operates within defined parameters. The lawyer operates in ambiguity, incomplete facts, and adversarial dynamics.
The role of the lawyer shifts. Less time is spent reading and writing. More time is spent defining what the system should look for, and deciding whether the results are meaningful.
In practical terms, the lawyer becomes both the architect of the inquiry and the judge of the output. This leads to a simple but important conclusion.
Firms that master the evaluation criteria become the firms that control quality.
Most firms will adopt AI for production because it is immediate and measurable. It reduces time and cost. Far fewer will invest in how outputs are evaluated. That work is harder. It requires clarity about what matters and why. But that is where advantage will emerge.
Because when production becomes commoditized, quality is determined by how well results are tested, challenged, and refined.
This is not a distant scenario. It is a structural shift already underway in software development. The question is not whether it will reach law and accounting.
The question is how quickly firms recognize that the center of gravity is moving from execution to judgment, and whether they are prepared to operate there.
1. Dan Shapiro, “The Five Levels: From Spicy Autocomplete to the Dark Factory,” blog post, 1/23/2026. link
Rex C. Anderson
Desert Sage AI
AI Governance for Law Firms and Accounting Firms
__________________________________
Forwarded to you?
Subscribe for direct delivery of future issues.