Designing Risk-Based Compliance for AI: From Validation to Defensibility
Artificial intelligence has entered regulated life sciences at a moment when regulation, technology, and operating models are evolving faster than they have in decades.
On one hand, regulators are permitting the use of AI across research, development, safety, and regulatory operations. On the other, they are establishing more guard rails through new regulatory frameworks, such as the EU AI Act, emerging FDA guidance, and the recommendations of CIOMS Working Group XIV. These initiatives introduce explicit expectations for how AI must be controlled, documented, and governed, emphasizing transparency, risk-based classification, and human oversight, particularly in high-impact and safety-critical contexts.
The result is a growing sense of uncertainty and unanswered questions inside biopharmaceutical organizations.
- Is AI permitted if it is not formally validated?
- Must AI conform to traditional CSV, GAMP 5, and 21 CFR Part 11 expectations?
- Do probabilistic systems even lend themselves to those frameworks at all?
- And if not, what replaces them?
These are not academic questions. They are actively shaping how MAHs, QPPVs, and compliance leaders are making decisions today.
AI in the Context of Traditional Validation Models
Much of this uncertainty stems from a simple but critical reality: AI does not behave like traditional algorithmic software.
Conventional systems are deterministic. Given the same inputs, they produce the same outputs. They can be validated exhaustively through input/output traceability matrices, scripted test cases, and repeatable evidence. 
Modern AI does not operate this way.
AI systems are probabilistic by design. Their outputs are shaped not only by code, but by data, context, and evolving internal representations. Attempting to force AI into deterministic behavior purely for the sake of validation is technically possible, but it fundamentally compromises the performance, adaptability, and learning capacity that make AI so valuable in the first place.
This creates a paradox: the more rigorously AI is constrained to behave like traditional software, the less it behaves like AI at all.
This does not mean that AI cannot or should not be constrained where appropriate, but when those constraints are solely for the sake of traditional validation, the result can undermine its intended value.
The Emerging Question: Is AI a Tool or a Digital Resource?
This has led many organizations to confront a deeper and more fundamental question: how should AI be treated within regulated environments? As a traditional system component? As a specialized tool? Or as something closer to a digital resource that operates alongside human actors?
Human resources, for example, are not “validated” as tools. They are qualified for defined roles, trained against expected tasks, supervised by accountable owners, measured through performance metrics, and governed by risk-based controls. Trust in human performance is not established through deterministic testing, but through outcomes, operational qualification, oversight, and accountability.
In practice, AI increasingly exhibits similar characteristics. It performs tasks, requires supervision, improves through training and experience, and must be constrained by clearly defined boundaries. The risks associated with AI are therefore not managed through test scripts alone, but through how it is controlled in operation: through human oversight, usage transparency, and continuous performance monitoring.
In this sense, AI support begins to resemble a digital coworker more than a traditional system component, not as a regulatory conclusion, but as a useful lens for rethinking how responsibility, control, and trust may need to evolve.
From Validation to Governed Use
None of this suggests that AI should operate without rigor, documentation, or regulatory accountability.
Integration points between the AI and the surrounding software require close attention during validation of the underlying system. Demonstrating robustness against the AI model's possible error codes, outages, and output types (free text vs. list values) is key. Beyond that, the central risk is not that AI is insufficiently validated as software. The real risk is that AI is deployed without a clearly defined intended use, without enforceable boundaries, without a documented, risk-based assessment of where it is appropriate, and without accountable oversight and auditable transparency.
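As a minimal sketch only, the Python example below illustrates what such robustness at an integration point might look like. The callable passed in and the field names (error_code, seriousness) are assumptions for illustration, not a real vendor API: the point is simply that outages, model error codes, and free-text versus list-value outputs are handled explicitly before anything reaches the validated workflow.

```python
# Illustrative sketch: a hypothetical guard layer between a validated,
# deterministic workflow and a probabilistic AI service. The callable
# (call_model) and field names are assumptions, not a real vendor API.

ALLOWED_LIST_VALUES = {"serious", "non-serious", "unknown"}  # example controlled vocabulary


def call_ai_with_guardrails(call_model, payload: dict) -> dict:
    """Invoke an AI model and normalize its response before it reaches the validated workflow."""
    try:
        response = call_model(payload)  # any client callable returning a dict
    except (TimeoutError, ConnectionError):
        # Outage handling: route to a documented manual path, never fail silently.
        return {"status": "escalate", "reason": "ai_service_unavailable"}

    if response.get("error_code"):
        # Known model error codes are escalated to human review with full context.
        return {"status": "escalate", "reason": f"model_error:{response['error_code']}"}

    value = response.get("seriousness")
    if value in ALLOWED_LIST_VALUES:
        # List-value output: structured, safe to pass downstream.
        return {"status": "accepted", "seriousness": value}

    # Free-text or unexpected output: preserved, but flagged for human confirmation.
    return {"status": "review_required", "raw_output": value}
```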
When those elements are in place, regulators have repeatedly shown openness to the use of AI, even when it is not validated in the traditional CSV sense alone, provided it is properly risk-assessed, governed, and supervised in real-world operation.
This is not because standards are weak. It is because control can exist at the operational level, not only at the system level.
This shift has profound implications not only for how AI is controlled, but for who is responsible for that control.
The Responsibility Gap: Vendor vs. MAH vs. Operations
This evolving landscape has exposed a critical ambiguity in accountability.
What is the software vendor responsible for with respect to AI qualification, documentation, and explainability, and where does their responsibility end and the MAH's begin?
AI doesn’t operate in isolation. It is embedded within larger platforms that combine deterministic workflows with probabilistic intelligence. This hybrid reality is precisely why traditional validation can neither be discarded nor universally applied. The deterministic components of enterprise systems (workflow engines, audit trails, access controls, and integration points) must still be validated using established CSV and Part 11 approaches. At the same time, AI-driven components within those same systems require a different mode of qualification and control, grounded in risk, oversight, and operational governance rather than fixed input/output determinism.
Compliance in AI-enabled platforms therefore becomes layered, not binary: conventional validation where determinism applies, and risk-based governance where intelligence introduces variability.
Accountability, in turn, cannot be assigned to a single party or reduced to a single artifact.
Vendors are responsible for providing transparency into how AI operates, ensuring architectural auditability, and designing system-level controls that make oversight possible. But vendors do not control how AI is ultimately used.
That responsibility lies with the MAH and its operational partners, who determine how AI is configured, supervised, monitored, and constrained for fit-for-purpose use in real-world workflows. Outsourcing partners, in turn, are responsible for executing those controls consistently and maintaining the evidence that supports them.
In the age of AI, compliance becomes a shared system of responsibility rather than a deliverable.
Against this backdrop, MAHs and QPPVs are expected to ensure regulatory defensibility while operating in an environment where guidance is evolving. Expectations are still forming, and technology continues to move faster than policy; audits themselves are increasingly informed by AI-driven analytics and risk prioritization.
The result is that organizations must be able to explain, defend, and stand behind their AI-enabled decisions when questioned by inspectors. That is the real compliance challenge of AI: not validation, but defensibility.
Human-in-the-Loop (HITL) and Responsibility by Design
In AI-enabled systems, control must be distributed intelligently between machines and humans. This is what Human-in-the-Loop truly means—not that humans manually review everything, but that accountability is architected into where and how decisions occur.
In a responsibility-by-design model, AI is permitted to operate autonomously within explicitly defined boundaries, while humans retain ownership of decisions that carry regulatory, medical, or ethical risk.
These boundaries are not informal conventions. They are encoded into workflows, permissions, escalation paths, and audit structures, and grounded in explicit risk assessments that determine where AI may act independently, where it must defer to human judgment, and what quality controls and review mechanisms must apply in each case. This responsibility-by-design model sheds new light on transparency and explainability, as AI-supported outcomes are promoted from auditable artifacts to decision-support and efficiency drivers that enhance human experience and performance. A minimal sketch of how such boundaries might be encoded follows below.
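The sketch below is illustrative only, assuming a simple three-level risk classification and invented confidence thresholds (none drawn from any specific regulation or product). It shows the basic shape of responsibility by design: low-risk actions may proceed autonomously, higher-risk ones are escalated to an accountable human role, and every routing decision is recorded for audit.

```python
# Illustrative sketch of risk-based routing between AI autonomy and human ownership.
# Risk levels, thresholds, and field names are assumptions for this example.
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1      # e.g., formatting or deduplication suggestions
    MEDIUM = 2   # e.g., data extraction feeding a regulated record
    HIGH = 3     # e.g., medical or regulatory judgment


@dataclass
class AIAction:
    name: str
    risk: Risk
    confidence: float  # model-reported confidence, 0.0-1.0


# Illustrative policy: where AI may act alone vs. must defer to a human.
AUTONOMY_POLICY = {
    Risk.LOW: 0.80,    # autonomous if confidence >= threshold
    Risk.MEDIUM: 0.95,
    Risk.HIGH: None,   # never autonomous; always human-owned
}


def route(action: AIAction, audit_log: list) -> str:
    """Decide whether the AI acts autonomously or escalates, and record the decision."""
    threshold = AUTONOMY_POLICY[action.risk]
    if threshold is not None and action.confidence >= threshold:
        decision = "autonomous"
    else:
        decision = "escalate_to_human"
    # Every routing decision is auditable, whichever path is taken.
    audit_log.append({"action": action.name, "risk": action.risk.name,
                      "confidence": action.confidence, "decision": decision})
    return decision
```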
Human oversight is not a final step layered onto outputs. It is an operating principle embedded into the system itself.
This approach resolves many of the apparent contradictions surrounding AI and compliance. AI can be non-deterministic and still defensible, because responsibility is anchored in who defines its scope, who supervises its use, and who intervenes when thresholds are crossed. Systems no longer need to prove that AI always behaves the same way; they need to demonstrate that AI always behaves within known, governed, and explainable limits.
From Validation to Defensibility in Practice
In practice, AI is not judged solely by what it produces, but by how its use is controlled.
Defensibility emerges when the intended use of AI is explicitly defined and documented, when the boundaries of autonomy are clear and enforced, and when human accountability is preserved at decision points that carry regulatory, medical, or ethical risk. Deviations, uncertainty, and exceptions must trigger structured intervention rather than being absorbed silently by the system, and every action must remain traceable and auditable across the lifecycle of the science, with measurable indicators demonstrating that human oversight is occurring in practice rather than merely on paper. Quality can be measured and biases can be identified.
Under this model, AI becomes governable not because it is frozen into predictable behavior, but because its interaction with human authority is itself predictable, transparent, and accountable. Quality control mechanisms, review coverage metrics, and exception rates become as important as traditional test scripts once were, because they demonstrate that responsibility is exercised continuously rather than episodically.
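As one possible illustration of such indicators (the record fields are invented for the example), review coverage and exception rates can be computed directly from the audit trail, turning "oversight is occurring" from an assertion into a measurable, inspectable fact.

```python
# Illustrative sketch: oversight indicators derived from audit-trail records.
# Each record is assumed (for this example) to carry two boolean flags:
# 'human_reviewed' and 'exception_raised'.

def oversight_metrics(audit_records: list[dict]) -> dict:
    """Compute simple oversight indicators from audit-trail records."""
    total = len(audit_records)
    if total == 0:
        return {"review_coverage": None, "exception_rate": None}
    reviewed = sum(1 for r in audit_records if r.get("human_reviewed"))
    exceptions = sum(1 for r in audit_records if r.get("exception_raised"))
    return {
        "review_coverage": reviewed / total,   # share of AI outputs seen by a human
        "exception_rate": exceptions / total,  # share requiring structured intervention
    }
```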
Traditional validation assumes systems are stable and unchanging. AI systems are neither. Human-in-the-loop and responsibility-by-design acknowledge that AI will evolve, learn, and adapt, while the rules governing its use can remain stable, inspectable, and defensible. This allows organizations to benefit from AI’s performance without sacrificing regulatory control, to evolve systems without constantly revalidating static artifacts, and to demonstrate compliance as a continuous capability rather than a one-time certification exercise.
It also aligns far more naturally with how regulators themselves are beginning to think about AI governance, as reflected in the EU AI Act’s risk-based framework and CIOMS Working Group XIV’s emphasis on human oversight, transparency, and lifecycle governance.
The New Compliance Maturity Model
In the age of AI, compliance maturity will be measured by how clearly responsibility is designed into how AI systems operate.
Organizations that lead in this area will not be those with the thickest validation binders, but those that can demonstrate clear decision ownership, visible oversight, auditable boundaries, and continuous governance across the AI lifecycle.
This is how trust scales with automation instead of breaking under it.
In the AI era, compliance must become a design discipline rather than a documentation exercise.
Turn document complexity into confident, efficient workflows
Schedule a quick 15-minute call to learn if the Quartica MARS AI Platform can help your team work faster, stay compliant, and reduce risk.
What do you think?