  • Risk Management
  • 27th May 2026
  • 1 min read

AI Risk Management: Identify, Assess and Control AI Risks

Written by Gabriel Few-Wiegratz
In Short
  • AI risk requires its own governance framework. Model risk, data risk, third-party AI risk, deployment risk, and reputational risk introduce failure modes that traditional ERM frameworks were not designed to address.
  • Risk appetite must be specific and measurable. Organisations need clear rules on prohibited AI uses, approved use cases, and acceptable thresholds for model performance, fairness, explainability, and data quality.
  • AI risk changes over time. Model drift, evolving datasets, and shifting deployment contexts mean continuous monitoring is essential for maintaining an accurate risk profile.
  • Third-party AI is often the biggest blind spot. Vendor due diligence must extend beyond questionnaires to include contractual protections, transparency requirements, audit rights, and AI-specific accountability.

Effective AI risk management is not about identifying risks once; it is about building controls that operate continuously. The strongest programmes integrate AI risk into enterprise risk management, define measurable controls, and establish clear escalation paths when thresholds are breached. A documented control with monitoring, reporting, and response procedures is a genuine risk management mechanism. A policy statement without evidence, metrics, or ownership is not.

Expert View


 

Matt Davies

Chief Product Officer, SureCloud




What our experts say about the vendor landscape gap in AI risk frameworks

 

“The weakest point in most AI risk frameworks is the vendor landscape. A questionnaire at onboarding tells you what the vendor chose to disclose against questions you formed before you understood the use case. Effective due diligence is risk-tiered and use-case specific: what decisions does this AI make or influence, who is affected, and what evidence can the vendor provide about model performance across those groups? Where vendors fall short on specificity, document the gap and make closing it a condition of ongoing use.”

Key Facts

  1. EU AI Act Article 9 requires a risk management system to be established, implemented, documented, and maintained for high-risk AI systems as a continuous iterative process throughout the system lifecycle. The obligation falls primarily on providers; deployers have separate obligations under Article 26. The system must identify, analyse, and evaluate known and reasonably foreseeable risks to health, safety, or fundamental rights. High-risk obligations apply from 2 August 2026.
  2. Bank of England SS1/23 is the PRA supervisory statement on model risk management for banks. It sets expectations for model identification, governance, development, validation, and ongoing monitoring across the model lifecycle. Firms subject to SS1/23 must apply its principles to all models in scope, including AI models.
  3. ISO/IEC 42001:2023 Clause 6.1 requires organisations to identify AI-specific risks and opportunities, assess those risks against defined criteria, and document a treatment plan for risks above their defined tolerance. It provides the most complete structured framework for integrating AI risk into enterprise risk management.
  4. UK GDPR requires a Data Protection Impact Assessment where processing is likely to result in high risk to individuals' rights and freedoms. AI systems that involve automated decision-making, process sensitive data at scale, or use technology in ways individuals would not reasonably anticipate may trigger this obligation. The DPIA should be completed before deployment.
  5. FCA Consumer Duty requires firms to demonstrate that AI-assisted decisions deliver good outcomes for retail customers. Where AI contributes to poor outcomes, firms face regulatory action regardless of whether the model was internally built or vendor-supplied.

AI-Specific Risk Categories

A working taxonomy is the foundation of AI risk management. The categories below cover the failure modes risk managers most commonly encounter and that are most underrepresented in existing frameworks.

 

Model Risk

Model risk is the risk that an AI system produces outputs that are incorrect, biased, or otherwise flawed, and that those outputs influence decisions. It covers four primary failure modes.

  1. Model error: the model is poorly designed, trained on unrepresentative data, or optimised for the wrong objective
  2. Overfitting: the model performs well on training data but poorly on real-world inputs
  3. Model drift: the model's performance degrades over time as the real-world data distribution diverges from training data
  4. Fairness failure: the model produces systematically different outcomes for different demographic groups in ways that are discriminatory or unjust

Model risk applies equally to vendor-supplied models where the organisation has limited visibility into design or training methodology.

 

Model risk is sometimes treated as a subset of operational risk. The assessment and control approach differs enough to warrant treating it as a distinct category, particularly given the forward-looking validation requirements and the ongoing monitoring obligation.
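Two of the failure modes above, model drift and fairness failure, lend themselves to simple numeric checks. The sketch below is illustrative only: it uses the Population Stability Index (PSI) as one common drift measure and a selection-rate ratio as one possible fairness measure. The function names, sample data, and group labels are assumptions made for the example, and the right metrics for a given system depend on its use case.

```python
import math
from collections import Counter


def population_stability_index(expected, actual, eps=1e-6):
    """PSI between a baseline sample and a production sample.

    Both inputs are lists of categorical (or pre-binned) values.
    """
    categories = set(expected) | set(actual)
    exp_counts, act_counts = Counter(expected), Counter(actual)
    psi = 0.0
    for cat in categories:
        # Share of each category in baseline and production data;
        # eps avoids log(0) when a category is absent on one side.
        e = max(exp_counts[cat] / len(expected), eps)
        a = max(act_counts[cat] / len(actual), eps)
        psi += (a - e) * math.log(a / e)
    return psi


def selection_rate_ratio(outcomes_by_group):
    """Lowest group approval rate divided by the highest (1.0 means parity).

    Assumes at least one group has a non-zero approval rate.
    """
    rates = {g: sum(o) / len(o) for g, o in outcomes_by_group.items()}
    return min(rates.values()) / max(rates.values())


# Hypothetical binned input feature at training time and in production.
baseline = ["low"] * 50 + ["mid"] * 30 + ["high"] * 20
production = ["low"] * 30 + ["mid"] * 30 + ["high"] * 40
drift = population_stability_index(baseline, production)

# Hypothetical approve (1) / decline (0) decisions per demographic group.
decisions = {"group_a": [1, 1, 1, 0], "group_b": [1, 0, 0, 0]}
ratio = selection_rate_ratio(decisions)

print(f"PSI: {drift:.3f}, selection rate ratio: {ratio:.2f}")
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that figure is a convention rather than a regulatory threshold. The level at which either metric triggers a review is a risk appetite decision.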

 

Data Risk

AI systems are only as good as the data they are trained and operated on. Data risk covers the full pipeline, from training data through to the inputs the model receives at inference time.

  1. Training data quality: biased, incomplete, or outdated training data produces unreliable models
  2. Data poisoning: adversarial manipulation of training or operational data to influence model outputs, relevant to security-sensitive applications
  3. Data leakage: sensitive data used in model training that can be extracted or inferred from model outputs
  4. Data pipeline failures: breakdowns in data flows that cause models to operate on stale, corrupted, or incomplete inputs at inference time

Data risk connects to your data governance framework. AI-specific risks (training data quality and data poisoning in particular) often sit outside standard data quality or information security controls and require dedicated assessment.
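Data pipeline failures are one of the easier data risks to detect automatically. As a minimal sketch, assuming inference inputs arrive as batches of records with a known load time, a pre-inference check might look like this. The field names, the 24-hour freshness limit, and the 5% missing-value tolerance are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_inference_batch(rows, required_fields, loaded_at, now,
                          max_age=timedelta(hours=24),
                          max_missing_rate=0.05):
    """Return a list of issues found in a batch of inference inputs."""
    issues = []
    # Stale data: the batch was loaded longer ago than the allowed age.
    if now - loaded_at > max_age:
        issues.append("stale")
    # Incomplete data: nothing arrived at all.
    if not rows:
        issues.append("empty")
        return issues
    # Incomplete data: too many records lack a required field.
    for field in required_fields:
        missing = sum(1 for r in rows if r.get(field) is None)
        if missing / len(rows) > max_missing_rate:
            issues.append(f"missing:{field}")
    return issues


now = datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc)
batch = [{"income": 42000, "age": 30}, {"income": None, "age": 41}]
issues = check_inference_batch(
    batch, ["income", "age"], loaded_at=now - timedelta(hours=30), now=now
)
print(issues)
```

A check of this kind only covers stale and incomplete inputs. Training data quality and data poisoning need separate, dedicated assessment.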

 

Third-Party AI Risk

In most organisations, the majority of AI risk comes from vendor-supplied AI capabilities embedded in the tools the business already uses: CRM systems with AI-powered lead scoring, finance platforms with AI-driven forecasting, HR tools with AI-assisted screening.

  1. Opacity: limited visibility into how vendor models work, how they are trained, and how they perform across demographic groups
  2. Supply chain risk: if a vendor's AI system is compromised or fails, the processes that depend on it fail with it
  3. Contract risk: vendor contracts that do not adequately allocate responsibility for AI failures or provide audit access and transparency rights
  4. Dependency risk: over-reliance on vendor AI capabilities that could be withdrawn, changed, or degraded without adequate notice

Deployment Risk

A model that performs well in testing may perform differently in production. Deployment risk covers the gap between controlled validation and real-world operation.

  1. Distribution shift: real-world inputs that differ from test conditions in ways not anticipated during validation
  2. Integration failures: problems at the boundary between the AI system and the processes or systems it is embedded in
  3. Misuse: users applying the AI system in ways it was not designed for, or acting on AI outputs without appropriate human judgement
  4. Scope creep: a system approved for one use case being extended to another without adequate re-evaluation

 

Reputational Risk from AI Failures

AI failures that reach the public (biased decisions, discriminatory outputs, privacy breaches) carry reputational consequences that are often disproportionate to the underlying harm. AI failures are legible as discrete events in a way that incremental process failures rarely are. A case of algorithmic discrimination has a name, a victim, and a public record.

 

Reputational risk from AI is particularly acute in consumer-facing and regulated industries. Organisations operating under Consumer Duty face both regulatory exposure and significant brand damage if AI-assisted decisions are seen to harm customers.

 

Integrating AI Risk into Your ERM Framework

AI risk managed in a separate silo creates a parallel process that bypasses board-level risk oversight. Integration into ERM means risk appetite, escalation, and reporting for AI follow the same pathways as other enterprise risks.

 

Existing ERM taxonomies, risk register templates, and assessment methodologies require adaptation before they work for AI. Integration means adapting those frameworks rather than adding AI as a new row.

 

Adapting the Risk Register for AI

A standard risk register entry captures risk description, likelihood, impact, controls, risk owner, and residual risk. For AI risks, several additional fields make the entry actionable.

 

| Field | Why It Matters for AI Risk |
| --- | --- |
| AI system identifier | Links the risk to a specific system in the AI register; essential when multiple systems share similar risk categories |
| Risk tier | Connects to the governance tier of the AI system; determines the level of oversight and review frequency required |
| Model version | Risk profile may change as models are updated; the register entry needs to reflect which version it applies to |
| Monitoring mechanism | AI risks require continuous monitoring; specify what is in place and at what frequency |
| Human oversight level | Whether human review is in place and at what threshold; affects residual risk calculation |
| Regulatory exposure | Which regulations create liability for this risk category; affects prioritisation and escalation pathways |
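One way to picture the extended register entry is as a structured record. The sketch below assumes a simple 1 to 5 likelihood and impact scale and uses hypothetical field names and sample values; it is not a prescribed schema, and a real register would follow the organisation's own ERM conventions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AIRiskEntry:
    # Standard risk register fields.
    description: str
    likelihood: int          # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int              # assumed scale: 1 (minor) to 5 (severe)
    controls: List[str]
    owner: str
    residual_risk: str
    # Additional fields for AI risks.
    ai_system_id: str
    risk_tier: str
    model_version: str
    monitoring_mechanism: str
    human_oversight_level: str
    regulatory_exposure: List[str] = field(default_factory=list)

    def inherent_score(self) -> int:
        """Likelihood multiplied by impact, before controls."""
        return self.likelihood * self.impact

    def applies_to(self, deployed_version: str) -> bool:
        """False means the entry should be re-assessed for the new version."""
        return self.model_version == deployed_version


entry = AIRiskEntry(
    description="Fairness failure in AI-assisted credit decisions",
    likelihood=3,
    impact=4,
    controls=["Quarterly fairness report", "Human review of declines"],
    owner="Head of Credit Risk",
    residual_risk="medium",
    ai_system_id="AI-0042",
    risk_tier="high",
    model_version="2.3.1",
    monitoring_mechanism="Weekly selection-rate ratio check",
    human_oversight_level="Human review of all declined applications",
    regulatory_exposure=["FCA Consumer Duty", "UK GDPR"],
)

print(entry.inherent_score(), entry.applies_to("2.4.0"))
```

Recording the model version in the entry makes it possible to flag automatically when the deployed version has moved on and the assessment needs revisiting.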

 

Risk Appetite for AI

Risk appetite for AI is an area where most organisations have said something without really committing to anything. A board-level statement that "we will use AI responsibly" sets a direction but leaves every implementation question unanswered.

 

Operationally useful risk appetite for AI answers three questions.

  1. Which AI uses are prohibited outright, regardless of the business case?
  2. Which AI uses are permitted subject to specific conditions and controls?
  3. What level of model risk, fairness shortfall, or deployment risk is acceptable in the operation of approved AI systems?

The third question is the hardest and the most important. It forces a conversation about quantified thresholds: how much performance degradation triggers a review, what fairness metric shortfall triggers suspension, what level of data quality failure warrants action. Most organisations have not had that conversation yet.
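To make the three questions concrete, a risk appetite statement can be written as data that monitoring tools can read. The sketch below is an illustration under stated assumptions: the use case names, conditions, metrics, and threshold values are all hypothetical, and real values would be set and approved by the risk committee.

```python
# Hypothetical risk appetite, expressed as data rather than prose.
APPETITE = {
    # Question 1: uses prohibited regardless of business case.
    "prohibited_uses": {"fully_automated_dismissal"},
    # Question 2: uses permitted subject to conditions and controls.
    "conditional_uses": {
        "credit_decision_support": ["human_review", "quarterly_fairness_report"],
    },
    # Question 3: metric -> (direction, review trigger, suspension trigger).
    "thresholds": {
        "accuracy": ("min", 0.90, 0.85),
        "selection_rate_ratio": ("min", 0.85, 0.80),
        "missing_input_rate": ("max", 0.02, 0.05),
    },
}


def use_case_status(use_case):
    """Classify a proposed use against the appetite statement."""
    if use_case in APPETITE["prohibited_uses"]:
        return "prohibited"
    if use_case in APPETITE["conditional_uses"]:
        return "conditional"
    return "needs_approval"


def metric_status(metric, value):
    """Compare an observed metric value with its review and suspension triggers."""
    direction, review, suspend = APPETITE["thresholds"][metric]
    if direction == "min":
        # Lower is worse: breach when the value drops below a trigger.
        if value < suspend:
            return "suspend"
        if value < review:
            return "review"
    else:
        # Higher is worse: breach when the value rises above a trigger.
        if value > suspend:
            return "suspend"
        if value > review:
            return "review"
    return "within_appetite"


print(use_case_status("credit_decision_support"))
print(metric_status("accuracy", 0.88))
print(metric_status("missing_input_rate", 0.06))
```

Writing the appetite this way forces the numbers to be chosen explicitly, which is the conversation the third question is meant to provoke.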

 

A risk appetite statement that says "AI risk must be managed appropriately" gives every team complete discretion in interpretation. It sets no escalation threshold, no performance floor, and no cross-unit consistency. The statement creates an appearance of governance without the substance.

AI Risk Assessment in Practice

Risk assessment for AI systems needs to happen at three stages: before deployment as part of the approval process, at defined intervals post-deployment, and when triggering events occur.

 

Pre-deployment Assessment

Before an AI system is approved for use, a risk assessment should cover the following questions.

  1. What decisions does this system make or influence, and what are the consequences of a wrong decision?
  2. Who is affected (employees, customers, third parties) and in what ways?
  3. What data does it use, and what are the data quality and data protection risks?
  4. What are the known limitations of the model, and what failure modes are most likely?
  5. What controls are in place, and are they adequate given the risk tier?
  6. Is there a human oversight mechanism for high-stakes decisions?

For high-risk systems, a Data Protection Impact Assessment is likely required under GDPR, and should be conducted in parallel with the AI risk assessment rather than as a separate exercise.

 

Ongoing Monitoring as a Risk Control

The key difference between AI risk assessment and traditional risk assessment is that AI risk is dynamic. The risk profile of a system changes over time, and point-in-time assessments capture only a moment in that evolution. Ongoing monitoring is both a governance requirement and a risk control.

 

Monitoring that functions as a real risk control has four characteristics.

  1. Defined in terms of specific metrics with documented thresholds: for example, if fairness metric X falls below Y, the system is reviewed within Z days
  2. Conducted at a frequency appropriate to how quickly the system and its data environment change
  3. Routes findings to someone with the authority and context to act, with a log of the action taken
  4. Has a documented escalation pathway for threshold breaches
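The four characteristics above can be sketched as a single monitoring rule: a metric with a threshold, a check frequency, a named owner, and an escalation route, with every breach written to a log. The class, field names, and values below are hypothetical and serve only to show the shape of such a rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MonitoringRule:
    metric: str               # "fairness metric X"
    minimum: float            # "falls below Y"
    review_within_days: int   # "reviewed within Z days"
    check_every_days: int     # frequency matched to how fast the system changes
    owner: str                # person with authority and context to act
    escalate_to: str          # documented escalation pathway


def evaluate(rule, value, observed_on, action_log):
    """Log a finding if the metric breaches its threshold, else return None."""
    if value >= rule.minimum:
        return None
    finding = {
        "metric": rule.metric,
        "value": value,
        "threshold": rule.minimum,
        "assigned_to": rule.owner,
        "review_due": observed_on + timedelta(days=rule.review_within_days),
        "escalate_to": rule.escalate_to,
        "action_taken": None,  # completed by the owner when they respond
    }
    action_log.append(finding)
    return finding


rule = MonitoringRule(
    metric="selection_rate_ratio",
    minimum=0.80,
    review_within_days=5,
    check_every_days=7,
    owner="model-owner",
    escalate_to="risk-committee",
)
log = []
finding = evaluate(rule, 0.74, date(2026, 6, 1), log)
print(finding["review_due"], finding["assigned_to"])
```

The `action_taken` field starts empty on purpose: a finding that stays empty past its review date is itself evidence that the control is not operating.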

Common monitoring failures include periodic reports that are generated but never reviewed, dashboards that flag issues with no defined response process, and review cycles set annually regardless of how dynamic the system is.

Controls That Actually Reduce AI Risk

The distinction between controls that reduce actual AI risk and controls that demonstrate compliance matters for both risk outcomes and the credibility of your governance posture. A regulator able to identify the difference will scrutinise the latter accordingly.

 

| Control Type | Compliance Theatre Version | Effective Version |
| --- | --- | --- |
| Model validation | One-time validation at deployment against a benchmark dataset | Ongoing validation against updated holdout data; periodic re-validation triggered by drift metrics |
| Fairness monitoring | "Fairness is monitored" in policy documentation | Defined fairness metrics, documented thresholds, quarterly reporting to risk committee, breach response process |
| Human oversight | Policy states "human review required for consequential decisions" | Defined review criteria, minimum review standard, mandatory logging of review outcome, override tracking |
| Third-party AI due diligence | Vendor questionnaire on AI usage at onboarding | Risk-tiered due diligence based on AI use case, contractual audit rights, ongoing performance monitoring, periodic re-assessment |
| AI incident response | AI added to IT incident response scope | Dedicated AI incident classification, response protocol for gradual performance failures as well as discrete incidents, regulatory notification triggers |
| Risk appetite | Board statement on responsible AI | Quantified thresholds for model performance, fairness, and data quality; approved by risk committee and embedded in monitoring triggers |

 

The pattern is consistent: effective controls are specific, measurable, and tied to defined responses. Compliance theatre controls are stated in policy without operational specificity.

Integrate AI risk into your enterprise risk framework

SureCloud's compliance management platform gives risk teams the register structure, risk-tiered assessment workflows, and monitoring trigger configuration to manage AI risk as part of their ERM programme. Gracie AI Agents with Personas and Skills runs continuous monitoring against defined thresholds in your AI risk register and routes breaches to the owners with authority to act. Audit preparation time reduced by 75%.

For the governance and audit trail requirements behind AI risk management, read: Auditable AI Decisions: How to Evidence Governance Under Regulatory Scrutiny.

For the foundational framework, read: AI Governance Isn't Optional: How to Build an Auditable, Defensible Framework.

Request a demo to see AI risk management in practice.

FAQs

How does AI risk management relate to ISO 42001?

ISO 42001 is the international standard for AI management systems, providing a structured framework that encompasses risk management. The risk categories, assessment approach, and control framework in this article map directly to the standard's clauses for AI risk assessment and treatment.
If your organisation is pursuing ISO 42001 certification, the taxonomy and methodology described here provide a practical starting point. See the SureCloud ISO 42001 guide for detail on the certification pathway and clause-by-clause requirements.

We have an existing model risk management framework from our financial services regulator. How does this connect?

Model risk management (MRM) frameworks, such as those required under Bank of England SS1/23, address a subset of AI risk focused specifically on model design, validation, and performance. AI risk management is broader: it includes data risk, third-party risk, deployment risk, and reputational risk that sit outside most MRM frameworks.
A well-structured MRM framework covers the model risk category well; the gaps are most commonly in data risk governance, third-party AI oversight, and the integration of AI risk into broader ERM. Use the MRM framework as the foundation for model risk and build outward from there.

How do we prioritise AI risk assessment when we have a large and varied AI estate?

Risk tier classification is the mechanism. Start with your highest-risk AI systems: those involved in consequential decisions, those operating in regulated contexts, and those with significant third-party or data dependencies.
Most AI use in organisations falls into lower-risk categories where a lighter assessment is proportionate. Directing the same level of rigour at a meeting transcription tool as at an AI-assisted credit decision engine wastes governance resource where it matters least. Risk tier determines the investment in assessment.
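As a rough sketch of tier-based prioritisation, the three criteria named above can be treated as flags and counted. The scoring rule here (two or more flags is high, one is medium, none is low) and the system names are assumptions made for illustration; a real tiering scheme would be defined in the organisation's governance policy.

```python
def classify_tier(system):
    """Assign a risk tier from three yes/no criteria (hypothetical rule)."""
    flags = [
        system["consequential_decisions"],
        system["regulated_context"],
        system["significant_dependencies"],
    ]
    score = sum(flags)
    if score >= 2:
        return "high"
    if score == 1:
        return "medium"
    return "low"


estate = [
    {"name": "meeting_transcription", "consequential_decisions": False,
     "regulated_context": False, "significant_dependencies": False},
    {"name": "credit_decision_engine", "consequential_decisions": True,
     "regulated_context": True, "significant_dependencies": True},
    {"name": "crm_lead_scoring", "consequential_decisions": False,
     "regulated_context": False, "significant_dependencies": True},
]

# Assess the highest tiers first.
tier_order = {"high": 0, "medium": 1, "low": 2}
queue = sorted(estate, key=lambda s: tier_order[classify_tier(s)])

for system in queue:
    print(system["name"], classify_tier(system))
```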

How do we set quantified risk appetite thresholds when we have limited experience with AI performance data?

Start with the question of harm rather than the question of metrics. For a given AI system, define the worst acceptable outcome and work backward: what model performance level, fairness metric, or data quality standard would have to fail for that outcome to become likely? Set your monitoring threshold above that point and your review trigger where you still have time to intervene.
The thresholds will need adjustment as you gather operational data, and that adjustment process is itself evidence of a functioning risk management programme. Starting with conservative thresholds and refining them is more defensible than waiting until you have perfect data before setting any.
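The work-backward approach can be illustrated with simple arithmetic. The sketch assumes you can estimate two things: the metric level at which the worst acceptable outcome becomes likely, and how quickly the metric tends to degrade. Both inputs, and the function itself, are hypothetical simplifications rather than an established formula.

```python
def review_trigger(harm_floor, decline_per_week, weeks_to_intervene,
                   safety_margin=0.0):
    """Metric level at which a review should start.

    harm_floor: level at which the worst acceptable outcome becomes likely.
    decline_per_week: estimated rate at which the metric degrades.
    weeks_to_intervene: time needed to investigate and fix or suspend.
    """
    return harm_floor + decline_per_week * weeks_to_intervene + safety_margin


# Hypothetical example: harm becomes likely below 0.80 accuracy, the metric
# has been falling by about 0.01 a week, and intervention takes three weeks.
trigger = review_trigger(harm_floor=0.80, decline_per_week=0.01,
                         weeks_to_intervene=3)
print(f"Start a review when accuracy falls below {trigger:.2f}")
```

With limited operational data the decline estimate will be rough, which is a reason to add a safety margin at first and reduce it as evidence accumulates.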