MONITORING AND DRIFT PLAN

What this section does:

After deployment, is the model still valid? This section documents what a responsible monitoring programme for this specific model would look like. We cannot demonstrate live drift detection on a static dataset, but we document the required framework clearly.

This is required under BoG CISD 2026 §100(6), which mandates continuous monitoring of AI models in production.

Why monitoring matters particularly for this model:

The bias we found is not a frozen snapshot. It compounds over time. When the model misses Low-Balance fraud in production, those missed cases get recorded as legitimate transactions in the system logs. When the model is retrained on updated data, it learns from its own past mistakes. The gap does not self-correct. It grows. This audit found a 33.3% miss rate for Low-Balance users versus 0.3% for High-Balance users. Left unmonitored, that gap will widen with every retraining cycle.

A monitoring programme must therefore track not just overall performance, but per-group performance. An aggregate performance metric that looks stable can conceal a growing fairness gap if Low-Balance performance is deteriorating while High-Balance performance stays strong.
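The point above can be sketched in a few lines. This is an illustrative toy example (the record layout and group sizes are assumptions, not audit data): an aggregate recall near 0.96 coexists with a Low-Balance recall of roughly 0.67, matching the pattern this audit found.

```python
# Minimal sketch: aggregate recall can mask a per-group gap.
# Each record is (group, actual_fraud, flagged_by_model); names are illustrative.
def recall(records):
    """Share of actual fraud cases the model flagged."""
    frauds = [r for r in records if r[1]]
    if not frauds:
        return None
    return sum(1 for r in frauds if r[2]) / len(frauds)

def per_group_recall(records):
    """Recall computed separately for each group."""
    groups = {}
    for r in records:
        groups.setdefault(r[0], []).append(r)
    return {g: recall(rs) for g, rs in groups.items()}

# Toy data: most fraud is High-Balance and caught; Low-Balance fraud is rare
# and often missed, so the aggregate number still looks strong.
records = (
    [("high", True, True)] * 97 + [("high", True, False)] * 3
    + [("low", True, True)] * 2 + [("low", True, False)] * 1
)
print(recall(records))            # aggregate recall looks healthy
print(per_group_recall(records))  # the per-group gap is visible
```

Because Low-Balance fraud is a small share of total fraud, its deterioration barely moves the aggregate number, which is exactly why the per-group rows in the table below are mandatory.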



Required Monitoring Framework

What to monitor:

| Signal | Frequency | Alert Threshold |
| --- | --- | --- |
| Overall fraud recall | Weekly | Below 0.95 triggers review |
| Low-Balance TPR | Weekly | Current baseline 0.6667; any drop below 0.60 triggers immediate review |
| EOD (equal opportunity difference) by balance tier | Monthly | Above 0.10 triggers mandatory review |
| SPD (statistical parity difference) by balance tier | Monthly | Above 0.10 triggers review |
| Feature distribution drift (amount, balance columns) | Monthly | KS statistic above 0.10 on any feature triggers retraining review |
| False positive rate by group | Weekly | Sustained rise in any group triggers review |
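A minimal sketch of how one monitoring run would be checked against these thresholds. The function and field names are illustrative, not part of any existing pipeline; the threshold values are the ones in the table.

```python
# Hedged sketch: evaluate one monitoring run against the alert thresholds.
# Field names (overall_recall, low_balance_tpr, ...) are assumptions.
THRESHOLDS = {
    "overall_recall_min": 0.95,
    "low_balance_tpr_min": 0.60,
    "eod_max": 0.10,
    "spd_max": 0.10,
    "ks_max": 0.10,
}

def evaluate_alerts(metrics, thresholds=THRESHOLDS):
    """Return the list of alerts triggered by one monitoring run."""
    alerts = []
    if metrics["overall_recall"] < thresholds["overall_recall_min"]:
        alerts.append("overall fraud recall below 0.95: review")
    if metrics["low_balance_tpr"] < thresholds["low_balance_tpr_min"]:
        alerts.append("Low-Balance TPR below 0.60: immediate review")
    if metrics["eod"] > thresholds["eod_max"]:
        alerts.append("EOD above 0.10: mandatory review")
    if metrics["spd"] > thresholds["spd_max"]:
        alerts.append("SPD above 0.10: review")
    if max(metrics["ks_by_feature"].values()) > thresholds["ks_max"]:
        alerts.append("feature drift (KS above 0.10): retraining review")
    return alerts

# Example run: healthy aggregate recall, but Low-Balance TPR has slipped
# and one balance feature has drifted.
run = {"overall_recall": 0.96, "low_balance_tpr": 0.58,
       "eod": 0.08, "spd": 0.05,
       "ks_by_feature": {"amount": 0.04, "oldbalanceOrg": 0.12}}
print(evaluate_alerts(run))
```

Note that this run would pass a monitoring regime that checked only overall recall; both alerts come from the per-group and drift rows of the table.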

What constitutes a drift event:

Data drift occurs when the statistical distribution of input features changes significantly from the training data distribution. For a fraud detection model, the most important drift signals are:

Changes in the amount distribution. If average transaction sizes shift significantly, the model's balance ratio features will behave differently than during training.

Changes in fraud type prevalence. If fraudsters shift from TRANSFER to other transaction types, the model trained on the current distribution may miss the new patterns entirely.

Changes in the balance tier composition of fraud. If Low-Balance fraud increases as a share of total fraud — which may happen as awareness of high-balance detection improves among fraudsters — the existing model will be even less adequate than it currently is.
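The feature-drift check referenced in the table is a two-sample Kolmogorov-Smirnov statistic: the maximum distance between the training-era and production empirical CDFs of a feature. A pure-Python sketch follows (in practice a library routine such as `scipy.stats.ks_2samp` would typically be used; the toy samples are illustrative):

```python
# Illustrative two-sample KS statistic, no library dependency.
# The 0.10 alert threshold is the one given in the monitoring table.
import bisect

def ks_statistic(sample_a, sample_b):
    """Max absolute distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Toy check: training-era transaction amounts vs an upward-shifted
# production sample (values are made up for illustration).
train_amounts = [100, 200, 300, 400, 500] * 20
prod_amounts = [150, 300, 450, 600, 750] * 20
drift = ks_statistic(train_amounts, prod_amounts)
print(drift, drift > 0.10)  # exceeds the retraining-review threshold
```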

Retraining triggers:

Scheduled retraining every 90 days minimum, per BoG CISD 2026 Annexure E §g(ii).

Emergency retraining triggered by: EOD exceeding 0.10 on any fairness check, overall recall dropping below 0.90, or a confirmed fraud incident where model failure contributed to customer harm.
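The scheduled and emergency triggers above can be combined into one decision function. A sketch, with illustrative parameter names (the 90-day cycle and the 0.10/0.90 thresholds come from the text):

```python
# Sketch of the retraining-trigger logic; parameter names are assumptions.
from datetime import date

def retraining_due(last_trained, today, eod, overall_recall,
                   confirmed_harm_incident):
    """Return (due, reason) combining scheduled and emergency triggers."""
    if (today - last_trained).days >= 90:
        return True, "scheduled: 90-day cycle (Annexure E §g(ii))"
    if eod > 0.10:
        return True, "emergency: EOD above 0.10"
    if overall_recall < 0.90:
        return True, "emergency: overall recall below 0.90"
    if confirmed_harm_incident:
        return True, "emergency: confirmed customer-harm incident"
    return False, ""

# Example: no emergency condition, but the 90-day clock has run out.
print(retraining_due(date(2026, 1, 1), date(2026, 4, 15),
                     eod=0.06, overall_recall=0.95,
                     confirmed_harm_incident=False))
```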

Fairness monitoring is not optional:

Under BoG CISD 2026 Annexure E §l(i), fairness metrics must be defined, monitored, and reported to the AI Governance Committee. Monitoring only aggregate performance while ignoring per-group metrics is not compliant with this requirement.
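For reporting purposes, the two fairness metrics can be defined explicitly: EOD as the absolute gap in true positive rates between balance tiers, and SPD as the absolute gap in flag rates. A sketch with toy confusion counts chosen to echo the audit's TPR figures (the counts themselves are illustrative, not audit data):

```python
# Illustrative fairness-metric definitions; the counts are toy values.
def tpr(tp, fn):
    """True positive rate: caught fraud / all actual fraud."""
    return tp / (tp + fn)

def flag_rate(flagged, total):
    """Share of all transactions the model flags."""
    return flagged / total

# Toy counts echoing the audit: High-Balance TPR ~0.997 vs Low-Balance ~0.667.
eod = abs(tpr(tp=997, fn=3) - tpr(tp=2, fn=1))
spd = abs(flag_rate(60, 10_000) - flag_rate(20, 10_000))
print(round(eod, 4), eod > 0.10)  # breaches the 0.10 mandatory-review line
print(round(spd, 4))
```

At the audited gap, EOD sits above 0.30, three times the 0.10 threshold in the monitoring table, which is why both metrics must be reported per balance tier rather than in aggregate.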

| Regulation | Provision | Status |
| --- | --- | --- |
| BoG CISD 2026, §100(6) | Continuous monitoring of AI models in production | Framework documented |
| BoG CISD 2026, Annexure E §g(ii) | Version control and retraining management | Requirements defined |
| BoG CISD 2026, Annexure E §l(i) | Fairness metrics monitored and reported | Protocol established |
| NIST AI RMF 1.0, MEASURE 3.1 | Track existing, unanticipated, and emergent risks | Approach documented |

