AI Model Bias Audit Implementation

The model shows aggregate accuracy of 0.89, which sounds good. But break the metrics down by subgroup and it turns out that for one demographic group precision drops to 0.71 and recall to 0.58. This isn't just a "fairness" issue; it's an operational risk: the model systematically fails on a specific segment, and if that segment matters to the business or is protected by legislation, the problem is critical.

Bias audit is a structured process for finding such gaps and their sources.

What is Bias Audit Technically

Bias audit means measuring model metrics across subgroups, comparing them, statistically verifying the gaps, and tracing their sources to the data, the features, or the annotation process. It is not a one-time event but a process embedded in the ML lifecycle.

Audit standards are built on several questions:

1. Which groups to analyze? Protected characteristics under legislation (gender, age, nationality, religion, etc.) — mandatory minimum. Additionally — business-relevant segments (region, customer type, acquisition channel).

2. Which fairness definition to choose? Demographic parity, equalized odds, and calibration within groups are mutually incompatible in general: when base rates differ between groups, no non-trivial classifier can satisfy all of them at once. The choice depends on the use case.

3. What gap counts as significant? Statistical significance (p < 0.05 with multiple-comparison correction) plus practical significance (effect size). A 2% difference on a 50k sample is statistically significant but not necessarily operationally meaningful.
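The check in point 3 can be sketched with a two-proportion z-test plus a Bonferroni correction, using only the standard library (the sample counts below are illustrative, not from a real audit):

```python
import math

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided two-proportion z-test; returns (z, p_value)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Normal CDF via erf; two-sided p-value
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Illustrative: recall 0.58 vs 0.89 on 1,000 samples per group
z, p = two_proportion_ztest(successes_a=580, n_a=1000,
                            successes_b=890, n_b=1000)
n_comparisons = 4  # e.g., four metrics audited -> Bonferroni correction
p_corrected = min(1.0, p * n_comparisons)
print(f"z = {z:.2f}, corrected p = {p_corrected:.4f}")
```

For many simultaneous comparisons, a less conservative correction (e.g., Benjamini-Hochberg) is usually preferable to Bonferroni.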

Audit Methodology

Stage 1 — Data Audit

Before model training. Analyze the training dataset:

  • Distribution across subgroups — underrepresentation of one group will worsen metrics specifically for it
  • Feature correlation with protected attributes (proxy features)
  • Annotation quality across subgroups (inter-annotator agreement via Cohen's kappa separately by group)
  • Temporal bias — data from different time periods may contain different patterns for different groups
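Per-subgroup annotator agreement from the list above can be computed in a few lines; this is a binary-label sketch (sklearn's cohen_kappa_score covers the general multi-class case):

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators, binary labels."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n  # observed agreement
    p_a1 = sum(labels_a) / n  # annotator A's positive rate
    p_b1 = sum(labels_b) / n  # annotator B's positive rate
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Compute separately per subgroup and compare (toy labels):
kappa_group_1 = cohens_kappa([1, 1, 0, 0, 1, 0], [1, 1, 0, 0, 1, 0])
kappa_group_2 = cohens_kappa([1, 0, 1, 0, 1, 0], [0, 1, 1, 0, 0, 1])
print(round(kappa_group_1, 2), round(kappa_group_2, 2))
```

A large kappa gap between subgroups is a strong hint of label bias (Stage 3 below).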

Tools: ydata-profiling (the maintained successor to pandas-profiling) and custom scripts for the feature-to-attribute correlation matrix.
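A minimal proxy-feature scan can be hand-rolled with numpy; feature names and the 0.3 cutoff here are illustrative assumptions, not a standard:

```python
import numpy as np

def proxy_feature_scan(X, feature_names, protected, threshold=0.3):
    """Flag features whose |Pearson correlation| with a binary
    protected attribute exceeds a threshold (potential proxies)."""
    flagged = []
    for j, name in enumerate(feature_names):
        r = np.corrcoef(X[:, j], protected)[0, 1]
        if abs(r) > threshold:
            flagged.append((name, round(float(r), 3)))
    return flagged

# Synthetic example: one feature tracks the protected attribute
rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=1000)
X = np.column_stack([
    rng.normal(size=1000),                         # independent feature
    protected + rng.normal(scale=0.5, size=1000),  # proxy feature
])
flagged = proxy_feature_scan(X, ["tenure", "postal_code_income"], protected)
print(flagged)
```

Pearson correlation only catches linear proxies; for categorical or non-linear proxies, mutual information or a small "predict the attribute from this feature" model is more reliable.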

Stage 2 — Model Performance Audit

After training. Standard metric set for each subgroup:

from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, precision_score, recall_score

metrics = {
    'accuracy': accuracy_score,
    'precision': precision_score,
    'recall': recall_score,
    'false_positive_rate': lambda y_true, y_pred:
        ((y_pred == 1) & (y_true == 0)).sum() / (y_true == 0).sum()
}

mf = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_features
)

print(mf.by_group)
print(mf.difference())  # Max difference between groups
print(mf.ratio())       # Min/max ratio between groups

Commonly used target thresholds for high-risk systems (the EU AI Act requires bias testing and documentation but does not prescribe numeric limits; the figures below reflect industry practice):

  • Demographic parity difference < 0.1
  • Equalized odds difference < 0.1
  • False positive rate ratio: 0.8 – 1.25 (80% rule, EEOC standard)

Stage 3 — Root Cause Analysis

If a gap is found — search for its source. Four main vectors:

Representation bias: subgroup comprises 3% of dataset but 15% of real requests. Model hasn't seen enough examples. Solution: oversampling (SMOTE, ADASYN), class-weighted loss, focal loss.

Feature bias: proxy feature. Postal code → ethnic group. Transaction frequency → income level → demographics. Correlation analysis of all features with protected attributes. Remove proxies or use adversarial debiasing.

Label bias: annotators labeled differently for different groups. Inter-annotator agreement by subgroup. Re-label problematic segments.

Threshold bias: single classification threshold is unfair at different base rates. Threshold optimization separately by group (Fairlearn ThresholdOptimizer).
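Fairlearn's ThresholdOptimizer implements this post-processing properly; the core idea, picking a per-group score threshold that hits a common target recall, can be sketched manually (an illustrative simplification, not Fairlearn's exact algorithm):

```python
import numpy as np

def per_group_thresholds(scores, y_true, groups, target_recall=0.8):
    """Per group, pick the highest score threshold that still keeps
    at least target_recall of that group's true positives."""
    thresholds = {}
    for g in np.unique(groups):
        pos = np.sort(scores[(groups == g) & (y_true == 1)])
        k = len(pos) - int(np.ceil(target_recall * len(pos)))
        thresholds[g] = pos[k]
    return thresholds

def predict_with_thresholds(scores, groups, thresholds):
    return np.array([s >= thresholds[g] for s, g in zip(scores, groups)],
                    dtype=int)

# Synthetic demo: group "B" gets systematically lower scores
rng = np.random.default_rng(1)
groups = np.repeat(["A", "B"], 500)
y_true = rng.integers(0, 2, size=1000)
scores = rng.uniform(size=1000) + 0.3 * y_true - 0.2 * (groups == "B")

th = per_group_thresholds(scores, y_true, groups, target_recall=0.8)
y_pred = predict_with_thresholds(scores, groups, th)
for g in ("A", "B"):
    mask = (groups == g) & (y_true == 1)
    print(g, "recall:", round(y_pred[mask].mean(), 3))
```

Note that per-group thresholds require the sensitive attribute at inference time, which is itself a policy decision to document in the audit report.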

Practical Case

The client, an HR-tech company, runs a resume scoring model (CatBoost, 85 features). An internal audit found that recall for candidates with foreign names was 17 percentage points lower than for others.

Root cause analysis: the "university name" feature had high weight and was encoded via target encoding — universities from certain countries systematically received low encoded values due to historical underrepresentation of hired candidates. Proxy discrimination through educational institution.

Solution:

  • Replaced target encoding with neutral frequency encoding for this feature
  • Added adversarial head to architecture (additional "foreign/not foreign name" classifier with gradient reversal)
  • Threshold optimization via Fairlearn to equalize recall
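The first fix, swapping target encoding for frequency encoding, can be sketched as follows (hypothetical category values; in the real project this was one of 85 features):

```python
from collections import Counter

def frequency_encode(values):
    """Encode each category by its relative frequency in the data;
    unlike target encoding, this carries no information about
    historical hiring outcomes for that category."""
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in values]

universities = ["MIT", "MIT", "Univ X", "Univ Y", "MIT", "Univ X"]
enc = frequency_encode(universities)
print([round(v, 3) for v in enc])  # MIT: 3/6, Univ X: 2/6, Univ Y: 1/6
```

Frequency encoding loses the predictive signal that target encoding captured, which is part of why the fix cost a small amount of AUC.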

Recall gap decreased from 17 pp to 4 pp with AUC loss = 0.008.

Documentation and Reporting

Audit results are written up in a standardized form. The minimum set:

Model Card (Mitchell et al., 2019) — model description, training data, metrics by subgroup, known limitations.

Algorithmic Impact Assessment — analysis of potential harms, mitigations, residual risk.

For EU AI Act (high-risk systems) — mandatory technical documentation per Annex IV.

Timeline and Process

Audit of existing model — 2-3 weeks: collect subgroup data, measure metrics, root cause analysis, report with recommendations.

Mitigation + re-audit — another 3-5 weeks depending on bias source complexity.

Embedded process — bias audit as part of CI/CD: automatic Fairlearn metrics check on each retrain with deployment blocking on threshold violation. Setup takes 1-2 weeks.
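A minimal sketch of such a CI gate, with the demographic parity check hand-rolled for self-containment (in practice you would call fairlearn.metrics on the retrained model's holdout predictions and exit non-zero to block the deploy):

```python
import sys
import numpy as np

DP_DIFF_LIMIT = 0.1  # demographic parity difference threshold

def demographic_parity_difference(y_pred, groups):
    """Max gap in positive-prediction (selection) rate between groups."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

def bias_gate(y_pred, groups):
    dp = demographic_parity_difference(y_pred, groups)
    print(f"demographic parity difference: {dp:.3f} (limit {DP_DIFF_LIMIT})")
    return dp <= DP_DIFF_LIMIT

# In CI: load holdout predictions for the retrained model, then:
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
if not bias_gate(y_pred, groups):
    sys.exit(1)  # non-zero exit blocks the deployment step
```

The same gate pattern extends to equalized odds or the FPR ratio; the key design choice is failing the pipeline loudly rather than logging a warning.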