Tech Abstractions
MLOps·Debugging & Incident Triage·Hard

Debug a Fraud Model with Segment-Level Degradation

Asked at Stripe, PayPal, Affirm

You join a team that has a production fraud model deployed for 3 years. Current offline AUC is 0.94. But over the past 6 months, false positive rates have increased 40% on a specific user segment — new signups from mobile — while overall model metrics look fine.

Your manager wants you to "tune the threshold." Walk through what you would actually investigate before changing anything.

  • Why threshold tuning is unlikely to fix the root cause
  • The three most plausible explanations for segment-specific degradation
  • What data you would pull first
  • What you would escalate vs. fix yourself
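One way to approach the "what data you would pull first" item is to break the false positive rate out by segment and time window rather than relying on the aggregate. The sketch below is only illustrative: the row keys (`segment`, `month`, `flagged`, `is_fraud`), the segment names, and the sample counts are all hypothetical, and it assumes confirmed fraud labels are available for the period.

```python
from collections import defaultdict

def fpr_by_segment_month(rows):
    """Return {(segment, month): false positive rate}.

    rows: iterable of dicts with hypothetical keys
      'segment', 'month', 'flagged' (model decision), 'is_fraud' (confirmed label).
    FPR = legitimate transactions flagged / all legitimate transactions.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for r in rows:
        if r["is_fraud"]:
            continue  # FPR is computed over legitimate traffic only
        key = (r["segment"], r["month"])
        negatives[key] += 1
        if r["flagged"]:
            false_pos[key] += 1
    return {k: false_pos[k] / negatives[k] for k in negatives}

def make_rows(segment, month, n_legit, n_flagged):
    # Synthetic legitimate transactions, the first n_flagged of which were flagged.
    return [
        {"segment": segment, "month": month, "flagged": i < n_flagged, "is_fraud": False}
        for i in range(n_legit)
    ]

# Made-up data: only the new-mobile segment drifts between the two months.
rows = (
    make_rows("mobile_new", "2024-01", 100, 5)
    + make_rows("mobile_new", "2024-07", 100, 7)
    + make_rows("desktop", "2024-01", 100, 5)
    + make_rows("desktop", "2024-07", 100, 5)
    + [{"segment": "mobile_new", "month": "2024-07", "flagged": True, "is_fraud": True}]
)

rates = fpr_by_segment_month(rows)
for key in sorted(rates):
    print(key, rates[key])
```

With a table like this, a segment whose rate moves while the others stay flat stands out immediately, which the aggregate AUC can hide when the segment is a small share of traffic.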

Follow-up ladder

  1. Rung 1: You pull per-segment data and find that the false positive rate on mobile signups started rising 8 months ago, 2 months before overall AUC began a slight decline. What does this timing tell you?
  2. Rung 2: You discover that a new mobile app version changed how device fingerprints are computed 8 months ago. The model uses device fingerprinting as a feature. How does this change your diagnosis and the fix?
  3. Rung 3: The data team says fixing the feature pipeline will take 6 weeks. The fraud team is under pressure to reduce false positives now. What short-term mitigation is reasonable without making the underlying problem worse?
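For Rung 2, one possible way to test whether the fingerprint change actually shifted the model's input is to compare the feature's distribution before and after the app release. The sketch below uses the population stability index (PSI) over categorical buckets. It is an assumption-laden illustration, not the team's method: the bucket names (`seen_before`, `unseen`) and sample counts are hypothetical, and the 0.25 cutoff often quoted for "significant shift" is a rule of thumb rather than a guarantee.

```python
import math
from collections import Counter

def psi(expected, actual, eps=1e-6):
    """Population stability index between two samples of a categorical feature.

    PSI = sum over categories of (a - e) * ln(a / e), where e and a are the
    category shares in the baseline and comparison samples. Shares are floored
    at eps so that categories missing from one sample do not divide by zero.
    """
    e_counts, a_counts = Counter(expected), Counter(actual)
    total = 0.0
    for category in set(e_counts) | set(a_counts):
        e = max(e_counts[category] / len(expected), eps)
        a = max(a_counts[category] / len(actual), eps)
        total += (a - e) * math.log(a / e)
    return total

# Hypothetical bucketing: was this device fingerprint seen before or not?
before_release = ["seen_before"] * 90 + ["unseen"] * 10
after_release = ["seen_before"] * 50 + ["unseen"] * 50

shift = psi(before_release, after_release)
baseline = psi(before_release, before_release)
print(f"PSI before vs. after release: {shift:.3f}")
print(f"PSI of baseline against itself: {baseline:.3f}")
```

Running the same comparison on the desktop segment, or on mobile users who have not updated the app, would help separate a pipeline change from a genuine change in user behavior.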
