Tech Abstractions
MLOps·ML System Design·Hard

Frame a Real-Time Content Moderation System Under Hard Constraints

Asked at Meta, YouTube, Discord

A platform needs a real-time content moderation system with the following requirements: handle 100,000 posts per minute, make each blocking decision in under 50 ms, and keep the false positive rate (posts wrongly blocked) below 0.1% of the post stream.

Before writing any model code, walk through how you would frame this problem.

  • Which constraint is binding — and what does that mean for architecture?
  • What does the false positive budget actually mean in terms of user impact?
  • How would you structure the system — and what would you explicitly not build first?
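
As a starting point for these framing questions, the stated requirements can be converted into concrete rates. The following is a minimal sketch, not a reference answer. It assumes the 0.1% budget is measured against every post in the stream (an upper bound, since only benign posts can be wrongly blocked) and that traffic is spread evenly over time; the variable names are illustrative.

```python
# Requirements taken from the problem statement.
POSTS_PER_MINUTE = 100_000
MAX_FPR = 0.001            # 0.1% of the post stream
LATENCY_BUDGET_S = 0.050   # 50 ms per blocking decision

# Throughput the serving path must sustain, assuming evenly spread traffic.
posts_per_second = POSTS_PER_MINUTE / 60

# Upper bound on wrongly blocked posts if the budget is fully used.
wrongly_blocked_per_minute = POSTS_PER_MINUTE * MAX_FPR
wrongly_blocked_per_day = wrongly_blocked_per_minute * 60 * 24

# Little's law: average requests in flight = arrival rate * time in system.
# This is the concurrency needed if every decision uses the full 50 ms.
max_in_flight = posts_per_second * LATENCY_BUDGET_S

print(f"throughput: about {posts_per_second:,.0f} posts/s")
print(f"false positive budget: about {wrongly_blocked_per_minute:,.0f} posts/min, "
      f"{wrongly_blocked_per_day:,.0f} posts/day")
print(f"concurrency at full latency budget: about {max_in_flight:,.0f} in flight")
```

Read this way, a rate that sounds small still allows on the order of a hundred wrongly blocked posts every minute, which is the number a stakeholder discussion about user impact would start from.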

Follow-up ladder

  1. Rung 1: You propose a multi-stage pipeline. The policy team says every stage must achieve less than 0.1% FPR individually, not just the overall system. Does this change your architecture? How do you explain the difference between per-stage and system-level FPR to a non-technical stakeholder?
  2. Rung 2: The rules-based first stage catches 90% of obvious violations at near-zero latency. The remaining 10% — ambiguous cases — go to the ML stage. The ML stage achieves 95% precision on these cases. What is the effective system precision, and what does the human review queue look like?
  3. Rung 3: Six months after launch, a new type of violation emerges — coordinated inauthentic behavior that looks like normal content individually but is harmful in aggregate. Your current system cannot detect this. How does the framing need to change?
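
The arithmetic behind Rungs 1 and 2 can be sketched as follows. This is a hedged illustration, not the expected answer: it assumes each post is blocked by at most one stage, and every number other than the 0.1% budget and the 95% ML precision is a hypothetical placeholder (the rules-stage precision and the 900/100 split of blocks are not given in the question). The helper names `system_fpr` and `blended_precision` are made up for this example.

```python
FPR_BUDGET = 0.001  # 0.1%, from the problem statement


def system_fpr(stages):
    """System-level false positive rate of a pipeline.

    `stages` is a list of (reach, fpr) pairs, where `reach` is the fraction
    of benign posts that arrive at the stage and `fpr` is the fraction of
    those arrivals the stage wrongly blocks. Because a post can be blocked
    only once, the per-stage contributions simply add up.
    """
    return sum(reach * fpr for reach, fpr in stages)


def blended_precision(blocks):
    """Precision over all blocked posts.

    `blocks` is a list of (num_blocked, precision) pairs, one per stage.
    """
    true_positives = sum(n * p for n, p in blocks)
    total_blocked = sum(n for n, _ in blocks)
    return true_positives / total_blocked


# Case A (hypothetical): the ML stage exceeds 0.1% on its own traffic but
# sees only 10% of benign posts, so the system still meets the budget.
case_a = system_fpr([(1.0, 0.0002), (0.10, 0.005)])

# Case B (hypothetical): both stages see all remaining traffic and each is
# individually under 0.1%, yet the system exceeds the budget.
case_b = system_fpr([(1.0, 0.0009), (1 - 0.0009, 0.0009)])

# Rung 2 shape: 95% ML precision is from the question; the 0.999 rules
# precision and the 900/100 block volumes are assumptions.
precision = blended_precision([(900, 0.999), (100, 0.95)])

print(f"case A system FPR: {case_a:.4%} (within budget: {case_a < FPR_BUDGET})")
print(f"case B system FPR: {case_b:.4%} (within budget: {case_b < FPR_BUDGET})")
print(f"blended precision under the assumed volumes: {precision:.2%}")
```

The two cases show that a per-stage limit is neither necessary nor sufficient for the system-level limit: what matters is each stage's error rate weighted by how much benign traffic reaches it. The same weighting applies to precision, which is why the answer to Rung 2 depends on volumes the question leaves open.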
