Design an ML Feature Engineering Platform — Practice

Your ML organization has 30+ data scientists who spend 60% of their time on feature engineering — writing ad-hoc SQL queries and Python scripts to compute features from raw data, then copy-pasting feature computation logic between training and serving code. Design a feature engineering platform that makes feature creation, sharing, and serving systematic and reusable.

Scale Requirements

Support 500+ features across all teams, growing 30% quarterly
Feature computation runs on 10TB/day of raw data
Online feature serving: 50,000 QPS with p99 latency under 5ms
Batch feature serving: generate training datasets of up to 1 TB in under 30 minutes
Features must be discoverable across teams to reduce duplication
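
The numbers above can be turned into rough sizing targets before any design work starts. The following is a back-of-envelope sketch, not a capacity plan: it assumes decimal units (1 TB = 10^12 bytes), perfectly even load with no peaks, and compound quarterly growth. The helper names are made up for illustration.

```python
# Back-of-envelope estimates derived from the stated scale requirements.
# Assumptions (not from the prompt): decimal units, evenly spread load.

TB = 10**12                      # bytes in one decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def feature_count(initial: int, quarterly_growth: float, quarters: int) -> float:
    """Catalog size after compounding growth for the given number of quarters."""
    return initial * (1 + quarterly_growth) ** quarters


def throughput_mb_per_s(total_bytes: int, seconds: int) -> float:
    """Sustained throughput in MB/s needed to move total_bytes in seconds."""
    return total_bytes / seconds / 10**6


# 500 features growing 30% per quarter
features_after_1y = feature_count(500, 0.30, 4)
features_after_2y = feature_count(500, 0.30, 8)

# 10 TB/day of raw data, averaged over a full day
ingest_mb_s = throughput_mb_per_s(10 * TB, SECONDS_PER_DAY)

# 1 TB training dataset produced within 30 minutes
batch_mb_s = throughput_mb_per_s(1 * TB, 30 * 60)

# 50,000 QPS sustained for a full day
online_requests_per_day = 50_000 * SECONDS_PER_DAY

print(f"features after 1 year:  {features_after_1y:,.0f}")
print(f"features after 2 years: {features_after_2y:,.0f}")
print(f"raw ingest:             {ingest_mb_s:,.1f} MB/s sustained")
print(f"batch generation:       {batch_mb_s:,.1f} MB/s sustained")
print(f"online lookups per day: {online_requests_per_day:,}")
```

Under these assumptions the catalog reaches roughly 1,400 features within a year, raw ingest averages about 116 MB/s, and the 30-minute batch target needs about 556 MB/s of sustained output, which points toward parallel reads and writes rather than a single worker. Real traffic is bursty, so peak figures will be higher than these averages.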

Design Requirements

Design the feature computation pipeline — how do data scientists define and schedule features?
Design the feature registry for discovery, documentation, and governance.
Design the dual serving layer — online (low-latency) and offline (high-throughput).
Explain how you would ensure feature consistency between training and serving.
Describe how you would drive platform adoption across 30+ data scientists.
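
One possible starting point for the consistency and registry questions is to define each feature exactly once and have both the batch and online paths call that same definition, instead of copy-pasting logic between them. The sketch below illustrates the idea in plain Python. Every name in it (`FeatureDefinition`, `batch_compute`, `online_compute`, the `avg_order_value` feature and its input columns) is hypothetical and not part of the prompt, and a real answer would still need to cover storage, scheduling, and point-in-time correctness.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, float]  # one entity's raw inputs, keyed by column name


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical registry entry: metadata plus a single transform."""
    name: str
    owner: str
    description: str
    transform: Callable[[Row], float]


def batch_compute(feature: FeatureDefinition, rows: Iterable[Row]) -> List[float]:
    """Offline path: apply the registered transform to many rows."""
    return [feature.transform(row) for row in rows]


def online_compute(feature: FeatureDefinition, row: Row) -> float:
    """Online path: apply the same registered transform to one request."""
    return feature.transform(row)


# Illustrative feature; the column names are assumptions.
avg_order_value = FeatureDefinition(
    name="avg_order_value",
    owner="growth-team",
    description="Total spend divided by order count; 0.0 when there are no orders.",
    transform=lambda r: r["total_spend"] / r["order_count"] if r["order_count"] else 0.0,
)

# A dict stands in for the feature registry.
registry = {avg_order_value.name: avg_order_value}

rows = [
    {"total_spend": 120.0, "order_count": 4.0},
    {"total_spend": 0.0, "order_count": 0.0},
]

training_values = batch_compute(registry["avg_order_value"], rows)
serving_values = [online_compute(registry["avg_order_value"], r) for r in rows]

# Both paths share one transform, so the values match by construction.
assert training_values == serving_values
print(training_values)
```

Because both paths look up the same registry entry, a change to the transform reaches training and serving together. The metadata fields (`owner`, `description`) hint at what the registry would need to expose for discovery across teams.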