MLOps · ML System Design · Easy
Design an ML Feature Engineering Platform
Asked at Airbnb, Uber, Netflix
Your ML organization has 30+ data scientists who spend 60% of their time on feature engineering — writing ad-hoc SQL queries and Python scripts to compute features from raw data, then copy-pasting feature computation logic between training and serving code. Design a feature engineering platform that makes feature creation, sharing, and serving systematic and reusable.
Scale Requirements
- Support 500+ features across all teams, growing 30% quarterly
- Feature computation runs on 10TB/day of raw data
- Online feature serving: 50,000 QPS with p99 latency under 5ms
- Batch feature serving: generate training datasets up to 1TB in under 30 minutes
- Features must be discoverable across teams to reduce duplication
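To make these targets concrete, the figures above can be converted into sustained rates. The sketch below is only back-of-envelope arithmetic. It assumes decimal units (1 TB = 10^12 bytes), perfectly even load over the day, and compounding quarterly growth, none of which the problem statement specifies. The helper names are illustrative.

```python
# Back-of-envelope rates derived from the stated scale requirements.
# Assumptions (not in the problem statement): decimal units, evenly
# spread load, and compounding quarterly growth.

TB = 10**12  # bytes in a terabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)


def sustained_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in MB/s needed to move total_bytes in seconds."""
    return total_bytes / seconds / MB


def features_after(quarters: int, start: int = 500, growth: float = 0.30) -> float:
    """Feature count after a number of quarters of compounding growth."""
    return start * (1 + growth) ** quarters


raw_ingest = sustained_mb_per_s(10 * TB, 24 * 3600)  # 10 TB/day of raw data
batch_read = sustained_mb_per_s(1 * TB, 30 * 60)     # 1 TB dataset in 30 minutes
lookups_per_day = 50_000 * 24 * 3600                 # if 50,000 QPS held all day

print(f"raw ingest:       {raw_ingest:.1f} MB/s")
print(f"batch generation: {batch_read:.1f} MB/s")
print(f"online lookups:   {lookups_per_day:,} per day")
print(f"features in 1 yr: {features_after(4):.0f}")
```

Under these assumptions, batch dataset generation needs roughly five times the sustained throughput of raw ingestion, which suggests the offline store should be sized for read bursts and not only for the daily write volume.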
Design Requirements
- Design the feature computation pipeline — how do data scientists define and schedule features?
- Design the feature registry for discovery, documentation, and governance.
- Design the dual serving layer — online (low-latency) and offline (high-throughput).
- Explain how you ensure feature consistency between training and serving.
- Describe how you would drive platform adoption across 30+ data scientists.
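One common way to approach the consistency requirement is to define each feature once and have both the batch and online paths call that same definition. The sketch below is illustrative only, not a reference answer. `FeatureDefinition`, `FeatureRegistry`, `compute_offline`, `compute_online`, and the `trips_per_active_day` feature are all hypothetical names, and a real platform would back them with a scheduler and durable storage instead of in-memory structures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FeatureDefinition:
    """A single source of truth for one feature (hypothetical schema)."""
    name: str
    owner: str
    description: str
    transform: Callable[[dict], float]  # same code runs offline and online


class FeatureRegistry:
    """In-memory stand-in for a searchable feature catalog."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDefinition] = {}

    def register(self, feature: FeatureDefinition) -> None:
        # Rejecting duplicate names nudges teams to reuse existing features.
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already registered")
        self._features[feature.name] = feature

    def get(self, name: str) -> FeatureDefinition:
        return self._features[name]

    def search(self, keyword: str) -> List[str]:
        """Return names of features whose name or description mentions keyword."""
        kw = keyword.lower()
        return sorted(
            name
            for name, f in self._features.items()
            if kw in name.lower() or kw in f.description.lower()
        )


def compute_offline(feature: FeatureDefinition, rows: List[dict]) -> List[float]:
    """Batch path: apply the transform to every historical row."""
    return [feature.transform(row) for row in rows]


def compute_online(feature: FeatureDefinition, row: dict) -> float:
    """Serving path: apply the very same transform to one live row."""
    return feature.transform(row)


def trips_per_active_day(row: dict) -> float:
    # Guard against division by zero for entities with no active days.
    return row["trip_count"] / max(row["active_days"], 1)


registry = FeatureRegistry()
registry.register(
    FeatureDefinition(
        name="trips_per_active_day",
        owner="marketplace-team",
        description="Average trips per active day over the window",
        transform=trips_per_active_day,
    )
)

rows = [{"trip_count": 12, "active_days": 4}, {"trip_count": 3, "active_days": 0}]
feature = registry.get("trips_per_active_day")
training_values = compute_offline(feature, rows)
serving_value = compute_online(feature, rows[0])
```

Because both paths call `feature.transform`, a change to the logic reaches training and serving together. That removes the copy-paste drift described in the problem statement, though it does not by itself address point-in-time correctness or data freshness.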