case study 2024-10-10 11 min read

Uber's Optimal Feature Discovery for Machine Learning

How Uber automatically discovers and ranks the most important features for its ML models at scale.


Introduction

Feature engineering often determines model success, but manual feature discovery doesn't scale. Uber's optimal feature discovery system automates this process, enabling faster model development across the company.

The Problem

Manual Feature Engineering

Traditional approach:

  1. Domain experts brainstorm features
  2. Engineers implement features
  3. Data scientists evaluate importance
  4. Iterate slowly

Challenges:

  • Time-consuming
  • Limited by human creativity
  • Doesn't scale across use cases

Uber's Solution

Automated Feature Discovery

Raw Data -> Feature Generators -> Candidate Features -> Evaluator -> Top Features
                |                      |                    |
          (automated)           (thousands)           (model-based)

Feature Generators

Types of automated transformations:

  • Aggregations: sum, mean, count, percentiles
  • Time windows: 1h, 1d, 7d, 30d
  • Categorical: encodings, combinations
  • Interactions: products, ratios
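
As a rough illustration of the transformations listed above, the sketch below computes trailing-window aggregations and one ratio interaction over a handful of trip records. The record layout, the `window_aggregates` helper, and the sample values are all assumptions made for this example; none of them come from Uber's system.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical trip records for one driver: (timestamp, fare, distance_km).
trips = [
    (datetime(2024, 10, 1, 8, 0), 12.0, 4.0),
    (datetime(2024, 10, 8, 9, 30), 20.0, 8.0),
    (datetime(2024, 10, 9, 23, 0), 9.0, 3.0),
    (datetime(2024, 10, 10, 7, 15), 15.0, 5.0),
]

WINDOWS = {"1d": timedelta(days=1), "7d": timedelta(days=7), "30d": timedelta(days=30)}


def window_aggregates(records, as_of, windows=WINDOWS):
    """Compute count/sum/mean of fare over each trailing window ending at as_of."""
    features = {}
    for name, span in windows.items():
        fares = [fare for ts, fare, _ in records if as_of - span < ts <= as_of]
        features[f"fare_count_{name}"] = len(fares)
        features[f"fare_sum_{name}"] = sum(fares)
        # An empty window has no mean; None marks the value as missing.
        features[f"fare_mean_{name}"] = mean(fares) if fares else None
    return features


as_of = datetime(2024, 10, 10, 12, 0)
features = window_aggregates(trips, as_of)

# Interaction feature: total fare divided by total distance over the same 7 days.
dist_7d = sum(d for ts, _, d in trips if as_of - WINDOWS["7d"] < ts <= as_of)
features["fare_per_km_7d"] = features["fare_sum_7d"] / dist_7d if dist_7d else None
```

Anchoring every window at an explicit `as_of` timestamp is one way to keep a feature from seeing data that arrived after the prediction time.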

Technical Implementation

Feature Template System

# Example feature template
template = FeatureTemplate(
    entity="driver",
    source="trips",
    aggregations=["count", "mean", "sum"],
    columns=["fare", "distance", "rating"],
    windows=["1d", "7d", "30d"]
)
# Generates: driver_trips_fare_count_1d, driver_trips_fare_mean_7d, etc.
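
The template above does not show how it expands into feature names. One plausible reading is a cross product of columns, aggregations, and windows. The sketch below follows that reading with a hypothetical `FeatureTemplate` dataclass and a `feature_names` method. Both are stand-ins written for this example, with a naming scheme inferred from the two generated names in the comment above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FeatureTemplate:
    """Hypothetical stand-in for the template class shown above."""
    entity: str
    source: str
    aggregations: tuple
    columns: tuple
    windows: tuple

    def feature_names(self):
        """Expand to one name per (column, aggregation, window) combination."""
        return [
            f"{self.entity}_{self.source}_{col}_{agg}_{win}"
            for col, agg, win in product(self.columns, self.aggregations, self.windows)
        ]


template = FeatureTemplate(
    entity="driver",
    source="trips",
    aggregations=("count", "mean", "sum"),
    columns=("fare", "distance", "rating"),
    windows=("1d", "7d", "30d"),
)
names = template.feature_names()
print(len(names))  # 3 columns x 3 aggregations x 3 windows = 27
print(names[0])    # driver_trips_fare_count_1d
```

Even this small template yields 27 candidates, which suggests how a modest library of templates can reach thousands of candidate features.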

Importance Ranking

Methods used:

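  • SHAP values: Additive per-prediction attributions that can be averaged into a global importance score
  • Permutation importance: Drop in model performance when a feature's values are shuffled
  • Forward selection: Greedily adds the feature that most improves validation performance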
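
Of the methods listed, permutation importance is the simplest to show end to end. The sketch below is a minimal pure-Python version run on toy data. The `mse` and `permutation_importance` helpers, the stand-in `model` function, and the synthetic data are assumptions for illustration and say nothing about how Uber's evaluator is built.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on rows X against targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Average increase in MSE when each column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy data: the target depends on feature 0 only; feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10), data_rng.uniform(0, 10)] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def model(row):
    """Stand-in for a trained model; here it matches the target exactly."""
    return 3.0 * row[0]


importances = permutation_importance(model, X, y)
# Feature indices ordered from most to least important.
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
```

Shuffling feature 0 breaks its link to the target and raises the error, while shuffling the noise feature changes nothing, so feature 0 ranks first. In practice the score would be computed on held-out data with a trained model.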

Scalability

  • Distributed computation on Spark
  • Feature caching for reuse
  • Incremental updates for new data
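
Incremental updates are practical when aggregations are kept as small, mergeable states instead of being recomputed from raw data. The sketch below shows that idea for count, sum, and mean. The `RunningAggregate` class is hypothetical and written for this example; it is not a Spark API and not Uber's implementation.

```python
from dataclasses import dataclass


@dataclass
class RunningAggregate:
    """Mergeable state for the count, sum, and mean of one feature column."""
    count: int = 0
    total: float = 0.0

    def update(self, values):
        """Fold a new batch of values into the existing state."""
        for v in values:
            self.count += 1
            self.total += v
        return self

    def merge(self, other):
        """Combine two partial states, e.g. from different partitions."""
        return RunningAggregate(self.count + other.count, self.total + other.total)

    @property
    def mean(self):
        return self.total / self.count if self.count else None


# First batch, then an incremental update when new data arrives.
state = RunningAggregate().update([12.0, 20.0])
state.update([9.0, 15.0])

# The same result from two partitions aggregated separately and then merged.
merged = RunningAggregate().update([12.0, 20.0]).merge(
    RunningAggregate().update([9.0, 15.0])
)
```

This sketch only ever adds data. A sliding window such as 7d would also need to expire old values, for example by keeping one state per day and merging the most recent seven.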

Use Cases at Uber

ETA Prediction

Discovered features:

  • Route traffic patterns by time
  • Driver behavior features
  • Weather interactions

Fraud Detection

Discovered features:

  • Transaction velocity features
  • Device fingerprint aggregations
  • Cross-entity connections

Results

  • Reduced feature engineering time
  • Improved model performance
  • Hundreds of models using discovered features

Best Practices

  1. Start with a rich raw feature set
  2. Invest in feature computation infrastructure
  3. Keep humans in the loop to review what the system discovers
  4. Document discovered features
