Prediction Models — CTR, CVR, and Beyond

Understanding what we predict, why it matters, and how to build effective models.

What We're Predicting and Why

Click-Through Rate (CTR)

The probability a user will click an ad. Critical for:

  • Ranking: At a given bid, higher-pCTR ads should rank higher
  • Pricing: CPC auctions need pCTR to compute expected value per impression
  • Quality: Persistently low CTR signals poor relevance
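The pricing point above is just arithmetic: in a CPC auction, each candidate's expected revenue per impression is its bid times its predicted CTR (often reported as eCPM). A minimal sketch, with made-up ad IDs, bids, and pCTR values:

```python
# Hypothetical ad candidates: (ad_id, bid in $ per click, predicted CTR).
ads = [
    ("ad_a", 2.00, 0.010),
    ("ad_b", 0.50, 0.050),
    ("ad_c", 1.00, 0.020),
]

def ecpm(bid, pctr):
    # Expected revenue per 1000 impressions: bid * pCTR * 1000.
    return bid * pctr * 1000.0

# Rank candidates by expected value per impression, highest first.
ranked = sorted(ads, key=lambda ad: ecpm(ad[1], ad[2]), reverse=True)
```

Note that the highest bidder does not necessarily win: here `ad_b` bids the least but ranks first because its pCTR makes its expected value per impression the largest.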

Conversion Rate (CVR)

The probability a click will result in a conversion. Important for:

  • CPA campaigns: Directly affects advertiser ROI
  • Ranking: For conversion-optimized campaigns
  • Budget efficiency: Accurate pCVR steers spend toward impressions likely to convert
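One concrete way pCVR is used: converting an advertiser's target cost-per-acquisition into an effective per-click bid. A sketch, with illustrative numbers:

```python
def effective_cpc_bid(target_cpa, pcvr):
    # Expected cost per conversion = bid_per_click / pCVR, so bidding
    # target_cpa * pCVR per click hits the CPA target in expectation.
    return target_cpa * pcvr

# An advertiser willing to pay $50 per conversion, with pCVR = 4% for
# this click opportunity, yields an effective bid of $2 per click.
bid = effective_cpc_bid(target_cpa=50.0, pcvr=0.04)
```

This is why CVR miscalibration is costly: an overestimated pCVR produces overbids and blows through the advertiser's CPA target.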

Beyond CTR and CVR

  • Engagement: Time spent, video views, interactions
  • Quality: User satisfaction, ad relevance
  • Long-term value: Lifetime value, retention

Feature Engineering: User, Context, Ad, and Cross Features

User Features

  • Demographics (age, gender, location)
  • Historical behavior (past clicks, purchases, interests)
  • Device and browser information
  • Time-based patterns (time of day, day of week)

Context Features

  • Page content and category
  • Time and date
  • Geographic context
  • Device type and capabilities

Ad Features

  • Creative attributes (image, text, format)
  • Advertiser information
  • Historical performance (CTR, CVR for similar users)
  • Targeting settings

Cross Features

  • User-ad interactions (has user seen this ad before?)
  • User-advertiser history (past interactions with this advertiser)
  • User-category affinity (user's interest in ad category)
  • Contextual matching (ad relevance to page content)

In practice, feature engineering often improves prediction quality more than changes to model architecture: a well-designed cross feature can expose a signal that extra model capacity alone will not recover.

Model Architectures: From Logistic Regression to Deep Learning

Logistic Regression

Simple, interpretable baseline. Good for:

  • Understanding feature importance
  • Fast inference
  • Sparse feature spaces
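A from-scratch sketch of logistic regression on sparse features, trained with SGD on log loss. The toy data and dimensions are invented; production trainers add regularization, hashing, and streaming updates:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_lr(examples, dim, lr=0.1, epochs=50):
    # examples: list of (sparse feature dict {index: value}, label) pairs.
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for feats, y in examples:
            p = sigmoid(b + sum(w[i] * v for i, v in feats.items()))
            g = p - y  # gradient of log loss w.r.t. the logit
            b -= lr * g
            for i, v in feats.items():
                w[i] -= lr * g * v  # only touched features are updated
    return w, b

# Toy data: feature 0 drives clicks, feature 1 does not.
data = [({0: 1.0}, 1), ({1: 1.0}, 0), ({0: 1.0, 1: 1.0}, 1), ({1: 1.0}, 0)]
w, b = train_lr(data, dim=2)
p_click = sigmoid(b + w[0])  # predicted CTR when only feature 0 fires
```

The sparse update loop is why this baseline scales: cost per example is proportional to the number of active features, not the total dimension.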

Gradient Boosting (XGBoost, LightGBM)

Strong performance on tabular data:

  • Handles non-linear interactions
  • Feature importance insights
  • Fast training and inference
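The core boosting idea can be shown in a toy sketch: repeatedly fit a decision stump to the current residuals and add it with a learning rate. Real libraries (XGBoost, LightGBM) do this with log loss, regularization, and histogram-based tree growth rather than this squared-error illustration:

```python
def best_stump(xs, residuals):
    # Find the 1-D threshold split minimizing squared error on residuals.
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def boost(xs, ys, rounds=20, lr=0.3):
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = best_stump(xs, residuals)  # each round fits the residuals
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy 1-D data where large feature values correspond to clicks.
model = boost([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```

The ensemble converges toward the per-region means, which is where the "handles non-linear interactions" property comes from: each stump carves the feature space.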

Deep Learning Models

Wide & Deep

  • Wide component: Memorizes specific feature interactions via cross-product features
  • Deep component: Generalizes to unseen feature combinations via embeddings
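The two paths can be sketched as a single forward pass: a linear layer over wide (cross) features plus an MLP over dense inputs, summed before the sigmoid. Shapes and random weights here are illustrative, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

wide_dim, deep_in, hidden = 8, 16, 32      # illustrative dimensions
W_wide = rng.normal(size=wide_dim)         # linear weights over crosses
W1 = rng.normal(size=(deep_in, hidden))    # deep tower, layer 1
W2 = rng.normal(size=hidden)               # deep tower, output

def wide_and_deep(x_wide, x_deep):
    wide_logit = x_wide @ W_wide           # memorization path
    deep_logit = relu(x_deep @ W1) @ W2    # generalization path
    return sigmoid(wide_logit + deep_logit)  # jointly trained in practice

p = wide_and_deep(rng.normal(size=wide_dim), rng.normal(size=deep_in))
```

The key design point is that both logits are summed before the sigmoid, so the two components are trained jointly against the same label.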

DeepFM

Factorization machines with deep learning:

  • Captures low and high-order feature interactions
  • Efficient for sparse features
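The efficiency claim comes from the FM identity: the sum of all pairwise interactions ⟨v_i, v_j⟩ x_i x_j equals ½[(Σ_i v_i x_i)² − Σ_i (v_i x_i)²], computable in O(k·n) instead of O(k·n²). A sketch with random embeddings, checked against the brute-force pairwise sum:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 4                      # n features, k-dimensional embeddings
V = rng.normal(size=(n, k))      # one embedding row per feature
x = rng.normal(size=n)           # feature values

def fm_second_order(V, x):
    # O(k*n) evaluation of sum_{i<j} <v_i, v_j> x_i x_j.
    vx = V * x[:, None]          # per-feature embedding scaled by value
    return 0.5 * ((vx.sum(axis=0) ** 2).sum() - (vx ** 2).sum())

# Brute-force check over explicit pairs.
brute = sum(V[i] @ V[j] * x[i] * x[j]
            for i in range(n) for j in range(i + 1, n))
```

In DeepFM this second-order term shares its embeddings with the deep component, so low- and high-order interactions are learned from one embedding table.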

Transformer-based Models

For sequential and contextual understanding:

  • User behavior sequences
  • Ad creative understanding
  • Cross-modal features
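The sequence-modeling building block is self-attention: each past interaction attends to the others, producing contextualized representations of the user's behavior. A single-head sketch with illustrative dimensions and random weights:

```python
import numpy as np

rng = np.random.default_rng(2)
seq_len, d = 5, 8                       # 5 past ad interactions, dim 8
X = rng.normal(size=(seq_len, d))       # embedded behavior sequence
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(X):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)       # scaled dot-product scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V                  # contextualized representations

H = self_attention(X)                   # shape: (seq_len, d)
```

A CTR model would pool `H` (e.g. attention-pool against the candidate ad's embedding) into a fixed-size user-interest vector.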

Calibration: Why It Matters More Than Accuracy

The Problem

Models can have good ranking (AUC) but poor calibration. For auctions, we need accurate probability estimates, not just relative rankings.

Why Calibration Matters

  • Auction pricing: Incorrect probabilities lead to wrong prices
  • Budget planning: Advertisers need accurate conversion estimates
  • Revenue optimization: Platform needs accurate expected value

Calibration Techniques

  • Platt scaling: Logistic regression on model outputs
  • Isotonic regression: Non-parametric calibration
  • Temperature scaling: Single parameter adjustment
  • Calibrated training: Incorporate calibration into loss function
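Platt scaling is the simplest of these: fit p = sigmoid(a·s + b) on held-out (score, label) pairs by gradient descent on log loss. A sketch with a toy held-out set where raw scores run hot:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def platt_fit(scores, labels, lr=0.1, epochs=500):
    # Fit the two Platt parameters (a, b) by SGD on log loss.
    a, b = 1.0, 0.0
    for _ in range(epochs):
        for s, y in zip(scores, labels):
            g = sigmoid(a * s + b) - y   # gradient w.r.t. the logit
            a -= lr * g * s
            b -= lr * g
    return a, b

# Toy held-out data: raw scores are systematically overconfident.
scores = [0.9, 0.8, 0.85, 0.2, 0.1, 0.15]
labels = [1, 0, 1, 0, 0, 0]
a, b = platt_fit(scores, labels)
calibrated = sigmoid(a * 0.8 + b)       # remapped probability for s = 0.8
```

Because the mapping is monotone in the score, ranking metrics like AUC are unchanged; only the probability values move, which is exactly what the auction needs.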

Multi-Task Learning: Clicks, Conversions, and Engagement Together

Benefits

  • Shared representations: Learn common patterns across tasks
  • Data efficiency: Leverage signals from related tasks
  • Consistency: Predictions align across tasks

Architecture

  • Shared bottom: Common feature processing
  • Task-specific towers: Separate heads for CTR, CVR, engagement
  • Loss weighting: Balance importance of different tasks
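The shared-bottom architecture above can be sketched as one forward pass: a common representation feeding separate CTR and CVR heads. Shapes and random weights are illustrative; each tower is reduced to a single layer:

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_shared = 16, 8               # illustrative dimensions

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W_shared = rng.normal(size=(d_in, d_shared))  # shared bottom
W_ctr = rng.normal(size=d_shared)             # CTR tower head
W_cvr = rng.normal(size=d_shared)             # CVR tower head

def predict(x):
    h = np.maximum(x @ W_shared, 0.0)         # shared representation (ReLU)
    return sigmoid(h @ W_ctr), sigmoid(h @ W_cvr)

p_ctr, p_cvr = predict(rng.normal(size=d_in))

# Training would minimize a weighted sum of the two log losses, e.g.
#   loss = w_ctr * logloss_ctr + w_cvr * logloss_cvr
```

Gradients from both tasks flow into `W_shared`, which is where the shared-representation and data-efficiency benefits come from, and also where the task-imbalance challenge below bites.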

Challenges

  • Task imbalance: Clicks are much more common than conversions
  • Label quality: Different tasks have different label reliability
  • Optimization: Balancing multiple objectives

Content to be expanded...