MLSys Case Studies
Why Your $130K ML Pipeline Is Starving 65 Percent of New Merchants [Edition #11]
$9,000 Monthly Vector Index That Failed on 2 Million Documents [Edition #10]
12M Dollars Lost to an AUC Metric That Ignored Probability Calibration [Edition #9]
$4.2M Lost Because of a 48-hour Labeling Loop [Edition #8]
A $1.1M Generative Recommender That Collapsed Into a 2000 Video Loop [Edition #7]
The $22K Neural Search Pipeline That Was Silently 7 Days Behind [Edition #6]
$220K Lost to a Fraud Model That Passed a 0.82 Accuracy Check [Edition #5]
A $27K/Month Ranking System That Silently Buried 45,000 New Listings Daily [Edition #4]
The $5800 FAISS Index That Was Stale for 168 Hours Straight [Edition #3]
800ms Latency Spikes From A $45K Redis Cluster That Looked Healthy [Edition #2]
VectoScale Is Paying $237k/Month to Hide a Bad Architectural Decision [Edition #1]
TokenMixer-Large: Scaling Ranking Models
Embedding Features in Weights to Kill Retrieval Latency
A Blueprint for Scaling Recommender Systems
Decoupling Compute from Sequence Length in CTR Scaling
LinkedIn Semantic Search
Deep Neural Networks for YouTube Recommendations
LinkedIn's MixLM: 10x Faster LLM Ranking via Embedding Injection
xAI Recommendation System Deep Dive
Meta's GEM: Bringing LLM-Scale Architectures to Ads Recommendation
Engineering Airbnb's Embedding-Based Retrieval System
vLLM @ LinkedIn
Deep dive into "Memory for LLMs" architectures
Pinterest recommendation system evolutions through the years
Long sequence for recommendation systems
How LinkedIn built its GenAI platform
Compound AI systems
Near real-time personalization at LinkedIn
TikTok Real Time Recommendation algorithm scales to billions
Uber optimal feature discovery
Netflix ML platform
Reddit's ML Model Deployment and Serving Architecture
Meta AI platform
DoorDash monitoring
Uber model deployment