Chapter 0.5

Real-World Search vs LeetCode

Why production search is nothing like the problems you solved in interviews.

The Five Dimensions of Difference

Dimension 1: Data

Aspect	LeetCode / Academia	Production
Size	Fits in memory	Petabytes across clusters
Format	Clean JSON/array	Dirty HTML, PDFs, inconsistent formats
Updates	Static dataset	Real-time streaming, thousands of updates/sec
Quality	Perfect	Missing fields, duplicates, spam
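
To make the "Quality" row concrete, here is a minimal sketch of the kind of cleaning pass a production ingest step might run before indexing. The schema (REQUIRED_FIELDS) and the function name clean_records are hypothetical, chosen only for illustration; real pipelines typically add near-duplicate detection and spam filtering on top.

```python
import hashlib
from typing import Any, Dict, Iterable, List

# Hypothetical schema, assumed for this example only.
REQUIRED_FIELDS = ("id", "title", "body")


def _is_blank(value: Any) -> bool:
    """True for None and for values that are empty after stripping whitespace."""
    return value is None or not str(value).strip()


def clean_records(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop records with missing fields, then drop exact duplicates by body text."""
    seen = set()
    cleaned = []
    for record in records:
        # Missing or blank required field: skip the record.
        if any(_is_blank(record.get(field)) for field in REQUIRED_FIELDS):
            continue
        # Normalize whitespace and case so trivially different copies collide.
        body = " ".join(str(record["body"]).split()).lower()
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        cleaned.append(record)
    return cleaned
```

The first copy of a document wins here; which copy to keep is a policy decision that this sketch does not try to settle.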

Dimension 2: Correctness

Dimension 3: Latency

Real constraint: You have 50ms total for the entire query, retrieval and ranking included.

Your fancy BERT reranker that takes 200ms? Unusable without distillation/caching.
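
One way to picture the caching half of that trade-off is a reranker that only scores as many documents as the budget allows. The sketch below is an illustration under assumptions, not a reference implementation: rerank_within_budget, cost_per_doc_ms, and the in-memory dict cache are all hypothetical, and the per-document cost is treated as a pre-measured average rather than timed live.

```python
from typing import Callable, Dict, List, Tuple


def rerank_within_budget(
    candidates: List[str],
    score_fn: Callable[[str], float],
    cache: Dict[str, float],
    budget_ms: float,
    cost_per_doc_ms: float,
) -> Tuple[List[str], float]:
    """Rerank the head of the candidate list without exceeding budget_ms.

    Cached scores are treated as free. Each cache miss is charged
    cost_per_doc_ms (an assumed average). Scoring stops at the first miss
    that does not fit; that document and everything after it keep their
    first-stage order behind the reranked head.
    """
    spent = 0.0
    scored: List[Tuple[float, str]] = []
    cutoff = len(candidates)
    for i, doc in enumerate(candidates):
        if doc not in cache:
            if spent + cost_per_doc_ms > budget_ms:
                cutoff = i  # out of budget: stop reranking here
                break
            cache[doc] = score_fn(doc)
            spent += cost_per_doc_ms
        scored.append((cache[doc], doc))
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored] + candidates[cutoff:], spent
```

On a cold cache only a few documents get reranked; as the cache warms up for popular queries, the same budget covers more of the list.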

Dimension 4: Failure Modes

Aspect	Academic	Production
Node failure	Doesn't happen	Daily occurrence
Partial results	Not possible	"4 of 5 shards, return best effort"
Degradation	Binary (works/broken)	Graceful (disable features under load)
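
The "4 of 5 shards, return best effort" row can be sketched as a scatter-gather call that merges whatever came back in time. This is a simplified illustration: search_all_shards and the shard callables are hypothetical stand-ins for real RPC clients, and a real system would also report the partial-result status to the caller and to monitoring.

```python
import concurrent.futures as cf
import heapq
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, str]  # (score, doc_id)


def search_all_shards(
    shards: Sequence[Callable[[str], List[Hit]]],
    query: str,
    k: int,
    timeout_s: float,
) -> Tuple[List[Hit], int, int]:
    """Query every shard in parallel and merge what arrives within timeout_s.

    Returns (top_k_hits, shards_answered, shards_total). Shards that raise
    or miss the deadline are skipped instead of failing the whole query.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(shards)))
    futures = [pool.submit(shard, query) for shard in shards]
    done, _not_done = cf.wait(futures, timeout=timeout_s)
    # Do not block on stragglers; their results are simply dropped.
    pool.shutdown(wait=False)

    hits: List[Hit] = []
    answered = 0
    for future in done:
        if future.exception() is None:
            hits.extend(future.result())
            answered += 1
    # Merge per-shard results into a global top-k by score.
    return heapq.nlargest(k, hits), answered, len(shards)
```

Returning the answered/total counts alongside the hits lets the caller decide whether a partial answer is acceptable for this query.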

Dimension 5: The Feedback Loop

Academic

Query → Retrieve → Rank → Evaluate (end)

Production

Query → Retrieve → Rank → Serve → User interacts → Log → Retrain → Deploy → Repeat

This loop is the moat. Teams with good feedback loops improve continuously.
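
As a rough sketch of the "Log → Retrain" step, the snippet below turns a log of user interactions into per-(query, document) click-through rates that a later training job could consume. The Interaction record and click_through_rates are hypothetical names for illustration; raw click-through rate is a deliberately naive signal, and production systems usually correct for effects such as position bias.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class Interaction:
    """One logged impression (hypothetical schema for this example)."""
    query: str
    doc_id: str
    position: int
    clicked: bool


def click_through_rates(
    log: Iterable[Interaction], min_impressions: int = 1
) -> Dict[Tuple[str, str], float]:
    """Aggregate clicks / impressions for each (query, doc_id) pair."""
    shown: Dict[Tuple[str, str], int] = defaultdict(int)
    clicks: Dict[Tuple[str, str], int] = defaultdict(int)
    for event in log:
        key = (event.query, event.doc_id)
        shown[key] += 1
        clicks[key] += int(event.clicked)
    # Ignore pairs with too few impressions to be meaningful.
    return {
        key: clicks[key] / shown[key]
        for key in shown
        if shown[key] >= min_impressions
    }
```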

LeetCode Mindset

"Find the optimal solution."

Production Mindset

"Find a good-enough solution that works 99.9% of the time, degrades gracefully the other 0.1%, costs $X/month, and can be improved incrementally."

This guide teaches the second mindset.