Chapter 2.4

Power Laws in Search

Query frequency follows a power law: a small fraction of distinct queries accounts for a disproportionate share of traffic.

The Power Law Distribution

This curve (Zipf's Law) represents one of the most fundamental truths in search. The "Head" (red) represents safe, frequent queries. The "Tail" (green) is where the complexity, and often the high-value intent, lives.

[Figure: the query frequency curve, with Head, Torso, and Tail regions marked]
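To make the shape of the curve concrete, Zipf's law can be modeled by giving the query at rank r a frequency proportional to 1/r^s. The sketch below assumes an idealized Zipf distribution and uses a hypothetical helper name, `zipf_traffic_share`, to estimate how much traffic the top fraction of distinct queries would capture. Real query logs only approximate this curve, so treat the output as illustrative.

```python
def zipf_traffic_share(num_unique: int, top_fraction: float, exponent: float = 1.0) -> float:
    """Share of total traffic captured by the top `top_fraction` of distinct queries.

    Assumes an idealized Zipf model: the query at rank r has a frequency
    proportional to 1 / r**exponent.
    """
    if num_unique < 1:
        raise ValueError("num_unique must be at least 1")
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_unique + 1)]
    # Always include at least the single most frequent query.
    cutoff = max(1, round(num_unique * top_fraction))
    return sum(weights[:cutoff]) / sum(weights)


if __name__ == "__main__":
    share = zipf_traffic_share(num_unique=1000, top_fraction=0.01)
    print(f"Top 1% of distinct queries: {share:.1%} of traffic")
```

With 1,000 distinct queries and an exponent of 1, the top 1% (10 queries) captures roughly 39% of traffic under this model. The exact share depends on the exponent and on how many distinct queries exist.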

Adaptive Optimization Strategy

We can't treat all queries the same. In code, we apply different time-to-live (TTL) and processing depths based on query frequency.

optimization.py

def get_optimization_config(query: str, frequency: int) -> dict:
    """Pick cache and ranking settings based on how often a query is seen."""
    # HEAD queries (top 1% of unique queries)
    # Strategy: aggressive caching, pre-computed results
    if frequency > 100_000:
        return {
            "cache_ttl": 3600,  # Cache for 1 hour
            "enable_deep_learning_rerank": False,  # Too slow, use pre-compute
            "use_approximate_knn": False,  # Need exact top results
        }

    # TAIL queries (bottom 89% of unique queries)
    # Strategy: expensive compute, minimal caching
    if frequency < 100:
        return {
            "cache_ttl": 60,  # Cache for 1 min (rarely hit again)
            "enable_deep_learning_rerank": True,  # Need semantics to understand
            "use_approximate_knn": True,  # Approximate is fine for recall
            "query_expansion": "aggressive",  # Try hard to find matches
        }

    # TORSO queries (everything in between)
    return {"cache_ttl": 600, "enable_deep_learning_rerank": True}
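The function above takes a `frequency` argument but does not say where it comes from. One plausible source is a count of normalized queries over a recent log window. The sketch below is an assumption, not part of the original module: the helper name `build_frequency_table` and the normalization rules (lowercasing, collapsing whitespace) are illustrative only.

```python
from collections import Counter


def build_frequency_table(query_log: list[str]) -> Counter:
    """Count how often each normalized query appears in a log window."""
    # Lowercase and collapse whitespace so trivial variants share one count.
    normalized = (" ".join(raw.lower().split()) for raw in query_log)
    return Counter(normalized)


if __name__ == "__main__":
    log = ["iPhone", "iphone ", "  IPHONE", "toyota  corolla"]
    table = build_frequency_table(log)
    # A Counter returns 0 for queries it has never seen.
    print(table["iphone"], table["toyota corolla"], table["unseen query"])
```

A lookup such as `table[normalized_query]` could then be passed as the `frequency` argument. Unseen queries count as 0 and fall into the tail branch.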

Traffic Distribution

A breakdown of how a small percentage of distinct queries accounts for a disproportionate share of total search volume.

Segment          % of unique queries    % of total traffic
Head queries     1%                     30%
Torso queries    10%                    30%
Tail queries     89%                    40%
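A breakdown like the one above can be computed from per-query counts by ranking queries and cutting at the 1% and 10% marks. The sketch below assumes those cut points and uses a hypothetical function name, `segment_traffic_shares`. Production systems may draw the boundaries differently.

```python
def segment_traffic_shares(
    frequencies: list[int], head_pct: float = 0.01, torso_pct: float = 0.10
) -> dict[str, float]:
    """Split ranked query counts into head/torso/tail and return traffic shares."""
    ranked = sorted(frequencies, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("frequencies must contain at least one positive count")
    # Segment boundaries are defined on the number of distinct queries.
    head_end = max(1, round(len(ranked) * head_pct))
    torso_end = head_end + round(len(ranked) * torso_pct)
    return {
        "head": sum(ranked[:head_end]) / total,
        "torso": sum(ranked[head_end:torso_end]) / total,
        "tail": sum(ranked[torso_end:]) / total,
    }


if __name__ == "__main__":
    # Synthetic counts built to mirror the 30/30/40 split described above.
    counts = [300] + [30] * 10 + [5] * 44 + [4] * 45
    print(segment_traffic_shares(counts))
```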

The Scalability Paradox

Head Queries = CPU Problem

High QPS (Queries Per Second). Serving "iphone" 10,000 times/sec requires massive compute if not cached.

Cache Hit Rate: 99.9%
Latency Target: < 10ms

Tail Queries = IO Problem

Huge Index. Serving "1994 toyota corolla alternator bolt size" requires scanning massive indices on disk.

Cache Hit Rate: < 5%
Latency Target: < 200ms
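The arithmetic behind these hit rates is worth spelling out: only cache misses reach the search backend. The helper below is a hypothetical illustration (`backend_qps` is not from the original text) and assumes every miss becomes exactly one backend request.

```python
def backend_qps(incoming_qps: float, cache_hit_rate: float) -> float:
    """Requests per second that miss the cache and reach the backend."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return incoming_qps * (1.0 - cache_hit_rate)


if __name__ == "__main__":
    # Head: 10,000 QPS at a 99.9% hit rate leaves about 10 QPS for the backend.
    print(round(backend_qps(10_000, 0.999), 3))
    # Tail: at a 5% hit rate, about 95% of requests still hit the index.
    print(round(backend_qps(100, 0.05), 3))
```

At a 99.9% hit rate, 10,000 requests per second for a head query shrink to about 10 per second at the backend. At a hit rate under 5%, caching barely reduces tail load, which is why the tail needs a different approach.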

Strategies by Segment

Head Strategy

• Manual tuning & curation
• Heavy caching (long TTL, 1 hour in optimization.py)
• Dedicated A/B testing
• Query-specific rules
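The heavy caching item in the head list can be sketched as a small in-memory cache with a time-to-live. This is a minimal illustration under stated assumptions: the class name `TTLCache` and the injectable `clock` parameter are hypothetical, and a production system would more likely use a shared cache service with eviction limits.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # Injectable so expiry can be tested without sleeping.
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=3600)
    cache.put("iphone", ["result-1", "result-2"])
    print(cache.get("iphone"))
```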

Torso Strategy

• Category-level rules
• Template matching
• Click models (aggregated)
• Facet optimization

Tail Strategy

• Semantic/vector search
• Query relaxation
• Fallback strategies
• LLM rewriting
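Query relaxation from the tail list can be sketched as a loop that drops terms until something matches. The version below is deliberately naive and makes several assumptions: `search_with_relaxation` is a hypothetical name, the `search` callable stands in for whatever retrieval backend is in use, and terms are dropped from the end of the query. A real system would more likely drop the least informative term first.

```python
from typing import Callable


def search_with_relaxation(
    query: str, search: Callable[[str], list[str]], min_terms: int = 1
) -> tuple[str, list[str]]:
    """Retry a query with fewer terms until the backend returns results.

    Returns the query text that matched and its results, or ("", []) if
    nothing matched even at `min_terms` terms.
    """
    min_terms = max(1, min_terms)  # Never issue an empty query.
    terms = query.split()
    while len(terms) >= min_terms:
        candidate = " ".join(terms)
        results = search(candidate)
        if results:
            return candidate, results
        terms = terms[:-1]  # Naive relaxation: drop the last term.
    return "", []


if __name__ == "__main__":
    def fake_search(text: str) -> list[str]:
        # Stand-in backend that only knows one phrase.
        return ["doc-1"] if text == "1994 toyota corolla alternator" else []

    print(search_with_relaxation("1994 toyota corolla alternator bolt size", fake_search))
```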

Key Takeaways

Power Law

Search follows a power law (Zipf's Law). A few queries (Head) drive massive volume.

Head Strategy

Compute-bound (CPU). Optimize with aggressive caching and manual curation.

Tail Strategy

IO-bound (Disk). Optimize with semantic search and query expansion (recall).

Scalability Paradox

You can't use the same architecture for both. Adaptive pipelines are required.
