Chapter 2.6
Query Rewriting & Expansion
Users type imperfect queries. Rewriting and expansion bridge the gap between query and corpus.
The Expansion Trade-off
This graph visualizes the classic trade-off in query rewriting: as you expand queries more aggressively (moving left to right), you find more results (Recall increases), but the relevance of those results typically drops (Precision decreases).
[Chart: Precision vs. Recall by Expansion Level]
Expansion is a balancing act. The crossover point is where you maximize finding relevant items without flooding the user with junk.
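To make the trade-off concrete, here is a minimal sketch that computes precision and recall at three expansion levels. The document ids and the result lists are invented toy data, not measurements; only the two formulas are standard.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved ids against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: d1..d4 are the truly relevant documents,
# x1..x4 are irrelevant documents pulled in by broader expansion.
RELEVANT = {"d1", "d2", "d3", "d4"}
RESULTS_BY_LEVEL = {
    "no expansion": ["d1", "d2"],
    "synonyms": ["d1", "d2", "d3", "x1"],
    "related terms": ["d1", "d2", "d3", "d4", "x1", "x2", "x3", "x4"],
}

if __name__ == "__main__":
    for level, retrieved in RESULTS_BY_LEVEL.items():
        p, r = precision_recall(retrieved, RELEVANT)
        print(f"{level}: precision={p:.2f} recall={r:.2f}")
```

In this toy data, precision falls from 1.00 to 0.75 to 0.50 while recall rises from 0.50 to 0.75 to 1.00, which is the shape of the curve described above.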
Key Insight
The sweet spot is usually "Synonyms + High Confidence Semantic". Going broader often hurts user experience more than it helps.
Rewriting Logic
Query rewriting in code usually combines two techniques: simple synonym expansion and LLM-based rewriting.
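A minimal sketch of both techniques follows. The synonym dictionary is a made-up example, and `complete` stands in for whatever LLM client you use: it is assumed to be any callable that takes a prompt string and returns text, not a specific vendor API.

```python
from typing import Callable, Dict, List

# Hypothetical synonym dictionary; a real one would be curated or mined from logs.
SYNONYMS: Dict[str, List[str]] = {
    "sneakers": ["trainers", "running shoes"],
    "laptop": ["notebook"],
}


def expand_with_synonyms(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> List[str]:
    """Return the normalized query followed by one variant per synonym substitution."""
    tokens = query.lower().split()
    variants = [" ".join(tokens)]  # the original query always comes first
    for i, token in enumerate(tokens):
        for alt in synonyms.get(token, []):
            variant = " ".join(tokens[:i] + [alt] + tokens[i + 1:])
            if variant not in variants:
                variants.append(variant)
    return variants


def rewrite_with_llm(query: str, complete: Callable[[str], str]) -> str:
    """Ask a text-completion callable for a rewrite; fall back to the original query."""
    prompt = (
        "Rewrite this search query so it matches product catalog wording. "
        "Return only the rewritten query.\n"
        f"Query: {query}"
    )
    rewritten = complete(prompt).strip()
    return rewritten or query  # an empty answer means: keep the original
```

For example, `expand_with_synonyms("red sneakers")` returns `["red sneakers", "red trainers", "red running shoes"]`. Keeping the original query first makes it easy to rank unexpanded matches above expanded ones.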
The Expansion Spectrum
- High Precision: many zero results
- Synonyms only: safe expansion
- + Related terms: balanced
- High Recall: some noise
Query Rewriting
Structured Query Conversion
Input:
Output:
{
"text_query": "running shoes",
"filters": {
"color": "blue",
"brand": "Nike",
"size": "10",
"price": {"max": 100}
}
}
- Entity → Filter: extract entities and convert them to filters
- Template Matching: map common query patterns to structure
- LLM Rewriting: for natural language queries
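The entity and template approaches can be sketched with a small vocabulary and two regular expressions. Everything here is illustrative: the `COLORS` and `BRANDS` vocabularies, the two templates, and the sample query used below are assumptions chosen so that the result matches the JSON output shown above.

```python
import re
from typing import Any, Dict

# Hypothetical vocabularies; in practice these come from the product catalog.
COLORS = {"blue", "red", "black", "white"}
BRANDS = {"nike": "Nike", "adidas": "Adidas"}

PRICE_MAX = re.compile(r"\b(?:under|below)\s+\$?(\d+)", re.IGNORECASE)
SIZE = re.compile(r"\bsize\s+(\d+(?:\.\d+)?)\b", re.IGNORECASE)


def to_structured_query(query: str) -> Dict[str, Any]:
    """Split a free-text query into a residual text query plus structured filters."""
    filters: Dict[str, Any] = {}
    text = query

    # Template matching: "under $100" becomes an upper price bound.
    match = PRICE_MAX.search(text)
    if match:
        filters["price"] = {"max": int(match.group(1))}
        text = text[:match.start()] + text[match.end():]

    # Template matching: "size 10" becomes a size filter.
    match = SIZE.search(text)
    if match:
        filters["size"] = match.group(1)
        text = text[:match.start()] + text[match.end():]

    # Entity to filter: known colors and brands leave the text query.
    remaining = []
    for token in text.split():
        low = token.lower()
        if low in COLORS and "color" not in filters:
            filters["color"] = low
        elif low in BRANDS and "brand" not in filters:
            filters["brand"] = BRANDS[low]
        else:
            remaining.append(token)

    return {"text_query": " ".join(remaining), "filters": filters}
```

Calling `to_structured_query("blue Nike running shoes size 10 under $100")` yields a `text_query` of `"running shoes"` with color, brand, size, and price filters. Queries that match no template or vocabulary entry pass through unchanged, which is where an LLM rewriter would take over.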
Industry Case Studies
Amazon
Multi-layer approach:
1. Dictionary (60%, 98% precision)
2. Templates (20%, 95% precision)
3. ML Model (15%, 88% precision)
4. LLM fallback (5%, cached)
Key learnings:
- Over-expansion hurts more than under-expansion
- Show original results first
- Mark expanded results clearly
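A layered approach like the one listed above can be sketched as a cascade: try the cheap, high-precision layers first and only call the expensive fallback when none of them applies. This is an illustrative sketch, not Amazon's implementation; the class name, the layer interface (a callable returning a rewrite or `None`), and the in-memory cache are all assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A layer returns a rewritten query, or None when it does not apply.
Rewriter = Callable[[str], Optional[str]]


class LayeredRewriter:
    """Try cheap, high-precision layers in order; cache the expensive fallback."""

    def __init__(self, layers: List[Tuple[str, Rewriter]], fallback: Callable[[str], str]):
        self.layers = layers          # e.g. dictionary, templates, ML model
        self.fallback = fallback      # e.g. an LLM call (hypothetical)
        self._cache: Dict[str, str] = {}

    def rewrite(self, query: str) -> Tuple[str, str]:
        """Return (rewritten_query, name_of_the_layer_that_handled_it)."""
        for name, layer in self.layers:
            result = layer(query)
            if result is not None:
                return result, name
        # No layer applied: use the fallback, calling it at most once per query.
        if query not in self._cache:
            self._cache[query] = self.fallback(query)
        return self._cache[query], "fallback"
```

Returning the layer name alongside the rewrite makes it possible to log how much traffic each layer handles and to mark expanded results in the interface.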
Spotify
Mood expansion:
- "sad" → audio features
- Personalized by history
- Genre diversification
Adaptive Expansion
Adjust expansion based on result count:
- Results: Many → don't expand (preserve precision)
- Results: Good → light expansion (synonyms only)
- Results: Zero → heavy expansion + relax constraints
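The rule above can be sketched as a small decision function plus a search wrapper. The threshold of 50 for "many" results is an arbitrary placeholder, and `search` and the `expanders` mapping are hypothetical callables supplied by the caller.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Expansion(Enum):
    NONE = "none"    # many results: preserve precision
    LIGHT = "light"  # some results: synonyms only
    HEAVY = "heavy"  # zero results: expand and relax constraints


def choose_expansion(result_count: int, many_threshold: int = 50) -> Expansion:
    """Pick an expansion level from the number of results for the original query."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    if result_count == 0:
        return Expansion.HEAVY
    if result_count >= many_threshold:
        return Expansion.NONE
    return Expansion.LIGHT


def search_adaptively(
    query: str,
    search: Callable[[str], List[str]],
    expanders: Dict[Expansion, Callable[[str], str]],
    many_threshold: int = 50,
) -> Tuple[List[str], Expansion]:
    """Run the original query, then expand only as much as the result count calls for."""
    original = search(query)
    level = choose_expansion(len(original), many_threshold)
    if level is Expansion.NONE:
        return original, level
    expanded = search(expanders[level](query))
    # Original hits stay first; expanded hits are appended without duplicates.
    merged = original + [doc for doc in expanded if doc not in original]
    return merged, level
```

Keeping the original hits at the top of the merged list follows the "show original results first" learning from the case study.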
Key Takeaways
Trade-off
Expansion increases recall (finding more) but hurts precision (relevance).
Structured Rewriting
Convert unstructured text to structured filters (e.g., "under $100" → price < 100).
Constraints
Never expand named entities (such as brands) or negations. Changing what the user explicitly asked for destroys trust.
Adaptive Strategy
Expand aggressively only when result counts are low, above all when a query returns zero results.