Systems Atlas
Chapter 5.4: Retrieval Architecture

Filters, Facets & Constraints

Retrieval finds candidates. Filters narrow them down. Facets provide navigation.

This chapter covers the data structures and algorithms that power the "refine your search" experience. Without efficient filtering, every search would require scoring millions of documents.

Filter Speed

10ms

Filter 1M docs to 5K candidates using cached bitsets.

Facet Counts

~100

Typical facet cardinality for navigable categories.

Cache Hit Rate

95%+

Filter cache hit rate in production systems.

1. The Fundamental Split

Before diving into implementation, understand the three distinct concepts. They look similar in a UI but have very different performance characteristics. A well-designed search experience must handle all three correctly, or risk showing incorrect counts, slow responses, or confusing dead-ends.

Many junior engineers conflate these terms, leading to bugs like "why do facet counts not match results?" or "why is my filter so slow?" Each serves a distinct purpose: Filters exclude documents, Facets count values across results, and Constraints are the user-facing specifications that drive both.

Filters

Binary exclusion: "Is this document in or out?" Affects which documents appear.

Example
in_stock: true
Cacheable · No Score
Facets

Counts across results: "How many documents have each value?" Powers the sidebar counts.

Example
Brand: Apple (50), Samsung (30)
CPU Intensive · Uses DocValues
Constraints

The specification that drives both: the actual parameter like "price < 100".

Example
price: [0 TO 100]
User Input · Drives Both

2. Core Data Structures

Filters and facets rely on different data structures optimized for their specific access patterns. Understanding these is key to performance tuning.

2.1 Inverted Index for Filtering

The inverted index maps terms to document IDs. Filtering is fundamentally set intersection. If you haven't already, review Chapter 3.2 for how posting lists work.

"Blue" → [DocID: 1, 5, 12, 99, 103, ...]
"Red" → [DocID: 2, 5, 14, 99, ...]
Query: Blue AND Red
Intersect → [5, 99]
Example: E-commerce Filter

User selects "Blue" + "Size: Large" + "In Stock". The engine retrieves posting lists for each term and intersects them. With 1M documents, if each filter has 100K matches, the intersection might yield only 5K candidates.
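Because posting lists are stored sorted by docID, the intersection is a linear two-pointer walk. A minimal sketch (the function name is illustrative; real engines also use skip lists and galloping search to jump ahead in long lists):

```python
def intersect(a, b):
    """Intersect two sorted posting lists with a two-pointer walk: O(len(a) + len(b))."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])  # docID present in both lists
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1            # advance the pointer that lags behind
        else:
            j += 1
    return out

blue = [1, 5, 12, 99, 103]
red = [2, 5, 14, 99]
print(intersect(blue, red))  # [5, 99]
```

For multi-filter queries, intersecting the smallest lists first keeps the intermediate result as small as possible.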

2.2 Roaring Bitmaps

The industry standard for compressed bitsets used in filter caching. When you apply a filter like in_stock: true, the result is a set of matching document IDs—potentially millions. Storing this naively (one bit per doc) wastes memory for sparse results.

Roaring Bitmaps solve this by dividing the 32-bit docID space into 65,536 "containers" of 65,536 IDs each. Each container independently chooses the most efficient representation based on its density. This makes Roaring 10-100× faster than naive bitsets for typical search workloads where result sets are sparse.

Array Container
Used when ≤ 4096 values
[12, 45, 78, 234, 567]
Sorted u16 array
Bitmap Container
Used when > 4096 values
1001001010110101...
Fixed 8KB bitmap
Run Container
Consecutive sequences
[100-500, 1000-1200]
Run-length encoded
Why Roaring Wins
  • Adapts to data density automatically per container
  • Bitwise AND/OR operations work across container types
  • 10-100× faster than naive bitsets for sparse data
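The array-vs-bitmap choice falls out of simple size arithmetic: a sorted u16 array costs 2 bytes per value, while the fixed bitmap costs 8 KB regardless of density, so they break even at exactly 4096 values. An illustrative sketch (ignoring run containers, which a real Roaring implementation also considers):

```python
def choose_container(values_in_container):
    """Pick the cheaper representation for one 65,536-ID Roaring container.

    Array container: sorted u16 array, 2 bytes per value.
    Bitmap container: fixed 65,536-bit bitmap = 8192 bytes.
    """
    ARRAY_LIMIT = 4096  # above this, 2 bytes/value exceeds the 8 KB bitmap
    n = len(values_in_container)
    if n <= ARRAY_LIMIT:
        return ("array", 2 * n)
    return ("bitmap", 8192)

print(choose_container(range(100)))    # ('array', 200) — sparse, array wins
print(choose_container(range(10000)))  # ('bitmap', 8192) — dense, bitmap wins
```

Because each container decides independently, one filter's bitset can mix dense bitmap containers (e.g. "in_stock") with sparse array containers (e.g. a rare brand) in the same structure.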

2.3 BKD Trees for Numeric Constraints

Since Lucene 6.0, numeric range queries use BKD trees (Block K-Dimensional trees). They're O(log n) for range queries vs O(n) for scanning posting lists. See Chapter 3.3 for the full deep dive.

Query Execution: price: [100 TO 500]
Node entirely inside range: Accept all points (fast)
Node entirely outside range: Skip subtree (fast)
Node crosses boundary: Recurse deeper (slower)
Performance Tip

Index-sort documents by the most-queried numeric field. This makes data physically contiguous, leading to better cache locality and fewer disk seeks.
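The three cases above can be sketched on a toy 1-D tree where every node tracks the min/max of its subtree (a simplified stand-in for a real BKD tree, which packs points into leaf blocks on disk):

```python
def build(points):
    """Build a toy 1-D BKD-like tree: leaves hold up to 2 sorted points."""
    points = sorted(points)
    if len(points) <= 2:
        return {"min": points[0], "max": points[-1], "points": points}
    mid = len(points) // 2
    left, right = build(points[:mid]), build(points[mid:])
    return {"min": left["min"], "max": right["max"],
            "count": len(points), "left": left, "right": right}

def range_count(node, lo, hi):
    """Count points in [lo, hi], using the three BKD cases."""
    if node["max"] < lo or node["min"] > hi:
        return 0                                  # entirely outside: skip subtree
    if "points" in node:
        return sum(lo <= p <= hi for p in node["points"])
    if lo <= node["min"] and node["max"] <= hi:
        return node["count"]                      # entirely inside: accept whole block
    # crosses boundary: recurse deeper
    return range_count(node["left"], lo, hi) + range_count(node["right"], lo, hi)

tree = build([50, 120, 180, 300, 450, 600, 777])
print(range_count(tree, 100, 500))  # 4  (120, 180, 300, 450)
```

The "entirely inside" case is what makes wide ranges cheap: a whole subtree is accepted by reading one count instead of every point.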

2.4 Doc Values (Column Store) for Faceting

The inverted index answers: "Which docs have term X?" Faceting needs the opposite: "What terms does doc X have?" Doc Values provide column-oriented storage for this pattern. Review Chapter 3.3 for implementation details.

Inverted Index
"Apple" → [Doc1, Doc3, Doc5]
"Samsung" → [Doc2, Doc4]
Best for: "Which docs contain 'Apple'?"
Doc Values (Column Store)
Doc1 → "Apple"
Doc2 → "Samsung"
Doc3 → "Apple"
Best for: "What brands exist in these 10K docs?"
// Memory Model
JVM Heap (GC-managed) ← NOT used by DocValues
Disk (.dvd files) ←→ OS Filesystem Cache ←→ CPU
↑ mmap'd, no GC pauses
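The two access patterns above can be sketched side by side from the same documents (plain dicts standing in for the on-disk structures):

```python
docs = {1: "Apple", 2: "Samsung", 3: "Apple"}

# Inverted index: term -> sorted doc IDs. Answers "which docs contain 'Apple'?"
inverted = {}
for doc_id, brand in docs.items():
    inverted.setdefault(brand, []).append(doc_id)

# Doc values: doc ID -> term (one column, addressable by docID).
# Answers "what brand does doc 3 have?" without scanning every term.
doc_values = dict(docs)

print(inverted["Apple"])  # [1, 3]
print(doc_values[3])      # 'Apple'
```

Faceting over a 10K-doc result set does 10K docID-keyed lookups into the column; doing the same via the inverted index would mean scanning every term's posting list to see which docs it hits.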

3. Filter vs Query Context

In Elasticsearch/OpenSearch, clauses can run in "query" context (affects score) or "filter" context (binary, cacheable). This distinction is critical for performance—the same clause can be 10× slower in the wrong context.

Query context: when a clause participates in scoring, it must compute a relevance score for each matching document. This involves reading term frequencies and document lengths, and applying BM25 calculations.

Filter context: a binary yes/no decision. No scoring; results are cached as Roaring Bitmaps for reuse.

filter_vs_query.json
{
  "query": {
    "bool": {
      "must": [                                      // Affects score
        { "match": { "title": "laptop" } }
      ],
      "filter": [                                    // Binary, cached
        { "term": { "in_stock": true } },
        { "range": { "price": { "lte": 1000 } } }
      ]
    }
  }
}
Execution Order
  1. Filters run first (fast, cached)
  2. Scoring runs on filtered subset
  3. Result: smaller candidate set for expensive scoring
Filter Cache
Key: hash(field, value, segment_id)
Value: RoaringBitmap [1, 4, 5, 7, ...]
Per-segment, LRU eviction
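A minimal sketch of this per-segment LRU cache, using plain Python sets as stand-ins for Roaring bitmaps (the class and method names are illustrative, not Lucene's actual API):

```python
from collections import OrderedDict

class FilterCache:
    """Per-segment filter cache: key = (field, value, segment_id), LRU eviction."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.cache = OrderedDict()  # insertion order doubles as recency order

    def get_or_compute(self, field, value, segment_id, compute):
        key = (field, value, segment_id)
        if key in self.cache:
            self.cache.move_to_end(key)      # cache hit: mark as recently used
            return self.cache[key]
        bitset = compute()                   # cache miss: run the filter once
        self.cache[key] = bitset
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used entry
        return bitset

cache = FilterCache()
hits = cache.get_or_compute("in_stock", True, "seg_0", lambda: {1, 4, 5, 7})
```

Keying by segment is what makes the cache safe: segments are immutable, so a cached bitset never goes stale; it is simply dropped when its segment is merged away.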

4. Faceted Navigation Architecture

Faceted navigation is the "refine your search" sidebar you see on e-commerce sites. It shows counts for each value, allowing users to drill down without hitting dead-ends. This is the feature that makes or breaks the search UX on sites like Amazon, Airbnb, or Netflix.

Computing facet counts requires reading every document in the result set and aggregating field values. This is CPU-intensive, which is why we use Doc Values (columnar storage) instead of the inverted index. The inverted index answers "which docs have term X?" but faceting needs "what values does doc X have?"—the inverse question.

What the User Sees
Category
Electronics (1,234)
Clothing (567)
Books (890)
Brand
Apple (100)
Samsung (80)
Sony (45)
Price
$0-50 (200)
$50-100 (350)
$100+ (884)

Computing Facet Counts

facet_count.py
from collections import defaultdict

def compute_facet(result_bitset, field, doc_values):
    """Count values across the result set, most frequent first."""
    counts = defaultdict(int)
    for doc_id in result_bitset:
        value = doc_values.get(doc_id, field)  # Column lookup
        counts[value] += 1
    return sorted(counts.items(), key=lambda x: -x[1])

5. The Post-Filter Pattern

The hardest part of faceted navigation is handling the interaction between selected facets and displayed counts. This is where most implementations go wrong—either showing incorrect counts or creating dead-end selections.

Consider: when the user selects "Color: Red", what should the color facet show? If you filter first, you'd only see "Red (50)", and the user can't see alternatives! The correct behavior is:

  • Results: Only red products (filtered by selection)
  • Color facet: Still shows Blue (40), Green (20) so user can switch their selection
  • Other facets: Only count within red products (combined filters apply)
The Solution: post_filter

post_filter applies after aggregations run, separating the "hits" from the "counts".

post_filter.json
{
  "query": { "match": { "title": "shoes" } },
  "post_filter": { "term": { "color": "red" } },  // Filters hits ONLY
  "aggs": {
    "colors": { "terms": { "field": "color" } }   // Runs on ALL shoes
  }
}
  1. Query runs: gets all shoes
  2. Aggs run: counts all colors in shoes
  3. Post-filter runs: filters hits to only red
Case Study: Nike.com Shoe Filter

User searches "running shoes" and selects Size: 10 + Color: Black. What should the facets show?

  • Size facet: Shows sizes available in black running shoes (not filtered by size)
  • Color facet: Shows colors available in size 10 running shoes (not filtered by color)
  • Brand facet: Shows brands available in size 10 + black running shoes (filtered by both)
Expected Facet Response
Size (ignore own filter)
8 (12) · 9 (18) · 10 (15) · 11 (8)
Color (ignore own filter)
Black (15) · White (22) · Blue (9)
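The "ignore own filter" rule can be sketched in plain Python: each facet is counted over documents matching all the *other* selections. (A naive per-facet recount for clarity; a real engine reuses cached filter bitsets and filtered aggregations instead. Function and field names are illustrative.)

```python
def multi_select_facets(docs, selections):
    """For each selected facet field, count values over docs matching
    every *other* selection, so users can still switch within a facet."""
    results = {}
    for facet_field in selections:
        # Drop this facet's own filter; keep all the others.
        others = {f: v for f, v in selections.items() if f != facet_field}
        counts = {}
        for doc in docs:
            if all(doc.get(f) == v for f, v in others.items()):
                val = doc.get(facet_field)
                counts[val] = counts.get(val, 0) + 1
        results[facet_field] = counts
    return results

shoes = [
    {"size": 10, "color": "black"},
    {"size": 10, "color": "white"},
    {"size": 9,  "color": "black"},
]
print(multi_select_facets(shoes, {"size": 10, "color": "black"}))
# {'size': {10: 1, 9: 1}, 'color': {'black': 1, 'white': 1}}
```

Note how size 9 and the color white both remain visible even though neither matches the current selection: that is exactly the dead-end-free behavior the Nike example calls for.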

6. Distributed Faceting Challenges

In a distributed index with multiple shards, facet counts become approximate. Each shard only sees its local data and returns its local top-N values. Review Chapter 3.6 for sharding architecture details.

This is the classic "scatter-gather" problem: the coordinator sends requests to all shards, each shard returns local results, and the coordinator merges them. For scoring, this works well because each document competes individually. For faceting, the counts must be summed across shards—but if a shard doesn't return a value in its top-N, that count is lost.

The Scatter-Gather Problem

Setup: Index has 5 shards. Query for "laptop" with facet on "brand".

Problem: Shard 1 might have "Lenovo: 50", but it's #11 locally so not returned. Global top-10 becomes inaccurate.

Solution: shard_size
Request more results per shard than needed.
shard_size = size × 1.5 + 10
(Elasticsearch default)
Solution: HyperLogLog++
For high-cardinality counting (unique users, etc.)
~40KB memory for 2-3% error
Probabilistic estimation
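A toy merge makes the truncation loss concrete: each shard trims to its local top-N before the coordinator sums. (Function names are illustrative; real Elasticsearch merging also tracks per-value error bounds.)

```python
def merge_facets(shard_results, size, shard_size=None):
    """Merge per-shard facet counts. Each shard returns only its local
    top-`shard_size` values, so values truncated on some shard are undercounted."""
    if shard_size is None:
        shard_size = int(size * 1.5 + 10)  # the Elasticsearch default formula
    merged = {}
    for local_counts in shard_results:
        top = sorted(local_counts.items(), key=lambda kv: -kv[1])[:shard_size]
        for value, count in top:           # only surviving values reach the sum
            merged[value] = merged.get(value, 0) + count
    return sorted(merged.items(), key=lambda kv: -kv[1])[:size]

shards = [{"Apple": 90, "Lenovo": 50}, {"Apple": 40, "Lenovo": 5}]
# With shard_size=1, Lenovo's 50 on shard 1 is truncated and its 55 total vanishes:
print(merge_facets(shards, size=1, shard_size=1))   # [('Apple', 130)]
# With the default shard_size, both shards return Lenovo and the count is exact:
print(merge_facets(shards, size=2))                 # [('Apple', 130), ('Lenovo', 55)]
```

Raising `shard_size` trades per-shard work for accuracy; it fixes near-miss truncation but not true high-cardinality counting, which is where HyperLogLog++ comes in.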

7. Performance Optimization

Filter and facet performance is dominated by I/O and cache efficiency. The key insight: filters run on immutable segments, so caching is extremely effective. Understanding how to optimize both index-time schema decisions and query-time patterns can reduce latency by 10× on complex faceted queries.

Focus on two phases: Index-time decisions are permanent and affect all queries. Query-time optimizations can be applied per-query. Both matter, but index-time decisions have higher leverage.

Index-Time

  • Use keyword not text
    Enables DocValues, faster aggregation
  • Enable DocValues explicitly
    Default for non-text fields, but verify
  • Index sorting
    Physical contiguity → better cache locality

Query-Time

  • Selective filters first
    Reduces candidate set early
  • Limit facet size
    "size": 10, not "size": 10000
  • Use composite for pagination
    For exhaustive faceting over millions
Filesystem Cache Priority
1. .dvd files: DocValues data (faceting)
2. .dim/.dii: BKD tree (numeric filters)
3. .tip/.tim: term index (text filters)
4. Postings: scoring (less critical for filters)
Rule of thumb: Leave 50% of RAM for OS filesystem cache

Real-World Filter Architectures

Amazon

Scale: 350M+ products

Strategy: Pre-computed facet caches per category. Heavy use of shard routing by category.

Optimization: Facets computed at category level, not global query level.

Spotify

Scale: 100M+ tracks

Strategy: BKD trees for Audio Features (tempo, energy, danceability). Enables "mood" filtering.

Optimization: Index-sorted by popularity for early termination.

Airbnb

Scale: 7M+ listings

Strategy: Geo-hashing + date availability filters. BKD trees for price ranges.

Challenge: Availability is highly volatile (requires near real-time updates).


Key Takeaways

01

Filters Don't Score

Filters are binary (yes/no) and highly cacheable. Put constraints in filter context, not must.

02

Faceting Needs DocValues

Column-oriented storage (DocValues) is essential for efficient aggregation. Never facet on text fields.

03

Post-Filter Pattern

Use post_filter to separate result filtering from aggregation counts for proper faceted navigation.

04

Roaring Bitmaps

Industry-standard compressed bitsets that adapt to data density. Powers all modern filter caching.