Systems Atlas

Chapter 5

Retrieval

The wide net. Retrieval is the stage where we select the best few thousand candidates from billions of documents. It prioritizes Recall (finding everything relevant) over Precision (ranking it perfectly). This chapter covers Inverted Indices, WAND, and Vector Search.
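To make the trade-off concrete, here is a minimal sketch that computes both metrics in their standard set-based form for a wide and a narrow candidate set. The helper name `recall_and_precision` and the toy document ids are assumptions for illustration only.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for retrieved doc ids against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Recall: share of the relevant docs that made it into the candidate set.
    recall = hits / len(relevant) if relevant else 0.0
    # Precision: share of the candidate set that is actually relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


relevant = {1, 2, 3, 4}                       # toy ground truth (hypothetical)
wide_net = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # finds everything, plus noise
narrow_net = [1, 2]                           # clean, but misses docs 3 and 4

wide = recall_and_precision(wide_net, relevant)
narrow = recall_and_precision(narrow_net, relevant)
print("wide:", wide)
print("narrow:", narrow)
```

The wide net carries noise that a later ranking stage can still push down, whereas the narrow net has already lost two relevant documents that no later stage can bring back.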


In This Chapter

5.1 Recall vs. Precision

Why the "Unrecoverable Error" dictates retrieval architecture.

5.2 Boolean Retrieval

AND, OR, NOT and bitset operations.

5.3 TF-IDF & BM25

The math of keyword relevance scoring.

5.6 WAND Algorithm

How to retrieve the top 10k results without scoring 1B docs.

5.8 HNSW (Vector)

Approximate Nearest Neighbor search in high dimensions.

5.9 Hybrid Retrieval

Merging Keyword and Vector scores (RRF vs Linear).
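As a preview of the RRF vs Linear comparison in 5.9, the sketch below fuses one keyword result list and one vector result list both ways. It is illustrative only: the function names, the toy scores, the constant `k=60` (a commonly used default for Reciprocal Rank Fusion), the min-max normalization, and the weight `alpha=0.5` are all assumptions, not the chapter's own implementation.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def min_max(scores):
    """Rescale raw scores to [0, 1] so the two scales become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: (s - lo) / span if span else 0.0 for d, s in scores.items()}


def linear_fuse(keyword_scores, vector_scores, alpha=0.5):
    """Weighted sum of normalized scores; a missing doc contributes 0."""
    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    docs = set(kw) | set(vec)
    fused = {d: alpha * kw.get(d, 0.0) + (1 - alpha) * vec.get(d, 0.0) for d in docs}
    return sorted(fused, key=lambda d: (-fused[d], d))


# Toy inputs (hypothetical): BM25-like scores and cosine-like similarities.
keyword_scores = {"a": 12.0, "b": 9.0, "c": 3.0}
vector_scores = {"b": 0.91, "c": 0.88, "d": 0.40}

keyword_ranking = sorted(keyword_scores, key=keyword_scores.get, reverse=True)
vector_ranking = sorted(vector_scores, key=vector_scores.get, reverse=True)

rrf_order = rrf_fuse([keyword_ranking, vector_ranking])
linear_order = linear_fuse(keyword_scores, vector_scores)
print("RRF:   ", rrf_order)
print("Linear:", linear_order)
```

On this toy data the two methods disagree about second place: RRF only sees rank positions, so "c" (ranked in both lists) beats "a", while the linear blend rewards the large keyword score of "a". That difference in sensitivity to raw score magnitude is the design choice in question.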