Chapter 6
Vector & Semantic Search
Beyond keywords. This chapter covers how embedding models transform queries and documents into dense vectors, enabling meaning-based retrieval that understands synonyms, intent, and conceptual similarity — and the infrastructure needed to do it at scale.
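The core idea can be sketched in a few lines: represent each document and the query as a dense vector, then rank documents by cosine similarity. The vectors below are tiny hand-picked stand-ins for illustration (a real embedding model such as SBERT produces hundreds of dimensions), but the ranking mechanics are the same.

```python
import numpy as np

# Toy "embeddings" — hand-picked 3-d vectors standing in for model output.
# A keyword engine sees no overlap between "fix my vehicle" and "car repair",
# but in embedding space the two land close together.
docs = {
    "car repair manual":     np.array([0.90, 0.10, 0.00]),
    "automobile service":    np.array([0.85, 0.20, 0.05]),
    "chocolate cake recipe": np.array([0.00, 0.10, 0.95]),
}
query = np.array([0.88, 0.15, 0.02])  # pretend embedding of "fix my vehicle"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the two vectors, normalized."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, best match first.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

Despite sharing no words with the query, both car-related documents outrank the recipe — exactly the synonym and intent matching that Section 6.1 shows BM25 cannot do on its own.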
In This Chapter
6.1 Why Keywords Aren't Enough
Vocabulary mismatch, synonym blindness, and why BM25 fails on conceptual queries.
6.2 Embeddings 101
From Word2Vec to SBERT: how text becomes vectors and why dimensions matter.
6.3 Bi-Encoder vs Cross-Encoder
Speed vs accuracy in neural ranking — and why the best systems use both.
6.4 Vector Indexing Basics
IVF, Product Quantization, and how to search billions of vectors efficiently.
6.5 HNSW Deep Dive
The dominant ANN algorithm: hierarchical navigable small-world graphs, index construction, and tuning.
6.6 Latency vs Recall
Tuning ef_search, nprobe, and the diminishing returns of chasing perfect recall.
6.7 Hybrid Ranking
Combining BM25 + vectors with RRF and linear fusion for robust retrieval.
6.8 Failure Modes
When semantic search fails silently: exact match, negation, domain mismatch.
6.9 Cost at Scale
Memory math, quantization savings, and TCO for billion-vector deployments.
6.10 Chunking Strategies
How document splitting impacts retrieval quality more than model choice.
6.11 Vector Databases
Pinecone vs Qdrant vs pgvector: architecture, tradeoffs, and when to use each.
6.12 Evaluating Quality
MRR, nDCG, Recall@K — measuring search quality and building evaluation datasets.