Chapter 6
Vector & Semantic Search
Beyond keywords. This chapter covers how embedding models transform queries and documents into dense vectors, enabling meaning-based retrieval that understands synonyms, intent, and conceptual similarity — and the infrastructure needed to do it at scale.
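The core idea can be sketched in a few lines: represent each document and the query as a dense vector, then rank documents by cosine similarity. The vectors below are tiny hand-picked stand-ins for illustration (a real embedding model such as SBERT produces hundreds of dimensions), but the ranking mechanics are the same.

```python
import numpy as np

# Toy "embeddings" — hand-picked 3-d vectors standing in for model output.
# A keyword engine sees no overlap between "fix my vehicle" and "car repair",
# but in embedding space the two land close together.
docs = {
    "car repair manual":     np.array([0.90, 0.10, 0.00]),
    "automobile service":    np.array([0.85, 0.20, 0.05]),
    "chocolate cake recipe": np.array([0.00, 0.10, 0.95]),
}
query = np.array([0.88, 0.15, 0.02])  # pretend embedding of "fix my vehicle"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the two vectors, normalized."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, best match first.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

Despite sharing no words with the query, both car-related documents outrank the recipe — exactly the synonym and intent matching that Section 6.1 shows BM25 cannot do on its own.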
In This Chapter
6.1 Why Keywords Aren't Enough
Vocabulary mismatch, synonym blindness, and why BM25 fails on conceptual queries.
6.2 Embeddings 101
From Word2Vec to SBERT: how text becomes vectors and why dimensions matter.
6.3 Bi-Encoder vs Cross-Encoder
Speed vs accuracy in neural ranking — and why the best systems use both.
6.4 Vector Indexing Basics
IVF, Product Quantization, and how to search billions of vectors efficiently.
6.5 HNSW Deep Dive
The dominant ANN algorithm: hierarchical navigable small-world graphs, index construction, and tuning.
6.6 Latency vs Recall
Tuning ef_search, nprobe, and the diminishing returns of chasing perfect recall.
6.7 Hybrid Ranking
Combining BM25 + vectors with RRF and linear fusion for robust retrieval.
6.8 Failure Modes
When semantic search fails silently: exact match, negation, domain mismatch.
6.9 Cost at Scale
Memory math, quantization savings, and TCO for billion-vector deployments.
6.10 Chunking Strategies
How document splitting impacts retrieval quality more than model choice.
6.11 Vector Databases
Pinecone vs Qdrant vs pgvector: architecture, tradeoffs, and when to use each.
6.12 Evaluating Quality
MRR, nDCG, Recall@K — measuring search quality and building evaluation datasets.