HNSW Deep Dive
Hierarchical Navigable Small World is the dominant graph-based ANN algorithm. It achieves O(log N) search by building a multi-layer graph where upper layers are express highways and lower layers are local streets. Used in Lucene/ES, Qdrant, Weaviate, and pgvector.
<1ms
Sub-millisecond latency for millions of vectors with 98-99% recall@10.
~94%
Of nodes exist only at Layer 0 (with default M=16). ~6% reach Layer 1, ~0.4% reach Layer 2+. Geometric distribution.
~1.5 hrs
100M vectors, M=16, ef_construction=200 on 48-core Xeon.
1. From NSW to HNSW: The Skip List Insight
Before HNSW, Navigable Small World (NSW) graphs built a flat graph where each node connects to its approximate nearest neighbors. Search works by greedy traversal: start from any node, always move to the neighbor closest to the query. The "small world" property ensures short paths exist between any two nodes — but search cost scales polynomially with N (empirically on the order of O(√N)), not logarithmically.
HNSW's breakthrough was layering the graph like a probability skip list: Layer 0 contains all nodes with dense local connections. Higher layers contain exponentially fewer nodes with longer-range connections. Search starts at the top for coarse navigation and descends to the bottom for precision — achieving O(log N) search complexity.
2. Construction Algorithm
Each new vector is assigned a maximum layer using a geometric distribution: floor(-ln(random()) × m_L) where m_L = 1/ln(M). A node reaches layer l or above with probability M^(-l), so with the default M=16 roughly 94% of nodes stay at Layer 0 only, ~6% reach Layer 1, and ~0.4% reach Layer 2 or higher.
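The layer assignment can be sketched in a few lines of Python. This is a minimal illustration (`assign_layer` is a hypothetical helper name, not a library API) that also checks the resulting distribution empirically:

```python
import math
import random

def assign_layer(m: int, rng: random.Random) -> int:
    """Sample a node's top layer: floor(-ln(U) * m_L) with m_L = 1/ln(M)."""
    m_l = 1.0 / math.log(m)
    return math.floor(-math.log(rng.random()) * m_l)

# Empirical check of the geometric layer distribution for M=16.
rng = random.Random(42)
layers = [assign_layer(16, rng) for _ in range(100_000)]
frac_layer0 = layers.count(0) / len(layers)
# Expect ~1 - 1/M = 93.75% of nodes to stay at Layer 0.
```

Because P(layer ≥ l) = M^(-l), a larger M flattens the hierarchy: fewer nodes get promoted, and the graph has fewer layers.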
Neighbor Selection Heuristic
Simple nearest-neighbor selection creates clusters — nodes all connect to the same popular neighbors. HNSW uses a diversity heuristic: a candidate is added only if it's closer to the query than to any existing selected neighbor. This ensures neighbors aren't all clustered in the same direction, promoting diverse paths through the graph.
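A minimal sketch of that diversity heuristic (helper and variable names are illustrative; real implementations add extras such as the keep-pruned-connections option):

```python
import math

def select_neighbors_heuristic(query, candidates, m, coords):
    """Keep a candidate only if it is closer to the query than to every
    already-selected neighbor -- this spreads edges across directions."""
    selected = []
    # Examine candidates from nearest to farthest from the query.
    for c in sorted(candidates, key=lambda c: math.dist(coords[c], query)):
        if len(selected) >= m:
            break
        d_query = math.dist(coords[c], query)
        if all(d_query < math.dist(coords[c], coords[s]) for s in selected):
            selected.append(c)
    return selected

coords = {"a": (1.0, 0.0), "b": (1.1, 0.05), "c": (0.0, 1.2)}
picked = select_neighbors_heuristic((0.0, 0.0), ["a", "b", "c"], m=2, coords=coords)
# "b" is rejected: it sits right next to "a" (closer to "a" than to the
# query), so it adds no new direction. "c" survives despite being farther.
```

Plain nearest-neighbor selection would have picked "a" and "b", leaving the graph blind in the direction of "c".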
3. Search Algorithm
Search starts at the entry point and does a greedy search from the top layer down to Layer 1 (ef=1 — just find the single nearest neighbor). At Layer 0, switch to beam search with ef_search width to explore more candidates. Return the top K from the beam search results.
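The two-phase search can be sketched as follows, assuming a toy in-memory representation (each layer is a plain adjacency dict; function names are illustrative, not any library's API):

```python
import heapq
import math

def search_layer(vectors, adj, query, entry, ef):
    """Beam search on one layer: candidates is a min-heap (closest first),
    results a bounded max-heap holding the ef best nodes seen so far."""
    d0 = math.dist(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]
    results = [(-d0, entry)]
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break  # nearest open candidate is worse than our worst result
        for nb in adj.get(node, []):
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vectors[nb], query)
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # evict current worst
    return sorted((-d, n) for d, n in results)

def hnsw_search(vectors, layers, entry, query, k, ef_search):
    """layers[0] is the top layer, layers[-1] is Layer 0."""
    for adj in layers[:-1]:
        entry = search_layer(vectors, adj, query, entry, ef=1)[0][1]  # greedy
    hits = search_layer(vectors, layers[-1], query, entry, ef_search)
    return [n for _, n in hits[:k]]

# Toy index: six points on a line, a sparse top layer plus a dense Layer 0.
vectors = {i: (float(i),) for i in range(6)}
top = {0: [4], 4: [0]}
layer0 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
hits = hnsw_search(vectors, [top, layer0], entry=0, query=(3.2,), k=2, ef_search=4)
```

Note that ef=1 makes `search_layer` degenerate into pure greedy descent, which is exactly how the upper layers are traversed.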
| Operation | Complexity | Notes |
|---|---|---|
| Construction (per element) | O(log N) | Search + edge creation per layer |
| Construction (total) | O(N log N) | Insert N elements |
| Search | O(log N) | Layers = log N, each O(1) avg hops |
| Memory (vectors) | O(N × d) | Store all vectors |
| Memory (graph) | O(N × M) | M edges per node, ~4 bytes each |
4. Key Parameters & Tuning
| Parameter | Range | Effect | Memory |
|---|---|---|---|
| M | 8-64 (default 16) | Max connections per node per layer | +8 bytes/node per M increase |
| ef_construction | 64-512 (default 200) | Build-time beam width | No memory impact |
| ef_search | 32-512 | Query-time beam width (primary knob) | No memory impact |
| M_max0 | 2×M | Max connections per node at Layer 0 (denser base) | Affects Layer 0 density |
- Recall-critical: M=32-48, ef_construction=400, ef_search=200-500
- Balanced default: M=16, ef_construction=200, ef_search=64-128
- Memory- and build-constrained: M=8-12, ef_construction=100, ef_search=32-64
Memory Formula
Total = N × (d × 4 + M × 2 × 4) bytes: float32 vectors plus ~2×M Layer-0 edges of 4-byte node IDs. Example: 10M 768-dim vectors with M=16 ≈ 32 GB.
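The memory formula from the takeaways, Total = N × (d × 4 + M × 2 × 4) bytes, as a toy calculator (not a library API; upper layers add only a few percent and are ignored here):

```python
def hnsw_memory_gb(n: int, dim: int, m: int):
    """float32 vectors (d * 4 bytes) plus ~2*M Layer-0 edges of
    4-byte node IDs per node. Returns (vector_gb, graph_gb)."""
    vector_bytes = n * dim * 4
    graph_bytes = n * m * 2 * 4
    return vector_bytes / 1e9, graph_bytes / 1e9

vec_gb, graph_gb = hnsw_memory_gb(10_000_000, 768, 16)
# 10M 768-dim vectors at M=16: ~30.7 GB vectors + ~1.3 GB graph, ~32 GB total.
```

Note that the vector term dominates: doubling M adds only ~128 bytes per node, while the raw float32 vectors cost 3 KB each at 768 dimensions.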
5. HNSW vs DiskANN
DiskANN (Microsoft, 2019) stores the Vamana graph on SSD instead of RAM, keeping only PQ codes in memory. This trades latency (5-20ms vs <1ms) for 15-50x lower memory cost at billion scale.
| Aspect | HNSW | DiskANN |
|---|---|---|
| Storage | RAM (all layers) | SSD (graph + vectors), RAM (PQ codes only) |
| Memory/vector | ~4 KB (768d + graph) | ~200 bytes (PQ codes + metadata) |
| Query latency | <1ms | 5-20ms (SSD I/O bound) |
| Recall@10 | 98-99% | 95-98% |
| Updates | Online (insert/delete) | Batch (rebuild for changes) |
| Vectors | HNSW (RAM) | DiskANN (RAM + SSD) | Cost Ratio |
|---|---|---|---|
| 10M | ~40 GB RAM | ~2 GB RAM + 40 GB SSD | 8x |
| 100M | ~400 GB RAM | ~20 GB RAM + 400 GB SSD | 10x |
| 1B | ~3.2 TB RAM | ~64 GB RAM + 3.2 TB SSD | 15-50x |
6. HNSW in Lucene / Elasticsearch
Lucene 9+ uses HNSW for kNN search. Unlike standalone HNSW (single graph), Lucene maintains per-segment HNSW graphs. When segments merge, the graph is rebuilt from scratch — this is the biggest operational challenge with vector search in Elasticsearch.
- Graph rebuild: per-segment hierarchical graphs can't be combined structurally, so each merge rebuilds the merged segment's graph from scratch at O(N log N) cost
- Memory spikes: 9M vectors with M=16 can allocate over 2.5 GB of heap during merge
- Blocking: Long merges block new segment creation → indexing latency spikes
| ES Setting | Default | Production Recommendation |
|---|---|---|
| m | 16 | 16-32 (higher for better recall) |
| ef_construction | 100 | 200-400 (worth the build cost) |
| num_candidates | varies | 100-200 for top-10 queries |
| similarity | cosine | cosine for text, l2 for images |
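The table's settings translate into an index mapping and kNN query roughly like the following, sketched as Python dicts against the Elasticsearch 8.x REST API (field names like `embedding` and the 768-dim size are illustrative assumptions):

```python
# Index mapping: a dense_vector field backed by an HNSW graph.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "hnsw", "m": 16, "ef_construction": 200},
            }
        }
    }
}

# Top-10 kNN search; num_candidates plays the role of ef_search.
query = {
    "knn": {
        "field": "embedding",
        "query_vector": [0.1] * 768,  # placeholder vector
        "k": 10,
        "num_candidates": 100,
    }
}
```

`num_candidates` must be at least `k`; raising it buys recall at the price of latency, which is the same trade-off ef_search controls in standalone HNSW.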
Key Takeaways
Skip List Analogy
HNSW layers work like a skip list: Layer 3 has express highways (few nodes), Layer 0 has local streets (all nodes). Search starts at the top for coarse navigation and zooms in at the bottom for precision.
O(log N) Search Complexity
Each descent step shrinks the candidate population by roughly a factor of M. Greedy search descends from the top, then beam search at Layer 0 with ef_search width. With M=16, 1B vectors needs only ~7-8 layers, each with O(1) average hops.
Three Key Parameters
M (max connections, default 16): affects recall and memory. ef_construction (build beam width, default 200): affects graph quality. ef_search (query beam width): the primary accuracy/speed knob at query time.
Memory Formula: 32 GB for 10M Vectors
Total = N × (d × 4 + M × 2 × 4) bytes. 10M × 768-dim × M=16 = 30.7 GB vectors + 1.28 GB graph = ~32 GB. At 1B: ~3.2 TB. This is why DiskANN exists.
Lucene Segment Merges Are the Operational Challenge
Lucene maintains per-segment HNSW graphs. When segments merge, the graph is rebuilt from scratch — O(N log N). Memory spikes during merge can hit 2.5 GB. Use larger initial segments and TieredMergePolicy.