Vector Database Comparison
The landscape includes purpose-built databases (Pinecone, Qdrant, Milvus, Weaviate), extensions of existing databases (pgvector), and search engines with vector capabilities (Elasticsearch). Each makes fundamentally different tradeoffs in architecture, performance, and operational complexity.
pgvector (PostgreSQL Extension)
If you already run PostgreSQL, pgvector is the path of least resistance. One CREATE EXTENSION vector and you have a working vector column type with HNSW indexing, cosine/L2/inner product distance operators, and the ability to combine vector similarity with SQL WHERE clauses, JOINs, and aggregations in a single query. No new infrastructure, no new operational runbook, no new backup strategy. Your vectors live alongside your relational data with ACID guarantees.
The limitation is performance at scale. pgvector runs inside the PostgreSQL process, sharing memory with your OLTP workload. HNSW index builds are expensive and block writes unless run with CREATE INDEX CONCURRENTLY (parallel builds only arrived in pgvector 0.6.0). Beyond roughly 5M vectors, purpose-built databases outperform it by 2-5x on QPS. pgvector also lacks advanced features such as product quantization, tiered storage, and GPU-accelerated search. But below 5M vectors none of that matters, and the SQL integration is unbeatable (a query sketch follows the list below).
- Zero additional infrastructure — one CREATE EXTENSION
- ACID transactional consistency with relational data
- Rich SQL filtering, joins, aggregations
- Performance ceiling at >5M vectors
- Resource contention with OLTP workload
- No PQ, tiered storage, or GPU search
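To make the SQL integration concrete, here is a minimal sketch using the psycopg 3 driver. The docs table, the tenant_id filter, and the 384-dimension embeddings are illustrative assumptions, not something pgvector prescribes.

```python
# A minimal sketch of pgvector's SQL integration via the psycopg 3 driver.
# The "docs" table, tenant_id filter, and 384-dim embeddings are
# illustrative assumptions.
import psycopg

with psycopg.connect("dbname=app") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id        bigserial PRIMARY KEY,
            tenant_id bigint,
            body      text,
            embedding vector(384)
        )
    """)
    # HNSW index with cosine distance; a plain CREATE INDEX blocks writes,
    # so production systems typically use CREATE INDEX CONCURRENTLY.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS docs_embedding_idx "
        "ON docs USING hnsw (embedding vector_cosine_ops)"
    )
    # Vector similarity (<=> is cosine distance) combined with an ordinary
    # SQL WHERE clause in a single query.
    query_vec = "[" + ",".join(["0.1"] * 384) + "]"  # placeholder embedding
    rows = conn.execute(
        "SELECT id, body FROM docs WHERE tenant_id = %s "
        "ORDER BY embedding <=> %s::vector LIMIT 10",
        (42, query_vec),
    ).fetchall()
```

The same pattern extends to JOINs and aggregations, which is exactly what the purpose-built databases cannot offer.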
Elasticsearch (with kNN)
Elasticsearch added HNSW-based approximate kNN search over its dense_vector fields in version 8.0 (dense vector fields themselves date back to the 7.x line). The killer feature is native hybrid search: a single query can combine BM25 text scoring with kNN vector similarity using Reciprocal Rank Fusion (RRF), with no external orchestration needed. If you already run Elasticsearch for text search, adding vector capabilities to existing indices is straightforward (a hybrid query sketch follows the list below).
The architectural limitation is that Elasticsearch wasn't designed for vectors. Each Lucene segment contains its own HNSW graph, and segment merges trigger full graph rebuilds (causing CPU spikes). Vector data must stay memory-resident to search quickly, so it competes for RAM with the JVM heap and text search caches on the same nodes. At the same scale, purpose-built vector databases achieve 2-5x higher QPS. Still, for teams already invested in the Elastic ecosystem, the operational simplicity of co-locating text and vectors is often worth the performance trade-off.
- Native hybrid search: BM25 + kNN with built-in RRF fusion
- Mature ecosystem: Kibana, Elastic Cloud, decade of production use
- Co-located text + vectors in same index
- Segment merges rebuild HNSW graph (CPU spikes)
- Vector memory competes with JVM heap and text search caches
- Purpose-built DBs achieve 2-5x better QPS
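Here is a hedged sketch of a single hybrid query through the official Python client. It uses the rank/rrf request syntax introduced around 8.8 (newer releases prefer the retriever syntax); the docs index and field names are assumptions.

```python
# A hedged sketch of one hybrid query through the official Python client,
# using the rank/rrf syntax from ~ES 8.8 (newer releases prefer the
# "retriever" syntax). Index and field names are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="docs",
    query={"match": {"body": "vector database comparison"}},  # BM25 leg
    knn={
        "field": "embedding",
        "query_vector": [0.1] * 384,  # placeholder embedding
        "k": 10,
        "num_candidates": 100,
    },  # kNN leg
    rank={"rrf": {}},  # fuse both result lists with Reciprocal Rank Fusion
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_id"])
```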
Purpose-Built Vector Databases
These databases are engineered from the ground up for vector workloads: custom memory layouts for SIMD-optimized distance computation, purpose-built index structures, and APIs designed around the embed-index-query workflow. They achieve the highest QPS and lowest latency at scale, but come with varying levels of operational complexity. The four major options occupy different points along two axes: managed vs. self-hosted, and simplicity vs. capability.
Pinecone
Fully managed, serverless. Zero operations: API endpoint, query it. Serverless pricing (pay per query/GB). Client sketch below.
Best for: Teams without infra expertise, startups, <10M vectors
Watch out: Vendor lock-in, limited control, cost at high QPS
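A minimal sketch with the v3+ Pinecone Python client, assuming a serverless index named docs already exists; the metadata filter is illustrative.

```python
# A minimal sketch with the v3+ Pinecone Python client, assuming a
# serverless index named "docs" already exists; the metadata filter is
# illustrative.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("docs")

index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 384, "metadata": {"category": "news"}},
])
res = index.query(
    vector=[0.1] * 384,  # placeholder embedding
    top_k=5,
    filter={"category": {"$eq": "news"}},  # server-side metadata filter
    include_metadata=True,
)
for match in res.matches:
    print(match.id, match.score)
```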
Qdrant
Open-source, Rust. Payload indexes for filtered search during HNSW traversal. Docker → K8s → Cloud. Filtered-search sketch below.
Best for: Self-hosted perf, filtered search, up to ~500M vectors
Watch out: Younger ecosystem, no native BM25
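A sketch with qdrant-client (query_points is the 1.10+ API), assuming an existing docs collection with 384-dim vectors; the category payload field is illustrative.

```python
# A sketch with qdrant-client (query_points is the 1.10+ API), assuming a
# "docs" collection with 384-dim vectors already exists; the "category"
# payload field is illustrative.
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")

# Index the payload field so the condition is evaluated during HNSW
# traversal instead of post-filtering the candidate set.
client.create_payload_index(
    collection_name="docs",
    field_name="category",
    field_schema="keyword",
)

hits = client.query_points(
    collection_name="docs",
    query=[0.1] * 384,  # placeholder embedding
    query_filter=Filter(
        must=[FieldCondition(key="category", match=MatchValue(value="news"))]
    ),
    limit=10,
).points
for point in hits:
    print(point.id, point.score)
```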
Weaviate
Open-source, Go. Native hybrid search (BM25 + vector). Modular vectorization (auto-embed via OpenAI/Cohere). Multi-tenancy. Hybrid query sketch below.
Best for: All-in-one hybrid search, SaaS multi-tenant, auto-embedding
Watch out: Memory intensive (~50% more overhead than leaner engines), slower QPS than Qdrant or Milvus
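A minimal hybrid-query sketch with the weaviate-client v4 API, assuming a Docs collection with a vectorizer module configured (so the query text is auto-embedded); alpha blends the BM25 and vector scores.

```python
# A minimal hybrid-query sketch with the weaviate-client v4 API, assuming
# a "Docs" collection with a vectorizer module configured (so the query
# text is auto-embedded).
import weaviate

client = weaviate.connect_to_local()
try:
    docs = client.collections.get("Docs")
    # alpha blends the two scores: 0.0 is pure BM25, 1.0 is pure vector.
    res = docs.query.hybrid(query="vector database comparison", alpha=0.5, limit=5)
    for obj in res.objects:
        print(obj.uuid, obj.properties)
finally:
    client.close()
```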
Milvus
Open-source, distributed. Widest index variety: HNSW, IVF-PQ, DiskANN, GPU indexes. Managed via Zilliz Cloud. Client sketch below.
Best for: 100M+ vectors, GPU-accelerated search, index variety
Watch out: Requires etcd + MinIO + Pulsar, high ops complexity
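A sketch using pymilvus' MilvusClient convenience API; the collection name, dimension, and quick-create defaults are assumptions, and a production deployment would configure explicit index types rather than relying on the defaults.

```python
# A sketch using pymilvus' MilvusClient convenience API. Collection name,
# dimension, and the quick-create defaults ("id" key, "vector" field,
# AUTOINDEX) are assumptions; production deployments would pick explicit
# index types (HNSW, IVF-PQ, DiskANN, GPU) via index_params instead.
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

client.create_collection(collection_name="docs", dimension=384)
client.insert(
    collection_name="docs",
    data=[{"id": 1, "vector": [0.1] * 384}],
)
res = client.search(
    collection_name="docs",
    data=[[0.1] * 384],  # one query vector
    limit=5,
)
for hit in res[0]:
    print(hit["id"], hit["distance"])
```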
Head-to-Head Comparison
The table below compares all six options across the dimensions that matter most in practice: implementation language (affects performance characteristics), self-hosting support, maximum practical scale, hybrid search capability, quantization options, operational complexity, latency at 1M vectors, and approximate monthly cost. Use this as a starting point, then drill into the specific factors that matter for your use case.
| Feature | pgvector | ES | Pinecone | Qdrant | Weaviate | Milvus |
|---|---|---|---|---|---|---|
| Language | C | Java | — | Rust | Go | Go+C++ |
| Self-hosted | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Max scale | ~5M | ~50M | ~100M+ | ~500M | ~100M | 10B+ |
| Hybrid search | ✅ FTS | ✅ Native | ❌ | ❌ | ✅ | ❌ |
| Quantization | ❌ | Scalar | Auto | SQ/PQ/BQ | PQ/BQ | PQ/SQ8/GPU |
| Ops complexity | Minimal | Medium | Zero | Low-Med | Medium | High |
| Latency (1M vecs) | ~5ms | ~3ms | ~2ms | ~1ms | ~3ms | ~1ms |
| Cost/mo (1M vecs) | ~$50 | ~$300 | ~$70 | ~$100 | ~$150 | ~$400+ |
Decision Framework
The most common mistake is over-engineering: choosing Milvus for 500K vectors, or Pinecone when you already have PostgreSQL. Walk through these questions in order and stop at the first "yes" that matches your situation; the sketch after the list encodes the same logic as a function.
- Already running PostgreSQL with under ~5M vectors? Use pgvector.
- Already running Elasticsearch, or need hybrid BM25 + vector search out of the box? Use Elasticsearch (or Weaviate if you also want multi-tenancy and auto-embedding).
- No infrastructure team, or want zero operations? Use Pinecone.
- Expecting 100M+ vectors, or need GPU-accelerated indexes? Use Milvus.
- Otherwise, self-hosting with filtered search up to ~500M vectors? Use Qdrant.
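One way to encode that logic is a plain function. The boolean inputs and thresholds below mirror the questions above and are deliberate simplifications, not a definitive rubric.

```python
# A deliberately simplified encoding of the questions above; the boolean
# inputs and thresholds are illustrative, not a definitive rubric.
def pick_vector_db(
    n_vectors: int,
    runs_postgres: bool = False,
    runs_elasticsearch: bool = False,
    needs_hybrid: bool = False,
    has_infra_team: bool = True,
    needs_gpu_indexes: bool = False,
) -> str:
    if runs_postgres and n_vectors < 5_000_000:
        return "pgvector"
    if runs_elasticsearch or needs_hybrid:
        return "Elasticsearch"  # or Weaviate for multi-tenant auto-embedding
    if not has_infra_team:
        return "Pinecone"
    if needs_gpu_indexes or n_vectors >= 100_000_000:
        return "Milvus"
    return "Qdrant"  # self-hosted default, up to ~500M vectors

assert pick_vector_db(500_000, runs_postgres=True) == "pgvector"
assert pick_vector_db(5_000_000, has_infra_team=False) == "Pinecone"
```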
Key Takeaways
pgvector: Simplest for <5M Vectors
Zero additional infrastructure; you're one CREATE EXTENSION away. ACID transactions, SQL filtering, joins. But: performance ceiling beyond ~5M vectors, resource contention with OLTP, limited ANN sophistication.
Elasticsearch: Best for Hybrid Search
Native BM25 + HNSW kNN with built-in RRF fusion. Mature ecosystem (Kibana, Elastic Cloud). But: segment merges rebuild the HNSW graph, vector memory competes with the JVM and text caches, not purpose-built for vectors.
Qdrant: Best Performance Per Dollar (Self-Hosted)
Rust implementation delivers excellent QPS. Payload indexes enable efficient filtered search during HNSW traversal. Flexible deployment: Docker → K8s → Cloud. Scales to ~500M vectors.
Milvus: Purpose-Built for Billion Scale
Distributed architecture handles 10B+ vectors. Widest index variety: HNSW, IVF-PQ, DiskANN, GPU indexes. But: requires etcd + MinIO + Pulsar — significant operational burden.
Don't Over-Engineer
The most common mistake is choosing a specialized vector database for a use case pgvector or Elasticsearch handles perfectly. If you have <5M vectors and already run PostgreSQL, pgvector is the answer.