Vector Database Cost Calculator: Pinecone vs Qdrant Cloud vs Weaviate vs pgvector
Live monthly cost estimate at 1M-100M vector scale, sized for the embedding model and traffic profile you actually use. Numbers come from public pricing pages reviewed in April 2026.
Updated April 25, 2026 · By Jim Liu
TL;DR
- Under 1M vectors: Pinecone Serverless usually beats everything else on $/mo, often under $20.
- 1M – 10M vectors: Qdrant Cloud or Weaviate flatten the cost curve. Pinecone starts climbing fast on read-heavy workloads.
- 10M – 100M vectors: pgvector on a Hetzner CCX VPS runs 5-20x cheaper than SaaS, if your team has Postgres ops capacity.
- High QPS production app: Skip the cheapest. Pay for read-replica + monitoring, or your $40/mo bill becomes a $4,000 outage.
Configure your workload
Embedding model: OpenAI text-embedding-3-small
Traffic profile: ~2.5M reads/mo, daily ingest
Postgres extension on a CCX-line dedicated VPS. Cheapest at scale if you accept ops work.
⚠ Past 32GB working set, sharding (Citus / horizontal partition by tenant) usually wins.
Side-by-side monthly cost
| Provider | Hosting | Tier | Compute | Storage | Reads | Writes | Total / mo |
|---|---|---|---|---|---|---|---|
| pgvector (self-host) | Self-hosted | CCX43 (64GB) | $120 | $5.00 | — | — | $125 |
| Pinecone | SaaS | Serverless (consumption) | — | $11.80 | $85.50 | $107 | $204 |
| Qdrant Cloud | SaaS | Provisioned 45.8GB RAM cluster | $461 | $2.15 | — | — | $464 |
| Weaviate Cloud | SaaS | Standard managed cluster | $729,600 | — | — | — | $729,600 |
Numbers reflect public pricing pages (Pinecone serverless, Qdrant Cloud managed, Weaviate Standard, Hetzner CCX line) reviewed April 2026. Real bills vary 15-30% with region, commit discounts, and traffic burstiness. Self-hosted rows exclude on-call and monitoring labour cost; assume +$200-800/mo if you want 99.9% uptime.
How the math works
RAM sizing uses raw vector bytes (dims × 4 bytes for float32) multiplied by 1.6× HNSW overhead. We measured this on Qdrant 1.10 and pgvector 0.7 at the default M=16 / ef_construction=64. If you crank M to 32 or use ef_construction=200 for higher recall, expect 1.4-2x more RAM. Quantisation (PQ, scalar) cuts RAM 4-8x but loses 2-5% recall.
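The sizing rule above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name is hypothetical, the 5M vector count is an assumed example input, and reporting the result in GiB (2^30 bytes) is also an assumption. The 1536 dimensions are the default output size of text-embedding-3-small.

```python
def working_set_gib(num_vectors: int, dims: int, hnsw_overhead: float = 1.6) -> float:
    """Estimate index RAM: float32 vectors (4 bytes per dimension) times HNSW overhead."""
    raw_bytes = num_vectors * dims * 4       # raw float32 payload
    total_bytes = raw_bytes * hnsw_overhead  # 1.6x is the measured default from the text
    return total_bytes / 2**30               # assumption: report in GiB


# Assumed example: 5M vectors from text-embedding-3-small (1536 dims)
print(round(working_set_gib(5_000_000, 1536), 1))
```

Under those assumptions the estimate comes out at roughly 45.8, the same RAM figure shown for the Qdrant Cloud cluster in the comparison table.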
Pinecone Serverless is modelled as $0.33/GB-month storage + $1.10 per million read units + $8.25 per million write units. One read unit roughly maps to a single small query, and one write unit to a single vector upsert. Real RU consumption depends on top-k size, filter complexity, and namespace count, so expect ±50% variance.
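As a rough sketch of the consumption model described above, the three line items can be computed as follows. The function name and the example inputs are hypothetical; only the three unit rates come from the text.

```python
def pinecone_serverless_monthly(storage_gb: float, read_units: float, write_units: float) -> dict:
    """Monthly cost breakdown using the rates quoted in the text."""
    storage = storage_gb * 0.33              # $0.33 per GB-month
    reads = read_units / 1_000_000 * 1.10    # $1.10 per million read units
    writes = write_units / 1_000_000 * 8.25  # $8.25 per million write units
    return {
        "storage": storage,
        "reads": reads,
        "writes": writes,
        "total": storage + reads + writes,
    }


# Assumed example: 10 GB stored, 2M read units, 1M write units
bill = pinecone_serverless_monthly(10, 2_000_000, 1_000_000)
print({k: round(v, 2) for k, v in bill.items()})
```

Because read-unit consumption can swing by about 50% either way, it may be worth running the same function with `read_units` scaled by 0.5 and 1.5 to bracket the estimate.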
Qdrant Cloud uses provisioned RAM-hours at roughly $0.014/GB-hour plus $0.06/GB-month durable storage. Minimum cluster is 4GB. Reads and writes are bundled into the cluster capacity, so very high QPS may force you to a larger tier than RAM alone would suggest.
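A minimal sketch of this provisioned model is shown below. The function name is hypothetical, and the 720 hours per month is an assumption (30 days × 24 hours); the text only gives the hourly and storage rates and the 4GB minimum.

```python
def qdrant_cloud_monthly(ram_gb: float, storage_gb: float, hours_per_month: float = 720) -> dict:
    """Provisioned-cluster estimate: RAM-hours plus durable storage, with a 4 GB minimum."""
    billed_ram_gb = max(ram_gb, 4.0)                  # minimum cluster size from the text
    compute = billed_ram_gb * 0.014 * hours_per_month  # ~$0.014 per GB-hour
    storage = storage_gb * 0.06                        # $0.06 per GB-month
    return {"compute": compute, "storage": storage, "total": compute + storage}


# Assumed example: 45.8 GB of RAM and 35.8 GB of durable storage
bill = qdrant_cloud_monthly(45.8, 35.8)
print({k: round(v, 2) for k, v in bill.items()})
```

Reads and writes do not appear as inputs because they are bundled into cluster capacity; a high-QPS workload would be modelled by passing a larger `ram_gb` than the index alone requires.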
Weaviate Cloud Standard is priced as ~$0.095 per 1K vector-dimension-units per month with a $295 floor at the entry tier. Sandbox is free for 14-day demos but data wipes on expiry — use it for spikes, not production.
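The dimension-based rule can be sketched as below, taking the rate exactly as stated above. The function name is hypothetical, and treating the $295 floor as a simple minimum on the monthly bill is an assumption about how the floor applies.

```python
def weaviate_standard_monthly(
    num_vectors: int,
    dims: int,
    usd_per_1k_dims: float = 0.095,  # rate as quoted in the text
    floor_usd: float = 295.0,        # entry-tier floor as quoted in the text
) -> float:
    """Monthly estimate: stored vector dimensions priced per 1K, never below the floor."""
    dimension_units = num_vectors * dims / 1000
    return max(floor_usd, dimension_units * usd_per_1k_dims)


# Assumed example: 5M vectors at 1536 dims
print(round(weaviate_standard_monthly(5_000_000, 1536), 2))
```

The result scales linearly with both vector count and dimensionality, so it is very sensitive to the unit attached to the rate; the `usd_per_1k_dims` parameter is left overridable so the figure can be checked against the current pricing page.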
pgvector on Hetzner CCX picks the smallest CCX tier whose RAM exceeds your working set + 2GB OS overhead. CCX13 (8GB) ~$15, CCX23 (16GB) ~$30, CCX33 (32GB) ~$60, CCX43 (64GB) ~$120, CCX53 (128GB) ~$240. We add $5/mo for the backup volume. Past 32GB working set, sharding usually beats vertical scale.
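The tier selection described above amounts to a lookup over the price list. The sketch below uses the approximate prices from the text; the function name and the choice to return `None` when nothing fits are assumptions made for illustration.

```python
from typing import Optional, Tuple

# (tier name, RAM in GB, approximate monthly price in USD), as listed in the text
CCX_TIERS = [
    ("CCX13", 8, 15),
    ("CCX23", 16, 30),
    ("CCX33", 32, 60),
    ("CCX43", 64, 120),
    ("CCX53", 128, 240),
]
OS_OVERHEAD_GB = 2  # OS headroom added to the working set
BACKUP_USD = 5      # backup volume added to every quote


def pick_ccx_tier(working_set_gb: float) -> Optional[Tuple[str, int]]:
    """Return (tier, monthly USD incl. backup) for the smallest tier that fits, else None."""
    needed_gb = working_set_gb + OS_OVERHEAD_GB
    for name, ram_gb, price in CCX_TIERS:
        if ram_gb > needed_gb:  # RAM must exceed working set + overhead
            return name, price + BACKUP_USD
    return None  # larger than any single box: time to shard


print(pick_ccx_tier(45.8))
```

For a 45.8GB working set this selects CCX43 at $125/mo including the backup volume, which is the pgvector row in the comparison table.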
What is excluded: embedding API cost (separate budget — text-embedding-3-small is ~$0.02 per 1M tokens), egress (some clouds bill $0.09/GB out), HA replicas, and the labour to set up monitoring on self-hosted. Realistic production bills run 30-50% higher than the calculator quote.
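Two of these adjustments are simple enough to sketch. Both function names and the example inputs are hypothetical; the $0.02 per 1M tokens rate and the 30-50% uplift are the figures quoted above.

```python
from typing import Tuple


def embedding_cost_usd(total_tokens: int, usd_per_million_tokens: float = 0.02) -> float:
    """One-off or monthly embedding API spend for a given token volume."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


def production_range_usd(calculator_quote: float) -> Tuple[float, float]:
    """Apply the 30-50% uplift the text suggests for realistic production bills."""
    return calculator_quote * 1.30, calculator_quote * 1.50


# Assumed example: embedding 500M tokens, and a $125/mo calculator quote
print(round(embedding_cost_usd(500_000_000), 2))
low, high = production_range_usd(125)
print(round(low, 2), round(high, 2))
```

Egress, HA replicas, and monitoring labour are left out of the sketch because the text gives no single rate that covers them.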
FAQ
- Three things move the bill outside the model: region (US-East is ~30% cheaper than EU on most SaaS), committed-use discounts (Pinecone 1-year prepay shaves ~20%), and traffic burst behaviour (a few daily spikes push serverless way past the average QPS we use).