Why do my real bills differ from the calculator?

Three things move the bill outside the model: region (US-East is ~30% cheaper than EU on most SaaS), committed-use discounts (Pinecone 1-year prepay shaves ~20%), and traffic burst behaviour (a few daily spikes push serverless usage well past the average QPS the model assumes).

Is pgvector really cheaper at 100M vectors?

Sticker price yes, total cost of ownership maybe. A CCX43 (64GB) costs ~$120/mo, but by the sizing formula on this page 100M vectors at 768d only fit in that much RAM with roughly 8x quantisation. You also pay for backups, monitoring (Grafana / Prometheus), an on-call engineer, and the productivity hit when an index rebuild takes 6 hours. For a 1-person team, managed often wins below 10M vectors.

How accurate is the RAM estimate?

We use raw vector size × 1.6 for HNSW overhead, which matches what we measured on Qdrant 1.10 and pgvector 0.7 with M=16, ef_construction=64. Higher M values (32, 64) inflate RAM 1.4-2x. If you use product quantisation or scalar quantisation, divide RAM by 4-8.

Should I pick by cost alone?

No. Filtering speed, hybrid search quality, multi-tenant isolation, and SDK ergonomics matter more than a $50/mo difference. Cost mostly matters at the extremes: under $30/mo (hobby projects) or over $1000/mo (where cutting cost becomes a feature request in its own right). In the middle, pick by team familiarity.

What is missing from this calculator?

Egress fees (some clouds charge $0.09/GB out), embedding API cost (separate budget — OpenAI text-embedding-3-small is $0.02 per 1M tokens), and read-replica overhead for HA setups. Add 15-25% buffer to whatever the tool quotes if you need 99.9%+ uptime.

Where do the prices come from?

Public pricing pages reviewed in April 2026: pinecone.io/pricing (serverless tier), qdrant.tech/pricing (managed), weaviate.io/pricing (Standard plan), and hetzner.com/cloud (CCX dedicated line). Pinecone read/write unit definitions are simplified — see their docs for exact RU sizing.


Vector Database Cost Calculator: Pinecone vs Qdrant Cloud vs Weaviate vs pgvector

Live monthly cost estimate at 1M-100M vector scale, sized for the embedding model and traffic profile you actually use. Numbers come from public pricing pages reviewed in April 2026.

Updated April 25, 2026 · By Jim Liu

TL;DR

• Under 1M vectors: Pinecone Serverless usually beats everything else on $/mo, often under $20.
• 1M – 10M vectors: Qdrant Cloud or Weaviate flatten the cost curve. Pinecone starts climbing fast on read-heavy workloads.
• 10M – 100M vectors: pgvector on a Hetzner CCX VPS runs 5-20x cheaper than SaaS, if your team has Postgres ops capacity.
• High QPS production app: Skip the cheapest. Pay for read-replica + monitoring, or your $40/mo bill becomes a $4000 outage.

Configure your workload

Vector count

Embedding dims: OpenAI text-embedding-3-small
Traffic profile: ~2.5M reads/mo, daily ingest
Working set RAM (HNSW): 45.8 GB
Disk storage: 35.8 GB

Cheapest at this scale

pgvector on Hetzner VPS · $125/mo · CCX43 (64GB)

Postgres extension on a CCX-line dedicated VPS. Cheapest at scale if you accept ops work.

⚠ Past 32GB working set, sharding (Citus / horizontal partition by tenant) usually wins.

Side-by-side monthly cost

Provider	Hosting	Tier	Compute	Storage	Reads	Writes	Total / mo
pgvector (self-host)	Self-hosted	CCX43 (64GB)	$120	$5.00	—	—	$125
Pinecone	SaaS	Serverless (consumption)	—	$11.8	$85.5	$107	$204
Qdrant Cloud	SaaS	Provisioned 45.8GB RAM cluster	$461	$2.15	—	—	$464
Weaviate Cloud	SaaS	Standard managed cluster	$730	—	—	—	$730

Numbers reflect public pricing pages (Pinecone serverless, Qdrant Cloud managed, Weaviate Standard, Hetzner CCX line) reviewed April 2026. Real bills vary 15-30% with region, commit discounts, and traffic burstiness. Self-hosted rows exclude oncall + monitoring labour cost — assume +$200-800/mo if you want 99.9% uptime.

How the math works

RAM sizing uses raw vector bytes (dims × 4 bytes for float32) multiplied by 1.6× HNSW overhead. We measured this on Qdrant 1.10 and pgvector 0.7 at the default M=16 / ef_construction=64. If you crank M to 32 or use ef_construction=200 for higher recall, expect 1.4-2x more RAM. Quantisation (PQ, scalar) cuts RAM 4-8x but loses 2-5% recall.
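
The RAM rule above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name, the `quantisation_factor` parameter, and the choice to treat "GB" as 2**30 bytes are all assumptions made for the example.

```python
def hnsw_ram_gb(n_vectors: int, dims: int, overhead: float = 1.6,
                bytes_per_value: int = 4,
                quantisation_factor: float = 1.0) -> float:
    """Estimate HNSW working-set RAM in GB.

    Assumption: "GB" means 2**30 bytes. quantisation_factor is a
    hypothetical knob for the 4-8x reduction quoted for PQ / scalar
    quantisation; leave it at 1.0 for plain float32 vectors.
    """
    raw_bytes = n_vectors * dims * bytes_per_value  # float32 = 4 bytes per dim
    return raw_bytes * overhead / quantisation_factor / 2**30


# Hypothetical workload: 5M vectors at 1536 dims.
print(round(hnsw_ram_gb(5_000_000, 1536), 1))  # 45.8
```

For M=32 or ef_construction=200, pass a larger `overhead` (the 1.4-2x range quoted above would mean roughly 2.2 to 3.2) rather than changing the formula.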

Pinecone Serverless is modelled as $0.33/GB-month storage + $1.10 per million read units + $8.25 per million write units. Read units roughly map to a single small query, write units to a single vector upsert. Real RU consumption depends on top-k size, filter complexity, and namespace count — expect ±50% variance.
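
As a rough sketch of that consumption model, the three rates can be combined as below. The function name and return shape are illustrative assumptions, and the read/write unit counts have to come from your own workload estimate, since the model does not define how queries map to units beyond "roughly one each".

```python
def pinecone_serverless_monthly(storage_gb: float, read_units: float,
                                write_units: float) -> dict:
    """Monthly cost breakdown using the simplified rates quoted above."""
    storage = storage_gb * 0.33              # $0.33 per GB-month
    reads = read_units / 1_000_000 * 1.10    # $1.10 per million read units
    writes = write_units / 1_000_000 * 8.25  # $8.25 per million write units
    return {
        "storage": storage,
        "reads": reads,
        "writes": writes,
        "total": storage + reads + writes,
    }


# Hypothetical inputs: 35.8 GB stored, 10M read units, 2M write units.
bill = pinecone_serverless_monthly(35.8, 10_000_000, 2_000_000)
print({k: round(v, 2) for k, v in bill.items()})
```

Given the ±50% variance in real RU consumption, it is worth running this with both a low and a high unit count rather than trusting a single figure.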

Qdrant Cloud uses provisioned RAM-hours at roughly $0.014/GB-hour plus $0.06/GB-month durable storage. Minimum cluster is 4GB. Reads and writes are bundled into the cluster capacity, so very high QPS may force you to a larger tier than RAM alone would suggest.
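
A minimal sketch of the provisioned model follows. The 720 hours per month (a 30-day month) is an assumption, as are the function and parameter names; the rates and the 4GB minimum are the ones stated above.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month


def qdrant_cloud_monthly(ram_gb: float, storage_gb: float,
                         min_cluster_gb: float = 4.0) -> float:
    """Provisioned RAM-hours plus durable storage, in $/mo."""
    billed_ram = max(ram_gb, min_cluster_gb)       # 4GB minimum cluster
    compute = billed_ram * 0.014 * HOURS_PER_MONTH  # ~$0.014 per GB-hour
    storage = storage_gb * 0.06                     # $0.06 per GB-month
    return compute + storage


# The workload shown on this page: 45.8 GB RAM, 35.8 GB disk.
print(round(qdrant_cloud_monthly(45.8, 35.8)))  # 464
```

The sketch sizes by RAM only. If your QPS forces a larger tier, pass that tier's RAM as `ram_gb` instead of the working-set figure.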

Weaviate Cloud Standard is priced at ~$0.095 per 1M vector dimensions stored per month, with a $295 floor at the entry tier. The Sandbox is free for 14-day demos, but data is wiped on expiry, so use it for spikes, not production.

pgvector on Hetzner CCX picks the smallest CCX tier whose RAM exceeds your working set + 2GB OS overhead. CCX13 (8GB) ~$15, CCX23 (16GB) ~$30, CCX33 (32GB) ~$60, CCX43 (64GB) ~$120, CCX53 (128GB) ~$240. We add $5/mo for the backup volume. Past 32GB working set, sharding usually beats vertical scale.
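
The tier selection can be sketched as a simple lookup. The tier table copies the approximate prices listed above; the function name and the `None` return for workloads that outgrow CCX53 are assumptions for illustration.

```python
# (name, RAM in GB, approximate $/mo) as listed above
CCX_TIERS = [
    ("CCX13", 8, 15),
    ("CCX23", 16, 30),
    ("CCX33", 32, 60),
    ("CCX43", 64, 120),
    ("CCX53", 128, 240),
]


def pick_ccx_tier(working_set_gb: float, os_overhead_gb: float = 2.0,
                  backup_usd: float = 5.0):
    """Return (tier, monthly $) for the smallest tier that fits, else None."""
    needed = working_set_gb + os_overhead_gb
    for name, ram_gb, price in CCX_TIERS:  # ordered smallest to largest
        if ram_gb > needed:                # RAM must exceed working set + OS
            return name, price + backup_usd
    return None  # nothing fits on a single box: time to shard


print(pick_ccx_tier(45.8))  # ('CCX43', 125.0)
```

A `None` result is the signal to look at sharding, though as noted above that option usually becomes attractive well before you run out of tiers.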

What is excluded: embedding API cost (separate budget — text-embedding-3-small is ~$0.02 per 1M tokens), egress (some clouds bill $0.09/GB out), HA replicas, and the labour to set up monitoring on self-hosted. Realistic production bills run 30-50% higher than the calculator quote.
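
To put numbers on the excluded items, a back-of-the-envelope sketch like the one below may help. The function names are hypothetical, the token and egress volumes are inputs you would have to estimate yourself, and the 30-50% uplift is simply the range quoted above applied to the calculator quote.

```python
def embedding_cost(tokens: int, usd_per_million_tokens: float = 0.02) -> float:
    """Embedding API spend, e.g. text-embedding-3-small at ~$0.02 per 1M tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens


def egress_cost(gb_out: float, usd_per_gb: float = 0.09) -> float:
    """Egress spend on clouds that bill around $0.09/GB out."""
    return gb_out * usd_per_gb


def production_range(quote: float, low: float = 1.30, high: float = 1.50):
    """Apply the 30-50% production uplift to a calculator quote."""
    return quote * low, quote * high


# Hypothetical month: 100M tokens embedded, 50 GB egress, $125 quote.
print(embedding_cost(100_000_000))
print(egress_cost(50))
print(production_range(125))
```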
