Vector Database Cost Calculator: Pinecone vs Qdrant Cloud vs Weaviate vs pgvector

Live monthly cost estimate at 1M-100M vector scale, sized for the embedding model and traffic profile you actually use. Numbers come from public pricing pages reviewed in April 2026.

Updated April 25, 2026 · By Jim Liu

TL;DR

  • Under 1M vectors: Pinecone Serverless usually beats everything else on $/mo, often under $20.
  • 1M – 10M vectors: Qdrant Cloud or Weaviate flatten the cost curve. Pinecone starts climbing fast on read-heavy workloads.
  • 10M – 100M vectors: pgvector on a Hetzner CCX VPS runs 5-20x cheaper than SaaS, if your team has Postgres ops capacity.
  • High-QPS production app: Skip the cheapest option. Pay for a read replica and monitoring, or your $40/mo bill becomes a $4,000 outage.

Configure your workload

Embedding model: OpenAI text-embedding-3-small
Traffic profile: ~2.5M reads/mo, daily ingest

Working set RAM (HNSW): 45.8 GB
Disk storage: 35.8 GB
Cheapest at this scale: pgvector on Hetzner VPS, $125/mo, CCX43 (64GB)

Postgres extension on a CCX-line dedicated VPS. Cheapest at scale if you accept ops work.

Past 32GB working set, sharding (Citus / horizontal partition by tenant) usually wins.

Side-by-side monthly cost

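| Provider | Hosting | Tier | Compute | Storage | Reads | Writes | Total / mo |
|---|---|---|---|---|---|---|---|
| pgvector (self-host) | Self-hosted | CCX43 (64GB) | $120 | $5.00 | - | - | $125 |
| Pinecone | SaaS | Serverless (consumption) | - | $11.8 | $85.5 | $107 | $204 |
| Qdrant Cloud | SaaS | Provisioned 45.8GB RAM cluster | $461 | $2.15 | - | - | $464 |
| Weaviate Cloud | SaaS | Standard managed cluster | $729,600 | - | - | - | $729,600 |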

Numbers reflect public pricing pages (Pinecone serverless, Qdrant Cloud managed, Weaviate Standard, Hetzner CCX line) reviewed April 2026. Real bills vary 15-30% with region, commit discounts, and traffic burstiness. Self-hosted rows exclude on-call and monitoring labour cost; assume +$200-800/mo if you want 99.9% uptime.

How the math works

RAM sizing takes the raw vector bytes (vector count × dims × 4 bytes for float32) and multiplies them by 1.6× for HNSW overhead. We measured this on Qdrant 1.10 and pgvector 0.7 at M=16 / ef_construction=64 (the pgvector defaults). If you raise M to 32 or use ef_construction=200 for higher recall, expect 1.4-2x more RAM. Quantisation (PQ, scalar) cuts RAM 4-8x but typically costs 2-5% recall.
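The sizing rule above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name is hypothetical, the 5M vector count is an assumption (the page does not state it), and the result is reported in binary gigabytes (2^30 bytes), which is what appears to reproduce the 45.8 GB figure shown for text-embedding-3-small (1536 dims).

```python
BYTES_PER_FLOAT32 = 4
HNSW_OVERHEAD = 1.6   # multiplier quoted in the text for M=16 / ef_construction=64
GIB = 1024 ** 3       # assumption: the page reports binary gigabytes


def working_set_gb(num_vectors: int, dims: int, overhead: float = HNSW_OVERHEAD) -> float:
    """Estimated HNSW working-set RAM: raw float32 bytes times the overhead factor."""
    raw_bytes = num_vectors * dims * BYTES_PER_FLOAT32
    return raw_bytes * overhead / GIB


# Assumed workload: 5M vectors at 1536 dimensions.
print(f"{working_set_gb(5_000_000, 1536):.1f} GB")  # 45.8 GB
```

Passing a larger `overhead` (for example 1.6 × 1.4 for M=32) gives a rough upper estimate for higher-recall index settings.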

Pinecone Serverless is modelled as $0.33/GB-month storage + $1.10 per million read units + $8.25 per million write units. One read unit roughly corresponds to a single small query, and one write unit to a single vector upsert. Real RU consumption depends on top-k size, filter complexity, and namespace count, so expect ±50% variance.
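As a rough sketch of the consumption model described above, the three rates can be combined as follows. The function name and the example unit counts are hypothetical; how many read and write units a real workload consumes is exactly the part the text flags as uncertain, so they are left as inputs.

```python
PINECONE_STORAGE_PER_GB_MONTH = 0.33
PINECONE_PER_MILLION_READ_UNITS = 1.10
PINECONE_PER_MILLION_WRITE_UNITS = 8.25


def pinecone_serverless_monthly(storage_gb: float, read_units: float, write_units: float) -> dict:
    """Monthly cost breakdown using the three rates quoted in the text."""
    storage = storage_gb * PINECONE_STORAGE_PER_GB_MONTH
    reads = read_units / 1_000_000 * PINECONE_PER_MILLION_READ_UNITS
    writes = write_units / 1_000_000 * PINECONE_PER_MILLION_WRITE_UNITS
    return {"storage": storage, "reads": reads, "writes": writes,
            "total": storage + reads + writes}


# Hypothetical month: 35.8 GB stored, 10M read units, 1M write units.
bill = pinecone_serverless_monthly(35.8, 10_000_000, 1_000_000)
print({k: round(v, 2) for k, v in bill.items()})

# The text suggests +/-50% variance on read-unit consumption.
low, high = bill["reads"] * 0.5, bill["reads"] * 1.5
print(f"reads could land between ${low:.2f} and ${high:.2f}")
```

Keeping the breakdown as separate keys makes it easy to see which component dominates; on read-heavy workloads it is usually the read-unit line.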

Qdrant Cloud uses provisioned RAM-hours at roughly $0.014/GB-hour plus $0.06/GB-month durable storage. Minimum cluster is 4GB. Reads and writes are bundled into the cluster capacity, so very high QPS may force you to a larger tier than RAM alone would suggest.
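A minimal sketch of that provisioned model, assuming a 720-hour (30-day) month, which appears to match the compute figure in the table. The function name is hypothetical, and the 4GB minimum is applied as a simple floor on billed RAM.

```python
QDRANT_RAM_PER_GB_HOUR = 0.014
QDRANT_STORAGE_PER_GB_MONTH = 0.06
QDRANT_MIN_CLUSTER_GB = 4
HOURS_PER_MONTH = 720  # assumption: 30-day month


def qdrant_cloud_monthly(ram_gb: float, disk_gb: float, hours: int = HOURS_PER_MONTH):
    """Return (compute, storage, total) for a provisioned cluster."""
    billed_ram = max(ram_gb, QDRANT_MIN_CLUSTER_GB)  # smallest cluster is 4GB
    compute = billed_ram * QDRANT_RAM_PER_GB_HOUR * hours
    storage = disk_gb * QDRANT_STORAGE_PER_GB_MONTH
    return compute, storage, compute + storage


compute, storage, total = qdrant_cloud_monthly(45.8, 35.8)
print(f"compute ${compute:.2f}, storage ${storage:.2f}, total ${total:.0f}")
```

This covers RAM and disk only. It does not model the case where high QPS forces a larger tier than the working set alone requires.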

Weaviate Cloud Standard is priced as ~$0.095 per 1K vector-dimension-units per month with a $295 floor at the entry tier. Sandbox is free for 14-day demos but data wipes on expiry — use it for spikes, not production.
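The dimension-based pricing can be sketched as below, taking the rate and unit size exactly as stated in the paragraph above. The function name is hypothetical and the 5M vector count is an assumed workload. Because the result scales inversely with the unit size, `dims_per_unit` is left as a parameter so it can be checked against the current pricing page.

```python
WEAVIATE_RATE_PER_UNIT = 0.095
WEAVIATE_DIMS_PER_UNIT = 1_000   # unit size as stated in the text
WEAVIATE_ENTRY_FLOOR = 295.0     # minimum monthly charge at the entry tier


def weaviate_standard_monthly(num_vectors: int, dims: int,
                              dims_per_unit: int = WEAVIATE_DIMS_PER_UNIT) -> float:
    """Monthly cost from stored vector dimensions, never below the entry floor."""
    units = num_vectors * dims / dims_per_unit
    return max(units * WEAVIATE_RATE_PER_UNIT, WEAVIATE_ENTRY_FLOOR)


# Assumed workload: 5M vectors at 1536 dimensions.
print(f"${weaviate_standard_monthly(5_000_000, 1536):,.0f}")  # $729,600
```

Changing `dims_per_unit` by a factor of 1,000 changes the bill by the same factor, so this is the one input worth verifying before trusting the estimate.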

pgvector on Hetzner CCX picks the smallest CCX tier whose RAM exceeds your working set + 2GB OS overhead. CCX13 (8GB) ~$15, CCX23 (16GB) ~$30, CCX33 (32GB) ~$60, CCX43 (64GB) ~$120, CCX53 (128GB) ~$240. We add $5/mo for the backup volume. Past 32GB working set, sharding usually beats vertical scale.
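The tier selection described above amounts to a first-fit lookup. The sketch below uses the approximate prices quoted in the text; the function name is hypothetical, and returning `None` past CCX53 is an assumption standing in for "shard instead".

```python
# (tier, RAM in GB, approximate $/mo) as quoted in the text, smallest first
HETZNER_CCX = [
    ("CCX13", 8, 15),
    ("CCX23", 16, 30),
    ("CCX33", 32, 60),
    ("CCX43", 64, 120),
    ("CCX53", 128, 240),
]
OS_OVERHEAD_GB = 2
BACKUP_VOLUME_PER_MONTH = 5


def pick_ccx(working_set_gb: float):
    """Smallest tier whose RAM exceeds working set + OS overhead, with backup cost added."""
    needed = working_set_gb + OS_OVERHEAD_GB
    for name, ram_gb, price in HETZNER_CCX:
        if ram_gb > needed:
            return name, ram_gb, price + BACKUP_VOLUME_PER_MONTH
    return None  # no single tier fits; assumption: shard across nodes


print(pick_ccx(45.8))  # ('CCX43', 64, 125)
```

For the 45.8 GB working set shown earlier, this picks CCX43 at $125/mo, matching the pgvector row in the table.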

What is excluded: embedding API cost (separate budget — text-embedding-3-small is ~$0.02 per 1M tokens), egress (some clouds bill $0.09/GB out), HA replicas, and the labour to set up monitoring on self-hosted. Realistic production bills run 30-50% higher than the calculator quote.

FAQ

Three things move the bill outside the model: region (US-East is ~30% cheaper than EU on most SaaS), committed-use discounts (Pinecone 1-year prepay shaves ~20%), and traffic burst behaviour (a few daily spikes can push a serverless bill well past what the average QPS we model would predict).
