Documents

Number of documents

Avg document size (tokens)

Chunk size (tokens)

Chunk overlap (tokens)

Retrieval

Queries per day

Chunks retrieved per query

Generation model

Avg output tokens per response

Embedding

Embedding model

Provider: OpenAI

Dimensions: 1,536

Max input: 8,191 tokens

Price: $0.02/1M tokens

Cost Breakdown

2,000Total Chunks

$0.020Indexing (one-time)

$0.016Per Query

$46.89Monthly Total

One-Time Indexing

Total chunks to embed	2,000
Tokens per chunk	512
Total tokens to embed	1,024,000
Indexing cost	$0.020

Monthly Query Costs

Monthly queries	3,000
Query embedding cost	$0.0030/mo
LLM generation cost2,710 input + 500 output tokens/query	$46.89/mo
Total monthly cost	$46.89/mo

Save your results — get the cheatsheet

Drop your email and I'll send over my RAG Architecture Playbook (free PDF). You'll also join the newsletter — unsubscribe anytime.

Or preview the RAG Architecture Playbook landing page →

Estimates assume ~50 tokens per query and ~100 system prompt tokens. Actual costs depend on your tokenizer, chunk strategy, and provider pricing. Vector database storage costs (e.g., Pinecone, Weaviate) are not included. All calculations run locally in your browser.

<iframe src="https://www.kunalganglani.com/embed/rag-cost-calculator" width="100%" height="600" style="border:none;border-radius:8px;" loading="lazy"></iframe>