Estimate the full cost of a Retrieval-Augmented Generation pipeline: document embedding, query costs, and LLM generation. Adjust parameters to find the most cost-effective setup.
| Total chunks to embed | 2,000 |
| Tokens per chunk | 512 |
| Total tokens to embed | 1,024,000 |
| Indexing cost | $0.020 |
| Monthly queries | 3,000 |
| Query embedding cost | $0.0030/mo |
| LLM generation cost (2,710 input + 500 output tokens/query) | $46.89/mo |
| Total monthly cost | $46.89/mo |
Estimates assume ~50 tokens per query and ~100 system prompt tokens. Actual costs depend on your tokenizer, chunking strategy, and provider pricing. Vector database storage costs (e.g., Pinecone, Weaviate) are not included. All calculations run locally in your browser.
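The arithmetic behind the figures above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the calculator's actual code. The page does not state its per-token prices or how many chunks are retrieved per query, so those values are assumptions: `retrieved_chunks = 5` is inferred from 2,710 − 50 − 100 = 2,560 = 5 × 512, and the three prices ($0.02, $3.00, and $15.00 per million tokens) are picked only because they reproduce the displayed numbers. All names (`RagCostInputs`, `estimate_rag_cost`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RagCostInputs:
    # Values shown on the page
    chunks: int = 2_000
    tokens_per_chunk: int = 512
    monthly_queries: int = 3_000
    query_tokens: int = 50            # "~50 tokens per query"
    system_prompt_tokens: int = 100   # "~100 system prompt tokens"
    output_tokens: int = 500
    # ASSUMPTIONS: not stated on the page, chosen to reproduce its figures
    retrieved_chunks: int = 5
    embed_price_per_m: float = 0.02        # USD per 1M embedding tokens
    llm_input_price_per_m: float = 3.00    # USD per 1M LLM input tokens
    llm_output_price_per_m: float = 15.00  # USD per 1M LLM output tokens


def estimate_rag_cost(p: RagCostInputs) -> dict:
    """Return the one-time indexing cost and the recurring monthly costs in USD."""
    million = 1_000_000

    # One-time cost: embed every chunk once.
    index_tokens = p.chunks * p.tokens_per_chunk
    indexing_cost = index_tokens / million * p.embed_price_per_m

    # Monthly cost: embed each incoming query.
    query_embedding_cost = (
        p.monthly_queries * p.query_tokens / million * p.embed_price_per_m
    )

    # Monthly cost: the LLM reads query + system prompt + retrieved chunks,
    # then writes the answer.
    input_tokens_per_query = (
        p.query_tokens
        + p.system_prompt_tokens
        + p.retrieved_chunks * p.tokens_per_chunk
    )
    generation_cost = p.monthly_queries * (
        input_tokens_per_query * p.llm_input_price_per_m
        + p.output_tokens * p.llm_output_price_per_m
    ) / million

    return {
        "index_tokens": index_tokens,
        "indexing_cost": indexing_cost,
        "query_embedding_cost": query_embedding_cost,
        "input_tokens_per_query": input_tokens_per_query,
        "generation_cost": generation_cost,
        # The one-time indexing cost is kept out of the monthly total.
        "monthly_total": query_embedding_cost + generation_cost,
    }


if __name__ == "__main__":
    for name, value in estimate_rag_cost(RagCostInputs()).items():
        print(f"{name}: {value}")
```

With these defaults the sketch gives about $0.020 for indexing, $0.0030/mo for query embedding, and $46.89/mo for generation, which matches the table. The monthly total comes to about $46.89 only if the one-time indexing cost is left out, so the sketch keeps it separate. Generation dominates the bill, so the number of retrieved chunks and the output length are the parameters most worth tuning.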