Question 1

How accurate are the throughput numbers?

Accepted Answer

They come from a memory-bandwidth roofline model that assumes 55% memory-bandwidth utilization (MBU) and caps compute at 45% model FLOPs utilization (MFU). This is the same first-principles method serving teams use for capacity planning. Treat the absolute numbers as estimates; the relative ranking between GPUs is robust, and that is the part to trust.
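For readers who want to see the shape of the calculation, here is a minimal sketch of a roofline estimate. It is an illustration under stated assumptions, not the calculator's actual code: the function names are hypothetical, the hardware figures are made-up round numbers rather than any real GPU, KV-cache traffic is ignored, and compute cost is approximated as 2 FLOPs per parameter per token.

```python
def roofline_tokens_per_s(params_b, bytes_per_param, bandwidth_gbs,
                          peak_tflops, batch=1, mbu=0.55, mfu=0.45):
    """Estimate decode throughput (tokens/s) as the lower of two ceilings.

    Assumptions (illustrative only): weights are streamed once per decode
    step, KV-cache reads are ignored, and each token costs about
    2 FLOPs per parameter.
    """
    weight_bytes = params_b * 1e9 * bytes_per_param
    # Memory ceiling: one pass over the weights yields `batch` tokens.
    mem_bound = batch * (mbu * bandwidth_gbs * 1e9) / weight_bytes
    # Compute ceiling: usable FLOP/s divided by FLOPs per token.
    compute_bound = (mfu * peak_tflops * 1e12) / (2 * params_b * 1e9)
    return min(mem_bound, compute_bound)


def dollars_per_million_tokens(price_per_hour, tokens_per_s):
    """Convert an hourly rental price into $/1M generated tokens."""
    return price_per_hour / (tokens_per_s * 3600) * 1e6


# Hypothetical GPU: 2000 GB/s bandwidth, 400 TFLOPS peak, $2.00/hour.
# Hypothetical model: 8B parameters stored at 2 bytes each.
tps_single = roofline_tokens_per_s(8, 2, 2000, 400, batch=1)
tps_batched = roofline_tokens_per_s(8, 2, 2000, 400, batch=256)
cost_single = dollars_per_million_tokens(2.00, tps_single)
```

At batch size 1 the memory ceiling is the binding one; at a large batch the compute ceiling takes over. That crossover is why the model carries both an MBU and an MFU figure.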

Question 2

Where do prices come from?

Accepted Answer

Public on-demand pricing from Lambda, RunPod, Vast.ai, AWS, GCP, Vultr, and TensorWave, hand-verified and date-stamped per entry. Current dataset: 2026-06, 26 offers across 12 GPUs.

Question 3

Does it handle MoE models?

Accepted Answer

Yes. VRAM is sized from total parameters and throughput from active parameters, so DeepSeek V3 or Qwen3-235B rank realistically instead of looking impossibly expensive.
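A rough sketch of that split, using DeepSeek V3's published parameter counts (671B total, about 37B active per token). Everything else is an assumption for illustration: the function names are hypothetical, weights are taken as 1 byte per parameter, KV cache and activations are left out of the VRAM figure, and the aggregate bandwidth is a made-up round number.

```python
def weights_vram_gb(total_params_b, bytes_per_param):
    """VRAM for weights only: every expert must be resident in memory."""
    return total_params_b * bytes_per_param


def mem_bound_tokens_per_s(active_params_b, bytes_per_param,
                           bandwidth_gbs, mbu=0.55):
    """Memory-bound decode rate: only active parameters are read per token."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return (mbu * bandwidth_gbs * 1e9) / bytes_per_token


TOTAL_B, ACTIVE_B = 671, 37   # DeepSeek V3: total vs. active parameters
BYTES_PER_PARAM = 1           # assumption: 8-bit weights
BANDWIDTH_GBS = 8000          # hypothetical aggregate bandwidth of a node

vram_gb = weights_vram_gb(TOTAL_B, BYTES_PER_PARAM)
moe_tps = mem_bound_tokens_per_s(ACTIVE_B, BYTES_PER_PARAM, BANDWIDTH_GBS)
# What the estimate would be if all 671B parameters were read per token.
dense_tps = mem_bound_tokens_per_s(TOTAL_B, BYTES_PER_PARAM, BANDWIDTH_GBS)
speedup = moe_tps / dense_tps
```

Under these assumptions, treating the model as dense would understate throughput by a factor of roughly 671/37 (about 18) and overstate cost per token by the same factor. That is the "impossibly expensive" result the active-parameter sizing avoids.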

Question 4

Is the calculator free?

Accepted Answer

The single-scenario calculator is free forever, no signup. Pro (₹999/$15 per month) adds saved scenarios, a monthly price-change feed for your stack, and shareable cost reports.

Know your $/1M tokens before you rent the GPU.
