Know your $/1M tokens before you rent the GPU.
Blog benchmarks go stale the month they're published. GPU CostWise keeps a living price/perf dataset — H100, B200, MI355X and friends — and computes what your model actually costs to serve.
Cost calculator
Free, no signup. Pick a model, set your volume, get a ranked stack.
Set your workload and hit COMPUTE COST to rank 26 GPU configs by $/1M tokens.
How the numbers are computed
FAQ
They come from a memory-bandwidth roofline model (55% MBU, 45% MFU compute ceiling) — the same first-principles method serving teams use for capacity planning. Absolute numbers are estimates; the relative ranking between GPUs is what you should trust, and that is robust.
Public on-demand pricing from Lambda, RunPod, Vast.ai, AWS, GCP, Vultr and TensorWave, hand-verified and stamped per entry. Current dataset: 2026-06, 26 offers across 12 GPUs.
Yes. VRAM is sized from total parameters, throughput from active parameters — so DeepSeek V3 or Qwen3-235B rank realistically instead of looking impossibly expensive.
The single-scenario calculator is free forever, no signup. Pro (₹999/$15 per month) adds saved scenarios, a monthly price-change feed for your stack, and shareable cost reports.