Find the right GPU for inference, fine-tuning, or training open-weight language models on Bunya.
Covers Llama, Qwen, DeepSeek, Gemma, Mistral, GLM, Phi, and more.
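Approximate VRAM needed for inference (model weights plus KV cache overhead) at FP16, INT8, and INT4 precision. Fine-tuning requires significantly more; see the picker above. MoE models must load all parameters into memory even though only a subset is active per token, so size them by total parameter count, not active parameter count.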
| Model Family | Variant | Params | FP16 | INT8 | INT4 | Bunya GPU (inference, FP16) |
|---|---|---|---|---|---|---|
| Llama 3.1/3.3 | 8B | 8B | ~18 GB | ~10 GB | ~5 GB | A100 MIG 20GB+, MI210 |
| Llama 3.1/3.3 | 70B | 70B | ~168 GB | ~85 GB | ~44 GB | MI300x (192GB) or 3× H100 |
| Llama 3.1/3.3 | 405B | 405B | ~970 GB | ~485 GB | ~245 GB | Multi-node (H100 SXM5 or MI300x) |
| Llama 4 | Scout (MoE) | 109B (17B active) | ~260 GB | ~130 GB | ~66 GB | 2× MI300x or 4× H100 |
| Llama 4 | Maverick (MoE) | 400B (17B active) | ~960 GB | ~480 GB | ~240 GB | Multi-node (H100 SXM5 or MI300x) |
| Qwen 3 / 3.5 | 7-8B | 7-8B | ~18 GB | ~10 GB | ~5 GB | A100 MIG 20GB+, MI210 |
| Qwen 3 / 3.5 | 27-32B | 27-32B | ~70 GB | ~36 GB | ~18 GB | H100 (80GB) or MI210 (64GB, INT8) |
| Qwen 3 / 3.5 | 72B | 72B | ~173 GB | ~87 GB | ~44 GB | MI300x or 3× H100 |
| Qwen 3 / 3.5 | 235B (MoE, 22B active) | 235B | ~564 GB | ~282 GB | ~141 GB | 4× H100 SXM5 or 2× MI300x (INT8) |
| DeepSeek | R1-distill 7/8B | 7-8B | ~18 GB | ~10 GB | ~5 GB | A100 MIG 20GB+, MI210 |
| DeepSeek | R1-distill 14B | 14B | ~34 GB | ~17 GB | ~9 GB | A100 MIG 40GB, MI210 (64GB) |
| DeepSeek | R1-distill 70B | 70B | ~168 GB | ~85 GB | ~44 GB | MI300x or 3× H100 |
| DeepSeek | V3.2 / R1 (MoE) | 671-685B (37B active) | ~1.6 TB | ~820 GB | ~410 GB | Multi-node MI300x cluster |
| Gemma | 2/3 9B | 9B | ~22 GB | ~11 GB | ~6 GB | A100 MIG 40GB, MI210 |
| Gemma | 2/3 27B | 27B | ~65 GB | ~33 GB | ~17 GB | H100 (80GB) or MI210 (64GB, INT8) |
| Gemma 4 | 27B (MoE) | 27B (14B active) | ~65 GB | ~33 GB | ~17 GB | H100 (80GB) or MI210 (64GB, INT8) |
| Mistral | 7B / Nemo 12B | 7-12B | ~18-29 GB | ~9-15 GB | ~5-8 GB | A100 MIG 20-40GB, MI210 |
| Mistral | Small 3 (24B) | 24B | ~58 GB | ~29 GB | ~15 GB | MI210 (64GB) or H100 (80GB) |
| Mistral | Large 2 (123B) | 123B | ~295 GB | ~148 GB | ~74 GB | 2× MI300x or 4× H100 |
| Phi | 3/4-mini (3-4B) | 3-4B | ~10 GB | ~5 GB | ~3 GB | A100 MIG 10GB |
| Phi | 4 (14B) | 14B | ~34 GB | ~17 GB | ~9 GB | A100 MIG 40GB, MI210 (64GB) |
| GLM | 4.7 (355B) | 355B | ~852 GB | ~426 GB | ~213 GB | Multi-node MI300x or H100 SXM5 |
| GLM | 5 (744B) | 744B | ~1.8 TB | ~893 GB | ~447 GB | Multi-node MI300x cluster |
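
For a model that is not listed, the figures above can be approximated with a rule of thumb. The sketch below assumes the footprint is bytes per parameter times total parameters times a 1.2 overhead factor. That factor is inferred from the FP16 and INT8 columns, not taken from a stated formula, and the INT4 column runs slightly higher than this estimate. The function names are illustrative, and the per-GPU capacities are the ones quoted in the table.

```python
import math

# Bytes per parameter for each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Per-device memory in GB, as quoted in the table above.
GPU_MEMORY_GB = {"H100": 80, "MI210": 64, "MI300x": 192}


def estimate_vram_gb(params_billion, precision="fp16", overhead=1.2):
    """Rough inference footprint in GB.

    overhead=1.2 is an assumed allowance for KV cache and runtime buffers.
    For MoE models pass the TOTAL parameter count, not the active count.
    """
    return params_billion * BYTES_PER_PARAM[precision] * overhead


def gpus_needed(required_gb, gpu):
    """Smallest number of devices whose combined memory covers required_gb."""
    return math.ceil(required_gb / GPU_MEMORY_GB[gpu])


if __name__ == "__main__":
    for name, params in [("70B dense", 70), ("109B MoE (total)", 109)]:
        need = estimate_vram_gb(params)
        print(
            f"{name}: ~{need:.0f} GB FP16, "
            f"{gpus_needed(need, 'H100')}x H100 or "
            f"{gpus_needed(need, 'MI300x')}x MI300x"
        )
```

Treat the result as a lower bound on device count: long context windows and large batch sizes grow the KV cache beyond this allowance, and multi-GPU serving frameworks may require a device count that divides the model evenly.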