Available Memory: 64 GB (Unified Memory)
Memory Bandwidth: 273 GB/s
Why Bandwidth Matters
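LLM token generation is typically memory-bound: producing each new token requires reading the model's weights from memory, so generation speed scales roughly with memory bandwidth. For this workload, bandwidth usually matters more than raw compute power.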
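As a rough illustration of the bandwidth argument, the sketch below estimates tokens per second as usable bandwidth divided by the bytes of weights read per token. The function name, the 2 bytes per parameter (FP16), and the 0.6 efficiency factor are assumptions made for this example; the page does not state how its figures are computed, although these defaults happen to match several of the listed values at 273 GB/s.

```python
def estimate_tokens_per_second(params_billion, bandwidth_gb_s,
                               bytes_per_param=2.0, efficiency=0.6):
    """Rough decode-speed estimate for a memory-bound model.

    Assumes each generated token streams all weights from memory once.
    bytes_per_param=2.0 corresponds to FP16. efficiency=0.6 is an assumed
    fraction of peak bandwidth, not a value stated by the page.
    """
    if params_billion <= 0 or bandwidth_gb_s <= 0:
        raise ValueError("params_billion and bandwidth_gb_s must be positive")
    # Weights only; KV cache and activations are ignored in this sketch.
    weight_gb = params_billion * bytes_per_param
    return efficiency * bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    examples = [("Llama 3.2 1B", 1.0), ("Qwen 2.5 3B", 3.0), ("Codestral 22B", 22.0)]
    for name, params in examples:
        rate = estimate_tokens_per_second(params, 273.0)
        print(f"{name}: {rate:.1f} t/s")
```

With the assumed defaults this gives about 81.9, 27.3, and 3.7 t/s for the three examples. Real throughput also depends on quantization, context length, and the runtime, so treat the result as an order-of-magnitude guide only.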
Top Picks for Your Hardware
Qwen 2.5 1.5B
Qwen • 1.5B
FP16 • very high
3.5 GB / 64 GB
Fast • 54.6 t/s
General, Coding
ollama run qwen2.5:1.5b

StableLM 2 1.6B
StableLM • 1.6B
FP16 • very high
3.4 GB / 64 GB
Fast • 51.2 t/s
General, Coding
ollama run stablelm2:1.6b

GLM Edge 1.5B
GLM • 1.5B
FP16 • very high
3.2 GB / 64 GB
Fast • 54.6 t/s
General
ollama run glm-edge:1.5b

All Compatible Models
Llama 3.2 1B
Llama • 1B
FP16 • very high
2.3 GB / 64 GB
Fast • 81.9 t/s
General
ollama run llama3.2:1b

Granite 3 MoE 1B
Granite • 1B
FP16 • very high
2.3 GB / 64 GB
Fast • 81.9 t/s
General, Coding
ollama run granite3-moe:1b

Qwen 2.5 0.5B
Qwen • 0.5B
FP16 • very high
1.2 GB / 64 GB
Fast • 164 t/s
General
ollama run qwen2.5:0.5b

SmolLM2 360M
SmolLM • 0.36B
FP16 • very high
0.8 GB / 64 GB
Fast • 228 t/s
General
ollama run smollm2:360m

Gemma 2 2B
Gemma • 2B
FP16 • very high
5.0 GB / 64 GB
Fast • 40.9 t/s
General
ollama run gemma2:2b

EXAONE 3.5 2.4B
EXAONE • 2.4B
FP16 • very high
5.0 GB / 64 GB
Fast • 34.1 t/s
General, Coding
ollama run exaone3.5:2.4b

SmolLM2 135M
SmolLM • 0.135B
FP16 • very high
0.3 GB / 64 GB
Fast • 607 t/s
General
ollama run smollm2:135m

Granite 3 Dense 2B
Granite • 2B
FP16 • very high
4.2 GB / 64 GB
Fast • 40.9 t/s
General, Coding
ollama run granite3-dense:2b

SmolLM2 1.7B
SmolLM • 1.7B
FP16 • very high
3.5 GB / 64 GB
Fast • 48.2 t/s
General, Coding
ollama run smollm2:1.7b

Qwen 2.5 3B
Qwen • 3B
FP16 • very high
7.0 GB / 64 GB
Good • 27.3 t/s
General, Coding
ollama run qwen2.5:3b

Llama 3.2 3B
Llama • 3B
FP16 • very high
6.5 GB / 64 GB
Good • 27.3 t/s
General, Coding
ollama run llama3.2:3b

StarCoder2 3B
StarCoder • 3B
FP16 • very high
6.5 GB / 64 GB
Good • 27.3 t/s
Coding
ollama run starcoder2:3b

Kimi K1.5 A3B
Kimi • 3B
FP16 • very high
6.5 GB / 64 GB
Good • 27.3 t/s
General, Reasoning, Math
ollama run kimi-k1.5:a3b

Granite 3 MoE 3B
Granite • 3B
FP16 • very high
6.5 GB / 64 GB
Good • 27.3 t/s
General, Coding
ollama run granite3-moe:3b

Phi-3 Mini (3.8B)
Phi • 3.8B
FP16 • very high
7.8 GB / 64 GB
Good • 21.6 t/s
General, Coding, Reasoning
ollama run phi3:mini

GLM Edge 4B
GLM • 4B
FP16 • very high
8.2 GB / 64 GB
Good • 20.5 t/s
General, Coding
ollama run glm-edge:4b

Codestral 22B
Mistral • 22B
FP16 • very high
44.0 GB / 64 GB
Very Slow • 3.7 t/s
Coding
ollama run codestral:22b

Qwen 2.5 14B
Qwen • 14B
FP16 • very high
29.0 GB / 64 GB
Slow • 5.9 t/s
General, Coding, Reasoning, Math
ollama run qwen2.5:14b

InternLM 2.5 20B
InternLM • 20B
FP16 • very high
40.0 GB / 64 GB
Very Slow • 4.1 t/s
General, Coding, Reasoning, Math
ollama run internlm2:20b

Sailor2 20B
Sailor • 20B
FP16 • very high
40.0 GB / 64 GB
Very Slow • 4.1 t/s
General, Coding, Reasoning
ollama run sailor2:20b

Code Llama 13B
Llama • 13B
FP16 • very high
26.0 GB / 64 GB
Slow • 6.3 t/s
Coding
ollama run codellama:13b

Orca 2 13B
Orca • 13B
FP16 • very high
26.0 GB / 64 GB
Slow • 6.3 t/s
General, Reasoning
ollama run orca2:13b