Prompt: (raw) (yaml)
words:1 bytes:3
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 24 | 128 | 2.012727541s | 13.691666ms | 21 token(s) | 384.270083ms | 54.65 tokens/s | 34 token(s) | 1.614058625s | 21.06 tokens/s | 6.7B | 9.4 GB | 16384 | 8192 | 100% GPU |
cogito:3b | 7 | 34 | 454.339125ms | 31.105ms | 11 token(s) | 180.754208ms | 60.86 tokens/s | 10 token(s) | 241.839959ms | 41.35 tokens/s | 3.6B | 4.0 GB | 131072 | 8192 | 100% GPU |
cogito:8b | 7 | 34 | 939.180792ms | 31.219042ms | 11 token(s) | 413.528375ms | 26.60 tokens/s | 10 token(s) | 493.855584ms | 20.25 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
deepcoder:1.5b | 8 | 41 | 350.143333ms | 27.86025ms | 4 token(s) | 75.338334ms | 53.09 tokens/s | 16 token(s) | 246.33325ms | 64.95 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 376.409167ms | 28.453084ms | 4 token(s) | 97.854792ms | 40.88 tokens/s | 16 token(s) | 249.506333ms | 64.13 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 4.278190958s | 26.707333ms | 4 token(s) | 2.650595458s | 1.51 tokens/s | 16 token(s) | 1.59930775s | 10.00 tokens/s | 14.8B | 11 GB | 131072 | 8192 | 6%/94% CPU/GPU |
deepseek-r1:8b | 8 | 38 | 14.978978375s | 27.799375ms | 3 token(s) | 235.39525ms | 12.74 tokens/s | 248 token(s) | 14.715225834s | 16.85 tokens/s | 8.2B | 7.6 GB | 131072 | 8192 | 100% GPU |
dolphin-mistral:7b | 7 | 37 | 819.667125ms | 14.768625ms | 29 token(s) | 372.703ms | 77.81 tokens/s | 10 token(s) | 431.503083ms | 23.17 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
dolphin3:8b | 7 | 36 | 1.017390917s | 30.202875ms | 24 token(s) | 474.778209ms | 50.55 tokens/s | 10 token(s) | 511.78925ms | 19.54 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
gemma3:1b | 19 | 106 | 569.480959ms | 52.049959ms | 10 token(s) | 104.383833ms | 95.80 tokens/s | 29 token(s) | 412.486084ms | 70.31 tokens/s | 999.89M | 2.0 GB | 32768 | 8192 | 100% GPU |
gemma3:4b | 23 | 122 | 1.525136125s | 51.037ms | 10 token(s) | 264.550041ms | 37.80 tokens/s | 34 token(s) | 1.208988667s | 28.12 tokens/s | 4.3B | 5.9 GB | 131072 | 8192 | 100% GPU |
gemma3n:e2b | 26 | 136 | 1.50197325s | 50.735209ms | 10 token(s) | 226.064209ms | 44.24 tokens/s | 38 token(s) | 1.224547833s | 31.03 tokens/s | 4.5B | 4.8 GB | 32768 | 8192 | 100% GPU |
gemma3n:e4b | 30 | 174 | 2.980086917s | 54.259209ms | 10 token(s) | 771.078959ms | 12.97 tokens/s | 48 token(s) | 2.1542075s | 22.28 tokens/s | 6.9B | 6.2 GB | 32768 | 8192 | 100% GPU |
gemma:2b | 15 | 68 | 624.361583ms | 30.711875ms | 23 token(s) | 143.649542ms | 160.11 tokens/s | 22 token(s) | 449.452875ms | 48.95 tokens/s | 2.5B | 3.0 GB | 8192 | 8192 | 100% GPU |
granite3.3:2b | 7 | 36 | 471.273792ms | 18.81675ms | 44 token(s) | 247.371833ms | 177.87 tokens/s | 10 token(s) | 204.336167ms | 48.94 tokens/s | 2.5B | 3.3 GB | 131072 | 8192 | 100% GPU |
granite3.3:8b | 22 | 129 | 2.530903041s | 18.989833ms | 44 token(s) | 728.010209ms | 60.44 tokens/s | 31 token(s) | 1.783157416s | 17.38 tokens/s | 8.2B | 7.9 GB | 131072 | 8192 | 100% GPU |
hermes3:8b | 30 | 151 | 2.361233542s | 31.854209ms | 10 token(s) | 519.417375ms | 19.25 tokens/s | 36 token(s) | 1.809498167s | 19.90 tokens/s | 8.0B | 6.7 GB | 131072 | 8192 | 100% GPU |
llama3.1:8b-instruct-q4_1 | 17 | 83 | 1.694726959s | 31.69ms | 11 token(s) | 410.469792ms | 26.80 tokens/s | 21 token(s) | 1.252014375s | 16.77 tokens/s | 8.0B | 7.2 GB | 131072 | 8192 | 100% GPU |
llama3.2:1b | 6 | 27 | 279.841125ms | 31.378792ms | 26 token(s) | 127.127792ms | 204.52 tokens/s | 8 token(s) | 120.75075ms | 66.25 tokens/s | 1.2B | 2.8 GB | 131072 | 8192 | 100% GPU |
llama3.2:3b | 6 | 29 | 449.876833ms | 30.982208ms | 26 token(s) | 229.2165ms | 113.43 tokens/s | 8 token(s) | 189.055417ms | 42.32 tokens/s | 3.2B | 4.0 GB | 131072 | 8192 | 100% GPU |
llava-llama3:8b | 39 | 209 | 2.973242458s | 32.032833ms | 11 token(s) | 370.601833ms | 29.68 tokens/s | 46 token(s) | 2.570017125s | 17.90 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 20 | 101 | 959.324291ms | 14.267041ms | 11 token(s) | 251.693459ms | 43.70 tokens/s | 24 token(s) | 692.789208ms | 34.64 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 7 | 36 | 902.032166ms | 13.668666ms | 9 token(s) | 404.207333ms | 22.27 tokens/s | 11 token(s) | 483.542208ms | 22.75 tokens/s | 7.2B | 7.0 GB | 32768 | 8192 | 100% GPU |
minicpm-v:8b | 45 | 232 | 3.032024667s | 26.514334ms | 9 token(s) | 320.99325ms | 28.04 tokens/s | 54 token(s) | 2.6839545s | 20.12 tokens/s | 7.6B | 6.8 GB | 32768 | 8192 | 100% GPU |
mistral:7b | 46 | 242 | 3.069221209s | 14.462875ms | 6 token(s) | 318.883917ms | 18.82 tokens/s | 56 token(s) | 2.735204875s | 20.47 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
mistral:7b-instruct | 297 | 1730 | 22.196721584s | 14.108125ms | 5 token(s) | 227.340709ms | 21.99 tokens/s | 448 token(s) | 21.954625166s | 20.41 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
qwen2.5-coder:7b | 7 | 36 | 842.606375ms | 28.276584ms | 30 token(s) | 341.150625ms | 87.94 tokens/s | 10 token(s) | 472.620667ms | 21.16 tokens/s | 7.6B | 6.0 GB | 32768 | 8192 | 100% GPU |
qwen2.5vl:3b | 7 | 34 | 537.538041ms | 30.257916ms | 21 token(s) | 265.227333ms | 79.18 tokens/s | 10 token(s) | 241.504875ms | 41.41 tokens/s | 3.8B | 6.2 GB | 128000 | 8192 | 100% GPU |
qwen2.5vl:7b | 7 | 36 | 893.769542ms | 30.154834ms | 21 token(s) | 392.339667ms | 53.53 tokens/s | 10 token(s) | 470.7725ms | 21.24 tokens/s | 8.3B | 9.1 GB | 128000 | 8192 | 100% GPU |
qwen3:0.6b | 8 | 41 | 1.292759042s | 28.554ms | 11 token(s) | 90.813917ms | 121.13 tokens/s | 123 token(s) | 1.172873167s | 104.87 tokens/s | 751.63M | 2.3 GB | 40960 | 8192 | 100% GPU |
qwen3:1.7b | 8 | 41 | 2.112287666s | 26.238208ms | 11 token(s) | 136.211042ms | 80.76 tokens/s | 106 token(s) | 1.9492395s | 54.38 tokens/s | 2.0B | 3.0 GB | 40960 | 8192 | 100% GPU |
qwen3:14b | 8 | 41 | 14.445739917s | 26.11675ms | 11 token(s) | 3.192440333s | 3.45 tokens/s | 106 token(s) | 11.222667417s | 9.45 tokens/s | 14.8B | 12 GB | 40960 | 8192 | 5%/95% CPU/GPU |
qwen3:4b | 8 | 41 | 8.568415208s | 29.547333ms | 11 token(s) | 228.941542ms | 48.05 tokens/s | 229 token(s) | 8.309245667s | 27.56 tokens/s | 4.0B | 5.3 GB | 40960 | 8192 | 100% GPU |
qwen3:8b | 20 | 102 | 10.300869875s | 29.6745ms | 11 token(s) | 364.159666ms | 30.21 tokens/s | 168 token(s) | 9.906456s | 16.96 tokens/s | 8.2B | 7.6 GB | 40960 | 8192 | 100% GPU |
smollm2:1.7b | 7 | 34 | 430.936208ms | 17.824667ms | 30 token(s) | 198.989292ms | 150.76 tokens/s | 10 token(s) | 213.495792ms | 46.84 tokens/s | 1.7B | 4.7 GB | 8192 | 8192 | 100% GPU |
smollm2:135m | 58 | 288 | 497.961292ms | 18.285667ms | 31 token(s) | 44.659834ms | 694.14 tokens/s | 67 token(s) | 434.418125ms | 154.23 tokens/s | 134.52M | 1.2 GB | 8192 | 8192 | 100% GPU |
smollm2:360m | 7 | 34 | 199.286958ms | 18.7875ms | 31 token(s) | 76.008708ms | 407.85 tokens/s | 10 token(s) | 103.826292ms | 96.31 tokens/s | 361.82M | 1.9 GB | 8192 | 8192 | 100% GPU |
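
The two rate columns appear to be derived from the neighbouring count and duration columns: tokens divided by elapsed seconds. The sketch below recomputes them from the cell text as a sanity check. The helper names (`parse_duration`, `parse_count`, `rate`) are hypothetical and not part of any Ollama tooling, and the parser only handles the `s` and `ms` suffixes that occur in this table.

```python
import re

# Seconds per unit, for the two duration suffixes that occur in the table.
# Assumption: other suffixes would need to be added here.
_UNIT_SECONDS = {"ms": 1e-3, "s": 1.0}


def parse_duration(cell: str) -> float:
    """Convert a single-unit duration cell such as '384.270083ms' to seconds."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)(ms|s)", cell.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


def parse_count(cell: str) -> int:
    """Extract the integer from a count cell such as '34 token(s)'."""
    return int(cell.split()[0])


def rate(count_cell: str, duration_cell: str) -> float:
    """Tokens per second: token count divided by duration in seconds."""
    return parse_count(count_cell) / parse_duration(duration_cell)


if __name__ == "__main__":
    # codellama:7b row: eval count / eval duration, then prompt eval count / duration.
    print(f"{rate('34 token(s)', '1.614058625s'):.2f} tokens/s")  # 21.06 tokens/s
    print(f"{rate('21 token(s)', '384.270083ms'):.2f} tokens/s")  # 54.65 tokens/s
```

Both results match the Eval rate and Prompt eval rate cells in the codellama:7b row, which supports reading those columns as plain count-over-duration figures.
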

System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 8192 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 11G + 455M |
Sys OS | Darwin 24.5.0 |