ollama-multirun: hi: 20250713-163234

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: words: 1, bytes: 3

| Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| codellama:7b | 24 | 128 | 2.012727541s | 13.691666ms | 21 token(s) | 384.270083ms | 54.65 tokens/s | 34 token(s) | 1.614058625s | 21.06 tokens/s | 6.7B | 9.4 GB | 16384 | 8192 | 100% GPU |
| cogito:3b | 7 | 34 | 454.339125ms | 31.105ms | 11 token(s) | 180.754208ms | 60.86 tokens/s | 10 token(s) | 241.839959ms | 41.35 tokens/s | 3.6B | 4.0 GB | 131072 | 8192 | 100% GPU |
| cogito:8b | 7 | 34 | 939.180792ms | 31.219042ms | 11 token(s) | 413.528375ms | 26.60 tokens/s | 10 token(s) | 493.855584ms | 20.25 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
| deepcoder:1.5b | 8 | 41 | 350.143333ms | 27.86025ms | 4 token(s) | 75.338334ms | 53.09 tokens/s | 16 token(s) | 246.33325ms | 64.95 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
| deepseek-r1:1.5b | 8 | 41 | 376.409167ms | 28.453084ms | 4 token(s) | 97.854792ms | 40.88 tokens/s | 16 token(s) | 249.506333ms | 64.13 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
| deepseek-r1:14b | 8 | 41 | 4.278190958s | 26.707333ms | 4 token(s) | 2.650595458s | 1.51 tokens/s | 16 token(s) | 1.59930775s | 10.00 tokens/s | 14.8B | 11 GB | 131072 | 8192 | 6%/94% CPU/GPU |
| deepseek-r1:8b | 8 | 38 | 14.978978375s | 27.799375ms | 3 token(s) | 235.39525ms | 12.74 tokens/s | 248 token(s) | 14.715225834s | 16.85 tokens/s | 8.2B | 7.6 GB | 131072 | 8192 | 100% GPU |
| dolphin-mistral:7b | 7 | 37 | 819.667125ms | 14.768625ms | 29 token(s) | 372.703ms | 77.81 tokens/s | 10 token(s) | 431.503083ms | 23.17 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
| dolphin3:8b | 7 | 36 | 1.017390917s | 30.202875ms | 24 token(s) | 474.778209ms | 50.55 tokens/s | 10 token(s) | 511.78925ms | 19.54 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
| gemma3:1b | 19 | 106 | 569.480959ms | 52.049959ms | 10 token(s) | 104.383833ms | 95.80 tokens/s | 29 token(s) | 412.486084ms | 70.31 tokens/s | 999.89M | 2.0 GB | 32768 | 8192 | 100% GPU |
| gemma3:4b | 23 | 122 | 1.525136125s | 51.037ms | 10 token(s) | 264.550041ms | 37.80 tokens/s | 34 token(s) | 1.208988667s | 28.12 tokens/s | 4.3B | 5.9 GB | 131072 | 8192 | 100% GPU |
| gemma3n:e2b | 26 | 136 | 1.50197325s | 50.735209ms | 10 token(s) | 226.064209ms | 44.24 tokens/s | 38 token(s) | 1.224547833s | 31.03 tokens/s | 4.5B | 4.8 GB | 32768 | 8192 | 100% GPU |
| gemma3n:e4b | 30 | 174 | 2.980086917s | 54.259209ms | 10 token(s) | 771.078959ms | 12.97 tokens/s | 48 token(s) | 2.1542075s | 22.28 tokens/s | 6.9B | 6.2 GB | 32768 | 8192 | 100% GPU |
| gemma:2b | 15 | 68 | 624.361583ms | 30.711875ms | 23 token(s) | 143.649542ms | 160.11 tokens/s | 22 token(s) | 449.452875ms | 48.95 tokens/s | 2.5B | 3.0 GB | 8192 | 8192 | 100% GPU |
| granite3.3:2b | 7 | 36 | 471.273792ms | 18.81675ms | 44 token(s) | 247.371833ms | 177.87 tokens/s | 10 token(s) | 204.336167ms | 48.94 tokens/s | 2.5B | 3.3 GB | 131072 | 8192 | 100% GPU |
| granite3.3:8b | 22 | 129 | 2.530903041s | 18.989833ms | 44 token(s) | 728.010209ms | 60.44 tokens/s | 31 token(s) | 1.783157416s | 17.38 tokens/s | 8.2B | 7.9 GB | 131072 | 8192 | 100% GPU |
| hermes3:8b | 30 | 151 | 2.361233542s | 31.854209ms | 10 token(s) | 519.417375ms | 19.25 tokens/s | 36 token(s) | 1.809498167s | 19.90 tokens/s | 8.0B | 6.7 GB | 131072 | 8192 | 100% GPU |
| llama3.1:8b-instruct-q4_1 | 17 | 83 | 1.694726959s | 31.69ms | 11 token(s) | 410.469792ms | 26.80 tokens/s | 21 token(s) | 1.252014375s | 16.77 tokens/s | 8.0B | 7.2 GB | 131072 | 8192 | 100% GPU |
| llama3.2:1b | 6 | 27 | 279.841125ms | 31.378792ms | 26 token(s) | 127.127792ms | 204.52 tokens/s | 8 token(s) | 120.75075ms | 66.25 tokens/s | 1.2B | 2.8 GB | 131072 | 8192 | 100% GPU |
| llama3.2:3b | 6 | 29 | 449.876833ms | 30.982208ms | 26 token(s) | 229.2165ms | 113.43 tokens/s | 8 token(s) | 189.055417ms | 42.32 tokens/s | 3.2B | 4.0 GB | 131072 | 8192 | 100% GPU |
| llava-llama3:8b | 39 | 209 | 2.973242458s | 32.032833ms | 11 token(s) | 370.601833ms | 29.68 tokens/s | 46 token(s) | 2.570017125s | 17.90 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
| llava-phi3:3.8b | 20 | 101 | 959.324291ms | 14.267041ms | 11 token(s) | 251.693459ms | 43.70 tokens/s | 24 token(s) | 692.789208ms | 34.64 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
| llava:7b | 7 | 36 | 902.032166ms | 13.668666ms | 9 token(s) | 404.207333ms | 22.27 tokens/s | 11 token(s) | 483.542208ms | 22.75 tokens/s | 7.2B | 7.0 GB | 32768 | 8192 | 100% GPU |
| minicpm-v:8b | 45 | 232 | 3.032024667s | 26.514334ms | 9 token(s) | 320.99325ms | 28.04 tokens/s | 54 token(s) | 2.6839545s | 20.12 tokens/s | 7.6B | 6.8 GB | 32768 | 8192 | 100% GPU |
| mistral:7b | 46 | 242 | 3.069221209s | 14.462875ms | 6 token(s) | 318.883917ms | 18.82 tokens/s | 56 token(s) | 2.735204875s | 20.47 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
| mistral:7b-instruct | 297 | 1730 | 22.196721584s | 14.108125ms | 5 token(s) | 227.340709ms | 21.99 tokens/s | 448 token(s) | 21.954625166s | 20.41 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
| qwen2.5-coder:7b | 7 | 36 | 842.606375ms | 28.276584ms | 30 token(s) | 341.150625ms | 87.94 tokens/s | 10 token(s) | 472.620667ms | 21.16 tokens/s | 7.6B | 6.0 GB | 32768 | 8192 | 100% GPU |
| qwen2.5vl:3b | 7 | 34 | 537.538041ms | 30.257916ms | 21 token(s) | 265.227333ms | 79.18 tokens/s | 10 token(s) | 241.504875ms | 41.41 tokens/s | 3.8B | 6.2 GB | 128000 | 8192 | 100% GPU |
| qwen2.5vl:7b | 7 | 36 | 893.769542ms | 30.154834ms | 21 token(s) | 392.339667ms | 53.53 tokens/s | 10 token(s) | 470.7725ms | 21.24 tokens/s | 8.3B | 9.1 GB | 128000 | 8192 | 100% GPU |
| qwen3:0.6b | 8 | 41 | 1.292759042s | 28.554ms | 11 token(s) | 90.813917ms | 121.13 tokens/s | 123 token(s) | 1.172873167s | 104.87 tokens/s | 751.63M | 2.3 GB | 40960 | 8192 | 100% GPU |
| qwen3:1.7b | 8 | 41 | 2.112287666s | 26.238208ms | 11 token(s) | 136.211042ms | 80.76 tokens/s | 106 token(s) | 1.9492395s | 54.38 tokens/s | 2.0B | 3.0 GB | 40960 | 8192 | 100% GPU |
| qwen3:14b | 8 | 41 | 14.445739917s | 26.11675ms | 11 token(s) | 3.192440333s | 3.45 tokens/s | 106 token(s) | 11.222667417s | 9.45 tokens/s | 14.8B | 12 GB | 40960 | 8192 | 5%/95% CPU/GPU |
| qwen3:4b | 8 | 41 | 8.568415208s | 29.547333ms | 11 token(s) | 228.941542ms | 48.05 tokens/s | 229 token(s) | 8.309245667s | 27.56 tokens/s | 4.0B | 5.3 GB | 40960 | 8192 | 100% GPU |
| qwen3:8b | 20 | 102 | 10.300869875s | 29.6745ms | 11 token(s) | 364.159666ms | 30.21 tokens/s | 168 token(s) | 9.906456s | 16.96 tokens/s | 8.2B | 7.6 GB | 40960 | 8192 | 100% GPU |
| smollm2:1.7b | 7 | 34 | 430.936208ms | 17.824667ms | 30 token(s) | 198.989292ms | 150.76 tokens/s | 10 token(s) | 213.495792ms | 46.84 tokens/s | 1.7B | 4.7 GB | 8192 | 8192 | 100% GPU |
| smollm2:135m | 58 | 288 | 497.961292ms | 18.285667ms | 31 token(s) | 44.659834ms | 694.14 tokens/s | 67 token(s) | 434.418125ms | 154.23 tokens/s | 134.52M | 1.2 GB | 8192 | 8192 | 100% GPU |
| smollm2:360m | 7 | 34 | 199.286958ms | 18.7875ms | 31 token(s) | 76.008708ms | 407.85 tokens/s | 10 token(s) | 103.826292ms | 96.31 tokens/s | 361.82M | 1.9 GB | 8192 | 8192 | 100% GPU |
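
The two rate columns in the table above appear to be the token count divided by the matching duration (for example, Eval rate = Eval count / Eval duration). The Python sketch below is a way to check that arithmetic; it is not part of ollama-multirun. The helper names `duration_seconds` and `tokens_per_second` are hypothetical, and the parser assumes the durations use Go-style unit suffixes such as `s` and `ms`, as they do in this report.

```python
import re

# Seconds per unit for Go-style duration strings such as "1.614058625s"
# or "13.691666ms". Assumption: only these unit suffixes occur.
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}
# Two-letter units are listed first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def duration_seconds(text: str) -> float:
    """Convert a duration string from the table into seconds (hypothetical helper)."""
    parts = _PART.findall(text)
    # Reject input with anything left over that the pattern did not consume.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def tokens_per_second(token_count: int, duration: str) -> float:
    """Recompute a rate column as count / duration (hypothetical helper)."""
    return token_count / duration_seconds(duration)


if __name__ == "__main__":
    # Values copied from the codellama:7b row.
    print(f"prompt eval rate: {tokens_per_second(21, '384.270083ms'):.2f} tokens/s")
    print(f"eval rate:        {tokens_per_second(34, '1.614058625s'):.2f} tokens/s")
```

For the codellama:7b row this reproduces the reported 54.65 and 21.06 tokens/s, which supports the count-over-duration reading. The same check can be repeated for other rows, such as the two that were split between CPU and GPU.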


System
Ollama proc: 100% GPU
Ollama context: 8192
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 11G + 455M
Sys OS: Darwin 24.5.0