ollama-multirun: hi: 20250713-163730

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: words:1 bytes:3

Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc
codellama:7b 1 9 497.813917ms 14.659833ms 21 token(s) 344.875125ms 60.89 tokens/s 4 token(s) 137.478541ms 29.10 tokens/s 6.7B 6.9 GB 16384 4096 100% GPU
cogito:3b 7 34 469.960625ms 30.296667ms 11 token(s) 197.832459ms 55.60 tokens/s 10 token(s) 241.289791ms 41.44 tokens/s 3.6B 3.4 GB 131072 4096 100% GPU
cogito:8b 7 34 966.615084ms 31.19425ms 11 token(s) 379.475375ms 28.99 tokens/s 10 token(s) 555.347208ms 18.01 tokens/s 8.0B 6.2 GB 131072 4096 100% GPU
deepcoder:1.5b 8 41 374.796833ms 27.647541ms 4 token(s) 99.816875ms 40.07 tokens/s 16 token(s) 246.866541ms 64.81 tokens/s 1.8B 2.0 GB 131072 4096 100% GPU
deepseek-r1:1.5b 8 41 375.294583ms 27.099417ms 4 token(s) 99.749625ms 40.10 tokens/s 16 token(s) 247.906458ms 64.54 tokens/s 1.8B 2.0 GB 131072 4096 100% GPU
deepseek-r1:14b 8 41 6.202296791s 26.651625ms 4 token(s) 4.655969417s 0.86 tokens/s 16 token(s) 1.51909775s 10.53 tokens/s 14.8B 10 GB 131072 4096 100% GPU
deepseek-r1:8b 4 141 16.99610625s 28.172667ms 3 token(s) 856.395917ms 3.50 tokens/s 272 token(s) 16.110962458s 16.88 tokens/s 8.2B 6.6 GB 131072 4096 100% GPU
dolphin-mistral:7b 7 37 797.633042ms 14.916458ms 29 token(s) 352.46375ms 82.28 tokens/s 10 token(s) 429.576958ms 23.28 tokens/s 7.2B 5.6 GB 32768 4096 100% GPU
dolphin3:8b 7 36 927.930125ms 31.250584ms 24 token(s) 404.7735ms 59.29 tokens/s 10 token(s) 491.312083ms 20.35 tokens/s 8.0B 6.2 GB 131072 4096 100% GPU
gemma3:1b 19 104 654.974167ms 54.890959ms 10 token(s) 96.766292ms 103.34 tokens/s 29 token(s) 502.676667ms 57.69 tokens/s 999.89M 1.9 GB 32768 4096 100% GPU
gemma3:4b 23 124 1.544395209s 52.220334ms 10 token(s) 258.3655ms 38.70 tokens/s 34 token(s) 1.233234959s 27.57 tokens/s 4.3B 5.8 GB 131072 4096 100% GPU
gemma3n:e2b 27 148 1.58956625s 51.818792ms 10 token(s) 339.500791ms 29.46 tokens/s 38 token(s) 1.197678167s 31.73 tokens/s 4.5B 4.6 GB 32768 4096 100% GPU
gemma3n:e4b 40 215 3.511673666s 63.349875ms 10 token(s) 813.537917ms 12.29 tokens/s 58 token(s) 2.634209083s 22.02 tokens/s 6.9B 5.9 GB 32768 4096 100% GPU
gemma:2b 15 72 577.13675ms 30.15225ms 23 token(s) 133.882542ms 171.79 tokens/s 21 token(s) 412.570916ms 50.90 tokens/s 2.5B 2.9 GB 8192 4096 100% GPU
granite3.3:2b 7 36 439.64875ms 16.752791ms 44 token(s) 219.724584ms 200.25 tokens/s 10 token(s) 202.501916ms 49.38 tokens/s 2.5B 2.7 GB 131072 4096 100% GPU
granite3.3:8b 20 112 2.143904s 18.239125ms 44 token(s) 598.803584ms 73.48 tokens/s 25 token(s) 1.526170416s 16.38 tokens/s 8.2B 6.7 GB 131072 4096 100% GPU
hermes3:8b 27 138 2.203283041s 31.189333ms 10 token(s) 413.564875ms 24.18 tokens/s 34 token(s) 1.757992083s 19.34 tokens/s 8.0B 5.9 GB 131072 4096 100% GPU
llama3.1:8b-instruct-q4_1 17 83 1.664660958s 31.537ms 11 token(s) 395.692208ms 27.80 tokens/s 21 token(s) 1.236849209s 16.98 tokens/s 8.0B 6.3 GB 131072 4096 100% GPU
llama3.2:1b 15 74 436.7785ms 30.121125ms 26 token(s) 116.6285ms 222.93 tokens/s 18 token(s) 289.4215ms 62.19 tokens/s 1.2B 2.4 GB 131072 4096 100% GPU
llama3.2:3b 6 29 435.846917ms 31.99675ms 26 token(s) 211.301625ms 123.05 tokens/s 8 token(s) 191.934125ms 41.68 tokens/s 3.2B 3.4 GB 131072 4096 100% GPU
llava-llama3:8b 19 102 1.715882s 33.016416ms 11 token(s) 354.78375ms 31.00 tokens/s 24 token(s) 1.327556542s 18.08 tokens/s 8.0B 6.8 GB 8192 4096 100% GPU
llava-phi3:3.8b 35 174 1.671593292s 12.948709ms 11 token(s) 314.039542ms 35.03 tokens/s 42 token(s) 1.343985083s 31.25 tokens/s 3.8B 5.4 GB 4096 4096 100% GPU
llava:7b 7 36 813.578792ms 11.852792ms 9 token(s) 315.916ms 28.49 tokens/s 11 token(s) 485.060083ms 22.68 tokens/s 7.2B 6.2 GB 32768 4096 100% GPU
minicpm-v:8b 19 98 1.56258875s 30.092709ms 9 token(s) 299.036375ms 30.10 tokens/s 24 token(s) 1.232852083s 19.47 tokens/s 7.6B 6.4 GB 32768 4096 100% GPU
mistral:7b 26 137 2.065637708s 14.618042ms 6 token(s) 280.39025ms 21.40 tokens/s 35 token(s) 1.769933584s 19.77 tokens/s 7.2B 5.6 GB 32768 4096 100% GPU
mistral:7b-instruct 58 302 3.743591167s 14.958625ms 5 token(s) 215.478833ms 23.20 tokens/s 73 token(s) 3.512449959s 20.78 tokens/s 7.2B 5.6 GB 32768 4096 100% GPU
qwen2.5-coder:7b 26 134 2.093362334s 28.237625ms 30 token(s) 349.5695ms 85.82 tokens/s 31 token(s) 1.714970375s 18.08 tokens/s 7.6B 5.6 GB 32768 4096 100% GPU
qwen2.5vl:3b 7 34 513.824125ms 33.121209ms 21 token(s) 242.067208ms 86.75 tokens/s 10 token(s) 238.036667ms 42.01 tokens/s 3.8B 5.9 GB 128000 4096 100% GPU
qwen2.5vl:7b 7 36 882.426041ms 30.229041ms 21 token(s) 382.767542ms 54.86 tokens/s 10 token(s) 468.84975ms 21.33 tokens/s 8.3B 8.6 GB 128000 4096 100% GPU
qwen3:0.6b 8 41 1.006703834s 28.267709ms 11 token(s) 72.004083ms 152.77 tokens/s 100 token(s) 905.897375ms 110.39 tokens/s 751.63M 1.6 GB 40960 4096 100% GPU
qwen3:1.7b 8 41 1.6133555s 28.056458ms 11 token(s) 125.134792ms 87.91 tokens/s 79 token(s) 1.459613041s 54.12 tokens/s 2.0B 2.4 GB 40960 4096 100% GPU
qwen3:14b 8 41 13.764914708s 28.407791ms 11 token(s) 4.372116292s 2.52 tokens/s 92 token(s) 9.363718833s 9.83 tokens/s 14.8B 10 GB 40960 4096 100% GPU
qwen3:4b 8 41 5.392521541s 26.396375ms 11 token(s) 280.079375ms 39.27 tokens/s 141 token(s) 5.085552458s 27.73 tokens/s 4.0B 4.2 GB 40960 4096 100% GPU
qwen3:8b 8 41 8.055148792s 28.073459ms 11 token(s) 369.810833ms 29.74 tokens/s 131 token(s) 7.656716708s 17.11 tokens/s 8.2B 6.6 GB 40960 4096 100% GPU
smollm2:1.7b 18 100 680.658084ms 14.208167ms 30 token(s) 167.3705ms 179.24 tokens/s 22 token(s) 498.197333ms 44.16 tokens/s 1.7B 3.6 GB 8192 4096 100% GPU
smollm2:135m 47 266 435.152458ms 18.046417ms 31 token(s) 38.752958ms 799.94 tokens/s 62 token(s) 377.741875ms 164.13 tokens/s 134.52M 1.0 GB 8192 4096 100% GPU
smollm2:360m 11 57 264.989792ms 18.488167ms 31 token(s) 71.037625ms 436.39 tokens/s 16 token(s) 174.809791ms 91.53 tokens/s 361.82M 1.6 GB 8192 4096 100% GPU
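
The rate columns in the table appear to be the token count divided by the matching duration. The following is a minimal sketch of that arithmetic, not the tool's own code: the helper names `duration_seconds` and `tokens_per_second` are hypothetical, and the parser assumes durations use only the `ms` and `s` suffixes seen in this table.

```python
def duration_seconds(text: str) -> float:
    """Convert a duration such as '137.478541ms' or '1.51909775s' to seconds.

    Assumption: only the 'ms' and 's' suffixes occur, as in the table above.
    """
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unsupported duration: {text!r}")


def tokens_per_second(count: int, duration: str) -> float:
    """Tokens per second, rounded to two decimals like the rate columns."""
    return round(count / duration_seconds(duration), 2)


# Values copied from the codellama:7b row.
print(tokens_per_second(21, "344.875125ms"))  # 60.89, the prompt eval rate
print(tokens_per_second(4, "137.478541ms"))   # 29.1, shown as 29.10 in the table
```

The same calculation reproduces the slow prompt eval rate in the deepseek-r1:14b row: 4 tokens over 4.655969417s gives 0.86 tokens/s.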


System
Ollama proc: 100% GPU
Ollama context: 4096
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 11G + 887M
Sys OS: Darwin 24.5.0