ollama-multirun: hi: 20250713-162732

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: words:1 bytes:3

| Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| codellama:7b | 7 | 37 | 7.496044041s | 8.195916ms | 21 token(s) | 1.865782666s | 11.26 tokens/s | 11 token(s) | 5.619758875s | 1.96 tokens/s | 6.7B | 14 GB | 16384 | 16384 | 22%/78% CPU/GPU |
| cogito:3b | 7 | 34 | 480.737875ms | 30.935584ms | 11 token(s) | 208.224416ms | 52.83 tokens/s | 10 token(s) | 240.957667ms | 41.50 tokens/s | 3.6B | 5.4 GB | 131072 | 16384 | 100% GPU |
| cogito:8b | 7 | 34 | 914.160708ms | 32.101375ms | 11 token(s) | 389.816083ms | 28.22 tokens/s | 10 token(s) | 491.644208ms | 20.34 tokens/s | 8.0B | 8.6 GB | 131072 | 16384 | 100% GPU |
| deepcoder:1.5b | 8 | 41 | 366.87475ms | 28.482667ms | 4 token(s) | 86.900458ms | 46.03 tokens/s | 16 token(s) | 251.012125ms | 63.74 tokens/s | 1.8B | 2.5 GB | 131072 | 16384 | 100% GPU |
| deepseek-r1:1.5b | 8 | 41 | 361.014625ms | 28.12925ms | 4 token(s) | 85.487167ms | 46.79 tokens/s | 16 token(s) | 246.916542ms | 64.80 tokens/s | 1.8B | 2.5 GB | 131072 | 16384 | 100% GPU |
| deepseek-r1:14b | 8 | 41 | 11.714486s | 28.601541ms | 4 token(s) | 5.605166875s | 0.71 tokens/s | 16 token(s) | 6.078884958s | 2.63 tokens/s | 14.8B | 13 GB | 131072 | 16384 | 18%/82% CPU/GPU |
| deepseek-r1:8b | 8 | 41 | 12.454995792s | 26.733042ms | 3 token(s) | 275.817666ms | 10.88 tokens/s | 204 token(s) | 12.151929625s | 16.79 tokens/s | 8.2B | 9.6 GB | 131072 | 16384 | 100% GPU |
| dolphin-mistral:7b | 19 | 105 | 1.736945125s | 14.50725ms | 29 token(s) | 414.698625ms | 69.93 tokens/s | 25 token(s) | 1.306735625s | 19.13 tokens/s | 7.2B | 8.0 GB | 32768 | 16384 | 100% GPU |
| dolphin3:8b | 7 | 34 | 1.044578041s | 30.931125ms | 24 token(s) | 515.473041ms | 46.56 tokens/s | 10 token(s) | 497.539709ms | 20.10 tokens/s | 8.0B | 8.6 GB | 131072 | 16384 | 100% GPU |
| gemma3:1b | 21 | 112 | 584.507084ms | 52.481792ms | 10 token(s) | 88.276125ms | 113.28 tokens/s | 31 token(s) | 443.148958ms | 69.95 tokens/s | 999.89M | 2.0 GB | 32768 | 16384 | 100% GPU |
| gemma3:4b | 43 | 262 | 2.43751775s | 52.602917ms | 10 token(s) | 263.277208ms | 37.98 tokens/s | 62 token(s) | 2.121030625s | 29.23 tokens/s | 4.3B | 6.1 GB | 131072 | 16384 | 100% GPU |
| gemma3n:e2b | 30 | 162 | 1.761467958s | 57.79825ms | 10 token(s) | 324.980625ms | 30.77 tokens/s | 44 token(s) | 1.378155167s | 31.93 tokens/s | 4.5B | 5.3 GB | 32768 | 16384 | 100% GPU |
| gemma3n:e4b | 23 | 126 | 2.611281625s | 51.955833ms | 10 token(s) | 1.050937416s | 9.52 tokens/s | 33 token(s) | 1.507523042s | 21.89 tokens/s | 6.9B | 6.8 GB | 32768 | 16384 | 100% GPU |
| gemma:2b | 15 | 71 | 574.93525ms | 31.615958ms | 23 token(s) | 124.421208ms | 184.86 tokens/s | 21 token(s) | 418.323375ms | 50.20 tokens/s | 2.5B | 3.0 GB | 8192 | 16384 | 100% GPU |
| granite3.3:2b | 7 | 36 | 485.900209ms | 19.335584ms | 44 token(s) | 260.444416ms | 168.94 tokens/s | 10 token(s) | 205.377834ms | 48.69 tokens/s | 2.5B | 4.4 GB | 131072 | 16384 | 100% GPU |
| granite3.3:8b | 25 | 140 | 2.725187291s | 17.035416ms | 44 token(s) | 687.988792ms | 63.95 tokens/s | 33 token(s) | 2.019495125s | 16.34 tokens/s | 8.2B | 10 GB | 131072 | 16384 | 100% GPU |
| hermes3:8b | 17 | 87 | 1.629426416s | 32.045666ms | 10 token(s) | 421.030334ms | 23.75 tokens/s | 22 token(s) | 1.17577625s | 18.71 tokens/s | 8.0B | 8.4 GB | 131072 | 16384 | 100% GPU |
| llama3.1:8b-instruct-q4_1 | 17 | 83 | 1.698175417s | 32.823375ms | 11 token(s) | 415.571292ms | 26.47 tokens/s | 21 token(s) | 1.249219541s | 16.81 tokens/s | 8.0B | 8.8 GB | 131072 | 16384 | 100% GPU |
| llama3.2:1b | 7 | 36 | 300.101583ms | 30.591458ms | 26 token(s) | 115.865458ms | 224.40 tokens/s | 10 token(s) | 153.003375ms | 65.36 tokens/s | 1.2B | 3.6 GB | 131072 | 16384 | 100% GPU |
| llama3.2:3b | 7 | 36 | 530.554792ms | 31.272459ms | 26 token(s) | 256.816833ms | 101.24 tokens/s | 10 token(s) | 241.907042ms | 41.34 tokens/s | 3.2B | 5.4 GB | 131072 | 16384 | 100% GPU |
| llava-llama3:8b | 28 | 149 | 2.314740958s | 32.380958ms | 11 token(s) | 379.988625ms | 28.95 tokens/s | 34 token(s) | 1.901847125s | 17.88 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
| llava-phi3:3.8b | 62 | 313 | 2.504062208s | 13.997583ms | 11 token(s) | 249.314583ms | 44.12 tokens/s | 73 token(s) | 2.240164042s | 32.59 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
| llava:7b | 18 | 99 | 1.609110042s | 13.93075ms | 9 token(s) | 447.122625ms | 20.13 tokens/s | 23 token(s) | 1.147424416s | 20.04 tokens/s | 7.2B | 8.7 GB | 32768 | 16384 | 100% GPU |
| minicpm-v:8b | 41 | 199 | 2.731726417s | 28.0365ms | 9 token(s) | 369.760208ms | 24.34 tokens/s | 48 token(s) | 2.333350917s | 20.57 tokens/s | 7.6B | 7.8 GB | 32768 | 16384 | 100% GPU |
| mistral:7b | 60 | 309 | 4.0915575s | 15.199ms | 6 token(s) | 326.919625ms | 18.35 tokens/s | 76 token(s) | 3.74871625s | 20.27 tokens/s | 7.2B | 8.1 GB | 32768 | 16384 | 100% GPU |
| mistral:7b-instruct | 26 | 137 | 2.00504425s | 15.8785ms | 5 token(s) | 246.239125ms | 20.31 tokens/s | 35 token(s) | 1.742315583s | 20.09 tokens/s | 7.2B | 8.1 GB | 32768 | 16384 | 100% GPU |
| qwen2.5-coder:7b | 7 | 36 | 938.407625ms | 26.62725ms | 30 token(s) | 388.386792ms | 77.24 tokens/s | 10 token(s) | 522.770666ms | 19.13 tokens/s | 7.6B | 7.0 GB | 32768 | 16384 | 100% GPU |
| qwen2.5vl:3b | 7 | 34 | 539.373916ms | 30.924458ms | 21 token(s) | 268.499916ms | 78.21 tokens/s | 10 token(s) | 239.425834ms | 41.77 tokens/s | 3.8B | 6.9 GB | 128000 | 16384 | 100% GPU |
| qwen2.5vl:7b | 7 | 36 | 2.811726875s | 30.929542ms | 21 token(s) | 2.309022125s | 9.09 tokens/s | 10 token(s) | 471.186167ms | 21.22 tokens/s | 8.3B | 10 GB | 128000 | 16384 | 100% GPU |
| qwen3:0.6b | 8 | 38 | 1.015120333s | 28.303875ms | 11 token(s) | 112.278416ms | 97.97 tokens/s | 96 token(s) | 874.0015ms | 109.84 tokens/s | 751.63M | 3.5 GB | 40960 | 16384 | 100% GPU |
| qwen3:1.7b | 8 | 41 | 1.511045375s | 27.688166ms | 11 token(s) | 153.132125ms | 71.83 tokens/s | 72 token(s) | 1.329386625s | 54.16 tokens/s | 2.0B | 4.3 GB | 40960 | 16384 | 100% GPU |
| qwen3:14b | 8 | 41 | 19.372382208s | 24.952916ms | 11 token(s) | 3.56774825s | 3.08 tokens/s | 107 token(s) | 15.778352125s | 6.78 tokens/s | 14.8B | 14 GB | 40960 | 16384 | 23%/77% CPU/GPU |
| qwen3:4b | 8 | 41 | 6.125246042s | 28.184709ms | 11 token(s) | 263.891583ms | 41.68 tokens/s | 162 token(s) | 5.83259925s | 27.77 tokens/s | 4.0B | 7.3 GB | 40960 | 16384 | 100% GPU |
| qwen3:8b | 8 | 41 | 4.819035209s | 28.02775ms | 11 token(s) | 415.964292ms | 26.44 tokens/s | 74 token(s) | 4.374498s | 16.92 tokens/s | 8.2B | 9.6 GB | 40960 | 16384 | 100% GPU |
| smollm2:1.7b | 7 | 36 | 421.675333ms | 15.736041ms | 30 token(s) | 191.923458ms | 156.31 tokens/s | 10 token(s) | 213.384417ms | 46.86 tokens/s | 1.7B | 4.7 GB | 8192 | 16384 | 100% GPU |
| smollm2:135m | 7 | 34 | 130.9035ms | 18.733708ms | 31 token(s) | 49.476083ms | 626.57 tokens/s | 10 token(s) | 62.066542ms | 161.12 tokens/s | 134.52M | 1.2 GB | 8192 | 16384 | 100% GPU |
| smollm2:360m | 140 | 861 | 2.043382625s | 17.929834ms | 31 token(s) | 82.7325ms | 374.70 tokens/s | 159 token(s) | 1.942111458s | 81.87 tokens/s | 361.82M | 1.9 GB | 8192 | 16384 | 100% GPU |
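
The rate columns in the table above appear to be the token count divided by the matching duration. That relationship is an assumption inferred from the reported numbers, not something the report states. The sketch below converts the duration strings (for example `5.619758875s` or `240.957667ms`) to seconds and recomputes a rate so that individual rows can be spot-checked. The helper names `parse_duration` and `tokens_per_second` are hypothetical and are not part of ollama-multirun or Ollama.

```python
import re

# Seconds per unit for the duration suffixes seen in the table ("s", "ms"),
# plus a few other Go-style suffixes in case longer or shorter runs appear.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}

# Two-letter units are listed before "s" and "m" so that "ms" is not read as "m".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration string such as '240.957667ms' or '1m2.5s' to seconds."""
    parts = _PART.findall(text)
    # Reject input that the pattern only partially matched.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def tokens_per_second(count: int, duration: str) -> float:
    """Assumed definition of the rate columns: token count / duration in seconds."""
    return count / parse_duration(duration)


if __name__ == "__main__":
    # codellama:7b, Eval count and Eval duration; the table reports 1.96 tokens/s.
    print(f"{tokens_per_second(11, '5.619758875s'):.2f}")
    # cogito:3b, Eval count and Eval duration; the table reports 41.50 tokens/s.
    print(f"{tokens_per_second(10, '240.957667ms'):.2f}")
```

Recomputing a rate this way is also a quick check on rows where the reported rate looks surprisingly low, such as the models that ran partly on CPU.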


System
Ollama proc: 100% GPU
Ollama context: 16384
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 10G + 561M
Sys OS: Darwin 24.5.0