ollama-multirun: hi: 20250713-152352

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: (raw) (yaml) words:1 bytes:3

| Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| codellama:7b | 7 | 37 | 7.309737541s | 11.341ms | 21 token(s) | 1.921307542s | 10.93 tokens/s | 11 token(s) | 5.373215375s | 2.05 tokens/s | 6.7B | 14 GB | 16384 | 65536 | 22%/78% CPU/GPU |
| cogito:3b | 7 | 34 | 1.740782125s | 29.215917ms | 11 token(s) | 1.261909375s | 8.72 tokens/s | 10 token(s) | 448.8695ms | 22.28 tokens/s | 3.6B | 13 GB | 131072 | 65536 | 19%/81% CPU/GPU |
| cogito:8b | 7 | 34 | 16.729648166s | 31.211416ms | 11 token(s) | 2.761316084s | 3.98 tokens/s | 10 token(s) | 13.932685833s | 0.72 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
| deepcoder:1.5b | 8 | 41 | 530.801ms | 27.455625ms | 4 token(s) | 148.705167ms | 26.90 tokens/s | 16 token(s) | 354.085375ms | 45.19 tokens/s | 1.8B | 5.2 GB | 131072 | 65536 | 100% GPU |
| deepseek-r1:1.5b | 8 | 41 | 407.869709ms | 28.307125ms | 4 token(s) | 138.396958ms | 28.90 tokens/s | 16 token(s) | 240.608083ms | 66.50 tokens/s | 1.8B | 5.2 GB | 131072 | 65536 | 100% GPU |
| deepseek-r1:14b | 8 | 41 | 4m47.48131575s | 26.51325ms | 4 token(s) | 14.337323208s | 0.28 tokens/s | 16 token(s) | 4m33.104934875s | 0.06 tokens/s | 14.8B | 27 GB | 131072 | 65536 | 59%/41% CPU/GPU |
| deepseek-r1:8b | 4 | 72 | 1m12.251077459s | 32.131542ms | 3 token(s) | 3.165768208s | 0.95 tokens/s | 208 token(s) | 1m9.049341167s | 3.01 tokens/s | 8.2B | 21 GB | 131072 | 65536 | 48%/52% CPU/GPU |
| dolphin-mistral:7b | 7 | 37 | 880.136458ms | 13.437042ms | 29 token(s) | 428.888958ms | 67.62 tokens/s | 10 token(s) | 437.092042ms | 22.88 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
| dolphin3:8b | 7 | 34 | 13.58731s | 34.307958ms | 24 token(s) | 2.948011917s | 8.14 tokens/s | 10 token(s) | 10.599175333s | 0.94 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
| gemma3:1b | 21 | 109 | 744.9395ms | 51.957666ms | 10 token(s) | 90.721ms | 110.23 tokens/s | 31 token(s) | 601.734208ms | 51.52 tokens/s | 999.89M | 2.1 GB | 32768 | 65536 | 100% GPU |
| gemma3:4b | 43 | 262 | 2.317934667s | 52.243375ms | 10 token(s) | 282.760542ms | 35.37 tokens/s | 60 token(s) | 1.9823245s | 30.27 tokens/s | 4.3B | 7.8 GB | 131072 | 65536 | 100% GPU |
| gemma3n:e2b | 30 | 166 | 1.798621375s | 52.558291ms | 10 token(s) | 299.000709ms | 33.44 tokens/s | 46 token(s) | 1.446518625s | 31.80 tokens/s | 4.5B | 6.7 GB | 32768 | 65536 | 100% GPU |
| gemma3n:e4b | 71 | 445 | 6.407183875s | 51.019042ms | 10 token(s) | 773.02575ms | 12.94 tokens/s | 124 token(s) | 5.58261s | 22.21 tokens/s | 6.9B | 8.3 GB | 32768 | 65536 | 100% GPU |
| gemma:2b | 15 | 68 | 574.117917ms | 30.510083ms | 23 token(s) | 129.751042ms | 177.26 tokens/s | 21 token(s) | 413.340833ms | 50.81 tokens/s | 2.5B | 3.0 GB | 8192 | 65536 | 100% GPU |
| granite3.3:2b | 0 | 0 | | | | | | | | | 2.5B | 11 GB | 131072 | 65536 | 100% GPU |
| granite3.3:8b | 16 | 89 | 45.141988708s | 13.080208ms | 44 token(s) | 4.565584834s | 9.64 tokens/s | 21 token(s) | 40.559959875s | 0.52 tokens/s | 8.2B | 23 GB | 131072 | 65536 | 53%/47% CPU/GPU |
| hermes3:8b | 27 | 138 | 24.264792708s | 30.54525ms | 10 token(s) | 1.901765875s | 5.26 tokens/s | 34 token(s) | 22.32780225s | 1.52 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
| llama3.1:8b-instruct-q4_1 | 17 | 83 | 51.654558s | 32.409458ms | 11 token(s) | 2.129417958s | 5.17 tokens/s | 21 token(s) | 49.483365458s | 0.42 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 41%/59% CPU/GPU |
| llama3.2:1b | 15 | 74 | 624.788875ms | 29.985041ms | 26 token(s) | 264.679792ms | 98.23 tokens/s | 18 token(s) | 329.4825ms | 54.63 tokens/s | 1.2B | 8.6 GB | 131072 | 65536 | 100% GPU |
| llama3.2:3b | 6 | 27 | 1.773693292s | 30.544042ms | 26 token(s) | 1.353082208s | 19.22 tokens/s | 8 token(s) | 389.245667ms | 20.55 tokens/s | 3.2B | 13 GB | 131072 | 65536 | 19%/81% CPU/GPU |
| llava-llama3:8b | 7 | 34 | 916.476ms | 30.556125ms | 11 token(s) | 342.036958ms | 32.16 tokens/s | 10 token(s) | 543.397417ms | 18.40 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
| llava-phi3:3.8b | 59 | 297 | 2.708403542s | 7.662292ms | 11 token(s) | 261.366917ms | 42.09 tokens/s | 72 token(s) | 2.438946416s | 29.52 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
| llava:7b | 17 | 90 | 2.371292875s | 12.264041ms | 9 token(s) | 1.178164s | 7.64 tokens/s | 22 token(s) | 1.1801555s | 18.64 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 5%/95% CPU/GPU |
| minicpm-v:8b | 19 | 98 | 1.744411917s | 27.714083ms | 9 token(s) | 363.045917ms | 24.79 tokens/s | 24 token(s) | 1.35306725s | 17.74 tokens/s | 7.6B | 9.7 GB | 32768 | 65536 | 100% GPU |
| mistral:7b | 42 | 229 | 4.747577667s | 15.93275ms | 6 token(s) | 1.907520417s | 3.15 tokens/s | 54 token(s) | 2.82342525s | 19.13 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
| mistral:7b-instruct | 63 | 336 | 4.916012458s | 14.876667ms | 5 token(s) | 382.868125ms | 13.06 tokens/s | 83 token(s) | 4.517585s | 18.37 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
| qwen2.5-coder:7b | 7 | 36 | 1.122770584s | 27.414042ms | 30 token(s) | 579.547208ms | 51.76 tokens/s | 10 token(s) | 515.099125ms | 19.41 tokens/s | 7.6B | 9.0 GB | 32768 | 65536 | 100% GPU |
| qwen2.5vl:3b | 7 | 34 | 2.142355583s | 29.745ms | 21 token(s) | 1.8427445s | 11.40 tokens/s | 10 token(s) | 269.285458ms | 37.14 tokens/s | 3.8B | 11 GB | 128000 | 65536 | 100% GPU |
| qwen2.5vl:7b | 0 | 0 | | | | | | | | | 8.3B | | 128000 | | |
| qwen3:0.6b | 8 | 41 | 1.477805459s | 26.461875ms | 11 token(s) | 185.229458ms | 59.39 tokens/s | 132 token(s) | 1.265550084s | 104.30 tokens/s | 751.63M | 7.4 GB | 40960 | 65536 | 100% GPU |
| qwen3:1.7b | 8 | 41 | 3.16612525s | 27.420875ms | 11 token(s) | 231.464ms | 47.52 tokens/s | 128 token(s) | 2.906553292s | 44.04 tokens/s | 2.0B | 8.2 GB | 40960 | 65536 | 100% GPU |
| qwen3:14b | 6 | 44 | | | | | | | | | 14.8B | 22 GB | 40960 | 65536 | 48%/52% CPU/GPU |
| qwen3:4b | 8 | 41 | 5.7548455s | 26.076334ms | 11 token(s) | 887.559625ms | 12.39 tokens/s | 95 token(s) | 4.840550042s | 19.63 tokens/s | 4.0B | 13 GB | 40960 | 65536 | 16%/84% CPU/GPU |
| qwen3:8b | 8 | 41 | 40.943485083s | 28.261917ms | 11 token(s) | 2.051716833s | 5.36 tokens/s | 100 token(s) | 38.859804792s | 2.57 tokens/s | 8.2B | 15 GB | 40960 | 65536 | 29%/71% CPU/GPU |
| smollm2:1.7b | 7 | 36 | 425.299792ms | 18.228875ms | 30 token(s) | 170.040958ms | 176.43 tokens/s | 10 token(s) | 236.203792ms | 42.34 tokens/s | 1.7B | 4.7 GB | 8192 | 65536 | 100% GPU |
| smollm2:135m | 7 | 34 | 136.103542ms | 18.848667ms | 31 token(s) | 50.467084ms | 614.26 tokens/s | 10 token(s) | 66.044416ms | 151.41 tokens/s | 134.52M | 1.2 GB | 8192 | 65536 | 100% GPU |
| smollm2:360m | 7 | 34 | 206.924167ms | 16.071584ms | 31 token(s) | 78.222833ms | 396.30 tokens/s | 10 token(s) | 111.967292ms | 89.31 tokens/s | 361.82M | 1.9 GB | 8192 | 65536 | 100% GPU |
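
The rate columns above appear to be the token count divided by the matching duration (for codellama:7b, 11 tokens over 5.373215375s gives about 2.05 tokens/s). The sketch below shows one way to re-derive a rate from the duration strings as they are printed in the table. The helper names `parse_duration` and `tokens_per_second` are hypothetical and are not part of ollama-multirun; the unit list is an assumption based on the formats seen in this report (`ms`, `s`, and `m` plus `s`).

```python
import re

# Seconds per unit for duration strings such as "4m47.48131575s" or "11.341ms".
# Assumption: only these units occur; this report itself uses m, s and ms.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

# "ms", "us" and "ns" are listed before "m" and "s" so the longer unit wins.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def parse_duration(text: str) -> float:
    """Convert a duration string like '1m9.049341167s' to seconds."""
    total = 0.0
    pos = 0
    for match in _PART.finditer(text):
        if match.start() != pos:
            raise ValueError(f"unexpected characters in duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"cannot parse duration: {text!r}")
    return total


def tokens_per_second(count: int, duration: str) -> float:
    """Token count divided by the duration in seconds."""
    return count / parse_duration(duration)


# Values copied from the codellama:7b and deepseek-r1:14b rows.
print(f"{tokens_per_second(11, '5.373215375s'):.2f}")     # 2.05
print(f"{tokens_per_second(16, '4m33.104934875s'):.2f}")  # 0.06
```

Rows with blank timing cells (granite3.3:2b, qwen2.5vl:7b, qwen3:14b) have no duration to parse, so a caller would need to skip them before calling these helpers.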


System
Ollama proc: 100% GPU
Ollama context: 65536
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 8097M + 770M
Sys OS: Darwin 24.5.0