ollama-multirun: hi: 20250713-160230

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: words:1 bytes:3

Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc
codellama:7b 7 37 7.318774833s 12.206166ms 21 token(s) 1.798892375s 11.67 tokens/s 11 token(s) 5.505033833s 2.00 tokens/s 6.7B 14 GB 16384 32768 22%/78% CPU/GPU
cogito:3b 13 66 710.93625ms 29.87025ms 11 token(s) 255.17325ms 43.11 tokens/s 17 token(s) 425.314458ms 39.97 tokens/s 3.6B 8.2 GB 131072 32768 100% GPU
cogito:8b 7 34 1.939169875s 30.937667ms 11 token(s) 1.349918167s 8.15 tokens/s 10 token(s) 557.5675ms 17.94 tokens/s 8.0B 11 GB 131072 32768 6%/94% CPU/GPU
deepcoder:1.5b 8 41 393.594334ms 29.569584ms 4 token(s) 112.979042ms 35.40 tokens/s 16 token(s) 250.533958ms 63.86 tokens/s 1.8B 3.4 GB 131072 32768 100% GPU
deepseek-r1:1.5b 8 41 379.263542ms 28.160208ms 4 token(s) 108.77325ms 36.77 tokens/s 16 token(s) 241.740209ms 66.19 tokens/s 1.8B 3.4 GB 131072 32768 100% GPU
deepseek-r1:14b 8 41 4m16.726814125s 33.730917ms 4 token(s) 22.814853042s 0.18 tokens/s 16 token(s) 3m53.860393708s 0.07 tokens/s 14.8B 18 GB 131072 32768 39%/61% CPU/GPU
deepseek-r1:8b 8 41 15.651779667s 26.174834ms 3 token(s) 1.765074459s 1.70 tokens/s 193 token(s) 13.858930708s 13.93 tokens/s 8.2B 13 GB 131072 32768 17%/83% CPU/GPU
dolphin-mistral:7b 7 37 877.112333ms 14.698083ms 29 token(s) 432.431417ms 67.06 tokens/s 10 token(s) 429.358791ms 23.29 tokens/s 7.2B 11 GB 32768 32768 100% GPU
dolphin3:8b 7 34 2.297492583s 31.014708ms 24 token(s) 1.68611675s 14.23 tokens/s 10 token(s) 579.596125ms 17.25 tokens/s 8.0B 11 GB 131072 32768 6%/94% CPU/GPU
gemma3:1b 21 113 580.834667ms 52.913875ms 10 token(s) 83.686958ms 119.49 tokens/s 31 token(s) 443.713417ms 69.86 tokens/s 999.89M 2.1 GB 32768 32768 100% GPU
gemma3:4b 23 124 1.510115625s 51.803583ms 10 token(s) 223.249625ms 44.79 tokens/s 34 token(s) 1.234511459s 27.54 tokens/s 4.3B 6.5 GB 131072 32768 100% GPU
gemma3n:e2b 35 179 1.981714125s 54.058167ms 10 token(s) 350.426541ms 28.54 tokens/s 51 token(s) 1.576327667s 32.35 tokens/s 4.5B 6.7 GB 32768 32768 100% GPU
gemma3n:e4b 93 591 8.058955667s 67.282792ms 10 token(s) 1.138130791s 8.79 tokens/s 152 token(s) 6.852881917s 22.18 tokens/s 6.9B 8.3 GB 32768 32768 100% GPU
gemma:2b 15 71 579.041625ms 28.5505ms 23 token(s) 129.086958ms 178.17 tokens/s 21 token(s) 420.783875ms 49.91 tokens/s 2.5B 3.0 GB 8192 32768 100% GPU
granite3.3:2b 7 36 541.768083ms 18.728166ms 44 token(s) 318.882042ms 137.98 tokens/s 10 token(s) 203.409292ms 49.16 tokens/s 2.5B 6.7 GB 131072 32768 100% GPU
granite3.3:8b 32 173 5.2642965s 17.91925ms 44 token(s) 2.641407625s 16.66 tokens/s 38 token(s) 2.604155083s 14.59 tokens/s 8.2B 14 GB 131072 32768 24%/76% CPU/GPU
hermes3:8b 31 156 3.302221209s 30.288042ms 10 token(s) 1.176843042s 8.50 tokens/s 38 token(s) 2.093665708s 18.15 tokens/s 8.0B 11 GB 131072 32768 4%/96% CPU/GPU
llama3.1:8b-instruct-q4_1 17 83 2.422103833s 31.1585ms 11 token(s) 1.144180208s 9.61 tokens/s 21 token(s) 1.245195542s 16.86 tokens/s 8.0B 12 GB 131072 32768 6%/94% CPU/GPU
llama3.2:1b 15 74 497.795333ms 32.086708ms 26 token(s) 173.610958ms 149.76 tokens/s 18 token(s) 291.595125ms 61.73 tokens/s 1.2B 5.3 GB 131072 32768 100% GPU
llama3.2:3b 7 36 552.623417ms 31.309417ms 26 token(s) 267.727625ms 97.11 tokens/s 10 token(s) 252.955625ms 39.53 tokens/s 3.2B 8.2 GB 131072 32768 100% GPU
llava-llama3:8b 7 34 877.102791ms 33.316083ms 11 token(s) 345.162833ms 31.87 tokens/s 10 token(s) 497.937583ms 20.08 tokens/s 8.0B 6.8 GB 8192 4096 100% GPU
llava-phi3:3.8b 44 235 2.063754542s 14.184792ms 11 token(s) 309.357375ms 35.56 tokens/s 55 token(s) 1.739573s 31.62 tokens/s 3.8B 5.4 GB 4096 4096 100% GPU
llava:7b 27 144 3.164168s 12.906375ms 9 token(s) 1.229238666s 7.32 tokens/s 35 token(s) 1.92123675s 18.22 tokens/s 7.2B 11 GB 32768 32768 5%/95% CPU/GPU
minicpm-v:8b 19 98 1.523135625s 26.91625ms 9 token(s) 339.972208ms 26.47 tokens/s 24 token(s) 1.155715917s 20.77 tokens/s 7.6B 9.7 GB 32768 32768 100% GPU
mistral:7b 41 214 4.041069417s 14.776292ms 6 token(s) 1.704108875s 3.52 tokens/s 49 token(s) 2.321473833s 21.11 tokens/s 7.2B 11 GB 32768 32768 100% GPU
mistral:7b-instruct 46 274 3.208623625s 14.369875ms 5 token(s) 315.02775ms 15.87 tokens/s 59 token(s) 2.878426166s 20.50 tokens/s 7.2B 11 GB 32768 32768 100% GPU
qwen2.5-coder:7b 7 36 936.377958ms 28.124958ms 30 token(s) 432.975625ms 69.29 tokens/s 10 token(s) 474.693334ms 21.07 tokens/s 7.6B 9.0 GB 32768 32768 100% GPU
qwen2.5vl:3b 7 34 1.223491625s 31.164333ms 21 token(s) 921.240084ms 22.80 tokens/s 10 token(s) 270.489208ms 36.97 tokens/s 3.8B 8.4 GB 128000 32768 100% GPU
qwen2.5vl:7b 7 36 3.07941175s 28.616542ms 21 token(s) 2.54527175s 8.25 tokens/s 10 token(s) 504.85025ms 19.81 tokens/s 8.3B 12 GB 128000 32768 28%/72% CPU/GPU
qwen3:0.6b 8 41 1.059536125s 25.274917ms 11 token(s) 139.424916ms 78.90 tokens/s 83 token(s) 894.212334ms 92.82 tokens/s 751.63M 6.1 GB 40960 32768 100% GPU
qwen3:1.7b 7 36 1.987120291s 27.959833ms 11 token(s) 187.013792ms 58.82 tokens/s 94 token(s) 1.771676375s 53.06 tokens/s 2.0B 6.9 GB 40960 32768 100% GPU
qwen3:14b 6 44 14.8B 19 GB 40960 32768 43%/57% CPU/GPU
qwen3:4b 8 41 4.176501792s 25.886667ms 11 token(s) 337.120708ms 32.63 tokens/s 107 token(s) 3.812862209s 28.06 tokens/s 4.0B 11 GB 40960 32768 100% GPU
qwen3:8b 20 102 7.803055125s 27.824584ms 11 token(s) 1.71373825s 6.42 tokens/s 87 token(s) 6.060735542s 14.35 tokens/s 8.2B 13 GB 40960 32768 17%/83% CPU/GPU
smollm2:1.7b 7 36 398.137042ms 17.136583ms 30 token(s) 164.6305ms 182.23 tokens/s 10 token(s) 215.675042ms 46.37 tokens/s 1.7B 4.7 GB 8192 32768 100% GPU
smollm2:135m 7 34 131.097709ms 18.056209ms 31 token(s) 50.118209ms 618.54 tokens/s 10 token(s) 62.263583ms 160.61 tokens/s 134.52M 1.2 GB 8192 32768 100% GPU
smollm2:360m 7 34 203.504125ms 17.938958ms 31 token(s) 80.67375ms 384.26 tokens/s 10 token(s) 103.984ms 96.17 tokens/s 361.82M 1.9 GB 8192 32768 100% GPU
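
The rate columns in the table above appear to be the corresponding token count divided by the corresponding duration (for codellama:7b, 11 tokens over 5.505033833s is about 2.00 tokens/s). A minimal Python sketch of that arithmetic follows. It assumes the durations are Go-style strings such as `710.93625ms` or `4m16.726814125s`; the helper names `parse_duration` and `tokens_per_second` are hypothetical and are not part of ollama-multirun or Ollama.

```python
import re

# Seconds per unit for Go-style duration strings (assumed format of the table).
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}

# Two-letter units are listed before "m" and "s" so "ms" is not read as "m".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|µs|ns|h|m|s)")


def parse_duration(text):
    """Convert a duration string such as '4m16.7s' or '710.9ms' to seconds."""
    parts = _PART.findall(text)
    # Reject input that contains anything other than number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def tokens_per_second(count, duration):
    """Hypothetical helper: token count divided by a duration string."""
    return count / parse_duration(duration)


if __name__ == "__main__":
    # Values copied from the codellama:7b and deepseek-r1:14b rows.
    print(round(tokens_per_second(21, "1.798892375s"), 2))   # table lists 11.67
    print(round(tokens_per_second(11, "5.505033833s"), 2))   # table lists 2.00
    print(round(tokens_per_second(16, "3m53.860393708s"), 2))  # table lists 0.07
```

Recomputing the rates this way is a quick consistency check on a row, and it also makes it easy to sort or filter models by throughput once the durations are in seconds.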


System
Ollama proc: 100% GPU
Ollama context: 32768
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 7638M + 1268M
Sys OS: Darwin 24.5.0