ollama-multirun: hi: 20250713-133817

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: (raw) (yaml) words:1 bytes:3

| Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| codellama:7b | 1 | 9 | 3.33205325s | 13.606167ms | 21 token(s) | 1.738607958s | 12.08 tokens/s | 4 token(s) | 1.577982084s | 2.53 tokens/s | 6.7B | 14 GB | 16384 | 131072 | 22%/78% CPU/GPU |
| cogito:3b | 7 | 34 | 8.5515675s | 39.751208ms | 11 token(s) | 1.86826875s | 5.89 tokens/s | 10 token(s) | 6.642460458s | 1.51 tokens/s | 3.6B | 24 GB | 131072 | 131072 | 56%/44% CPU/GPU |
| cogito:8b | 0 | 0 |  |  |  |  |  |  |  |  | 8.0B |  | 131072 |  |  |
| deepcoder:1.5b | 8 | 41 | 532.917875ms | 28.117875ms | 4 token(s) | 258.6165ms | 15.47 tokens/s | 16 token(s) | 245.670958ms | 65.13 tokens/s | 1.8B | 8.9 GB | 131072 | 131072 | 100% GPU |
| deepseek-r1:1.5b | 8 | 41 | 525.881417ms | 26.917375ms | 4 token(s) | 252.666958ms | 15.83 tokens/s | 16 token(s) | 245.607542ms | 65.14 tokens/s | 1.8B | 8.9 GB | 131072 | 131072 | 100% GPU |
| deepseek-r1:14b | 8 | 41 | 8.527335208s | 26.985666ms | 4 token(s) | 6.268131875s | 0.64 tokens/s | 16 token(s) | 2.231549792s | 7.17 tokens/s | 14.8B | 34 GB | 131072 | 131072 | 100% CPU |
| deepseek-r1:8b | 9 | 43 | 14.040474791s | 27.507083ms | 3 token(s) | 583.717708ms | 5.14 tokens/s | 166 token(s) | 13.428613458s | 12.36 tokens/s | 8.2B | 24 GB | 131072 | 131072 | 100% CPU |
| dolphin-mistral:7b | 7 | 37 | 1.650124709s | 15.256292ms | 29 token(s) | 1.13607825s | 25.53 tokens/s | 10 token(s) | 497.67725ms | 20.09 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
| dolphin3:8b | 0 | 0 |  |  |  |  |  |  |  |  | 8.0B |  | 131072 |  |  |
| gemma3:1b | 23 | 121 | 599.863292ms | 51.481708ms | 10 token(s) | 85.871ms | 116.45 tokens/s | 33 token(s) | 461.831584ms | 71.45 tokens/s | 999.89M | 2.1 GB | 32768 | 131072 | 100% GPU |
| gemma3:4b | 49 | 280 | 3.149748333s | 53.2105ms | 10 token(s) | 595.500333ms | 16.79 tokens/s | 74 token(s) | 2.500543125s | 29.59 tokens/s | 4.3B | 10 GB | 131072 | 131072 | 100% GPU |
| gemma3n:e2b | 26 | 140 | 2.080287625s | 53.687709ms | 10 token(s) | 776.510375ms | 12.88 tokens/s | 38 token(s) | 1.248489292s | 30.44 tokens/s | 4.5B | 6.7 GB | 32768 | 131072 | 100% GPU |
| gemma3n:e4b | 25 | 136 | 2.96821875s | 54.37525ms | 10 token(s) | 1.244132708s | 8.04 tokens/s | 38 token(s) | 1.667803167s | 22.78 tokens/s | 6.9B | 8.3 GB | 32768 | 131072 | 100% GPU |
| gemma:2b | 16 | 72 | 593.817583ms | 30.766708ms | 23 token(s) | 127.855208ms | 179.89 tokens/s | 22 token(s) | 434.647834ms | 50.62 tokens/s | 2.5B | 3.0 GB | 8192 | 131072 | 100% GPU |
| granite3.3:2b | 0 | 0 |  |  |  |  |  |  |  |  | 2.5B |  | 131072 |  |  |
| granite3.3:8b | 25 | 140 | 7.559457166s | 18.633ms | 44 token(s) | 4.883007667s | 9.01 tokens/s | 33 token(s) | 2.657159s | 12.42 tokens/s | 8.2B | 26 GB | 131072 | 131072 | 100% CPU |
| hermes3:8b | 0 | 0 |  |  |  |  |  |  |  |  | 8.0B |  | 131072 |  |  |
| llama3.1:8b-instruct-q4_1 | 0 | 0 |  |  |  |  |  |  |  |  | 8.0B |  | 131072 |  |  |
| llama3.2:1b | 0 | 0 |  |  |  |  |  |  |  |  | 1.2B |  | 131072 |  |  |
| llama3.2:3b | 7 | 36 | 3.055374s | 31.883333ms | 26 token(s) | 2.293372833s | 11.34 tokens/s | 10 token(s) | 729.4025ms | 13.71 tokens/s | 3.2B | 24 GB | 131072 | 131072 | 56%/44% CPU/GPU |
| llava-llama3:8b | 23 | 123 | 1.974579416s | 32.9305ms | 11 token(s) | 345.309958ms | 31.86 tokens/s | 27 token(s) | 1.595738417s | 16.92 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
| llava-phi3:3.8b | 42 | 210 | 2.04440925s | 13.461208ms | 11 token(s) | 252.911334ms | 43.49 tokens/s | 52 token(s) | 1.777358791s | 29.26 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
| llava:7b | 30 | 163 | 3.305790084s | 12.141084ms | 9 token(s) | 1.177221208s | 7.65 tokens/s | 38 token(s) | 2.115605875s | 17.96 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 5%/95% CPU/GPU |
| minicpm-v:8b | 19 | 98 | 1.7125485s | 26.075375ms | 9 token(s) | 362.210208ms | 24.85 tokens/s | 24 token(s) | 1.323743042s | 18.13 tokens/s | 7.6B | 9.7 GB | 32768 | 131072 | 100% GPU |
| mistral:7b | 92 | 512 | 9.096838667s | 14.043208ms | 6 token(s) | 1.984108125s | 3.02 tokens/s | 135 token(s) | 7.0979365s | 19.02 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
| mistral:7b-instruct | 56 | 300 | 4.37924725s | 13.955667ms | 5 token(s) | 365.501792ms | 13.68 tokens/s | 75 token(s) | 3.999080916s | 18.75 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
| qwen2.5-coder:7b | 7 | 36 | 1.03598525s | 26.605042ms | 30 token(s) | 486.162167ms | 61.71 tokens/s | 10 token(s) | 522.6945ms | 19.13 tokens/s | 7.6B | 9.0 GB | 32768 | 131072 | 100% GPU |
| qwen2.5vl:3b | 0 | 0 |  |  |  |  |  |  |  |  | 3.8B |  | 128000 |  |  |
| qwen2.5vl:7b | 0 | 0 |  |  |  |  |  |  |  |  | 8.3B |  | 128000 |  |  |
| qwen3:0.6b | 8 | 38 | 1.367429459s | 28.195209ms | 11 token(s) | 193.224ms | 56.93 tokens/s | 121 token(s) | 1.1454755s | 105.63 tokens/s | 751.63M | 7.4 GB | 40960 | 131072 | 100% GPU |
| qwen3:1.7b | 8 | 41 | 1.744778875s | 27.449792ms | 11 token(s) | 283.187041ms | 38.84 tokens/s | 78 token(s) | 1.433540875s | 54.41 tokens/s | 2.0B | 8.2 GB | 40960 | 131072 | 100% GPU |
| qwen3:14b | 6 | 44 |  |  |  |  |  |  |  |  | 14.8B | 22 GB | 40960 | 131072 | 48%/52% CPU/GPU |
| qwen3:4b | 8 | 41 | 8.830962417s | 26.88ms | 11 token(s) | 822.504333ms | 13.37 tokens/s | 182 token(s) | 7.980914584s | 22.80 tokens/s | 4.0B | 13 GB | 40960 | 131072 | 16%/84% CPU/GPU |
| qwen3:8b | 8 | 41 | 9.142129084s | 26.981667ms | 11 token(s) | 1.740455833s | 6.32 tokens/s | 94 token(s) | 7.374071s | 12.75 tokens/s | 8.2B | 15 GB | 40960 | 131072 | 29%/71% CPU/GPU |
| smollm2:1.7b | 7 | 34 | 403.90575ms | 13.537167ms | 30 token(s) | 159.675958ms | 187.88 tokens/s | 10 token(s) | 230.102125ms | 43.46 tokens/s | 1.7B | 4.7 GB | 8192 | 131072 | 100% GPU |
| smollm2:135m | 7 | 31 | 120.587209ms | 16.863959ms | 31 token(s) | 45.770541ms | 677.29 tokens/s | 10 token(s) | 57.4175ms | 174.16 tokens/s | 134.52M | 1.2 GB | 8192 | 131072 | 100% GPU |
| smollm2:360m | 25 | 146 | 472.410916ms | 16.542291ms | 31 token(s) | 67.566ms | 458.81 tokens/s | 33 token(s) | 387.743292ms | 85.11 tokens/s | 361.82M | 1.9 GB | 8192 | 131072 | 100% GPU |
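
The rate columns in the table above appear to equal the token count divided by the matching duration (for codellama:7b, 4 tokens over 1.577982084s gives 2.53 tokens/s). That relationship is inferred from the rows, not stated by the tool. A minimal Python sketch for rechecking a row follows; `parse_duration` and `tokens_per_second` are hypothetical helper names, and the parser assumes single-unit durations such as `ms` and `s`, which are the only forms that appear in this report.

```python
import re

# Seconds per unit for the duration suffixes handled here.
# Assumption: durations are single-unit strings like "245.670958ms" or "1.577982084s";
# compound forms such as "1m30s" are not handled.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_duration(text: str) -> float:
    """Convert a single-unit duration string to seconds."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)(ns|us|ms|s)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


def tokens_per_second(count: int, duration: str) -> float:
    """Rate implied by a token count and a duration string."""
    return count / parse_duration(duration)


# codellama:7b eval columns: 4 token(s), 1.577982084s
print(f"{tokens_per_second(4, '1.577982084s'):.2f} tokens/s")
# deepcoder:1.5b eval columns: 16 token(s), 245.670958ms
print(f"{tokens_per_second(16, '245.670958ms'):.2f} tokens/s")
```

Both printed values can be compared against the Eval rate column for the same rows.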


System
Ollama proc: 100% GPU
Ollama context: 131072
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 6978M + 401M
Sys OS: Darwin 24.5.0