ollama-multirun: how_many_r_s_in_the_word__strawberry__: 20250713-131958

models: codellama:7b cogito:3b cogito:8b deepcoder:1.5b deepseek-r1:1.5b deepseek-r1:14b deepseek-r1:8b dolphin-mistral:7b dolphin3:8b gemma3:1b gemma3:4b gemma3n:e2b gemma3n:e4b gemma:2b granite3.3:2b granite3.3:8b hermes3:8b llama3.1:8b-instruct-q4_1 llama3.2:1b llama3.2:3b llava-llama3:8b llava-phi3:3.8b llava:7b minicpm-v:8b mistral:7b mistral:7b-instruct qwen2.5-coder:7b qwen2.5vl:3b qwen2.5vl:7b qwen3:0.6b qwen3:1.7b qwen3:14b qwen3:4b qwen3:8b smollm2:1.7b smollm2:135m smollm2:360m

Prompt: words:7 bytes:39

Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc
codellama:7b 8 44 1.275854708s 15.987125ms 33 token(s) 510.789416ms 64.61 tokens/s 16 token(s) 748.40125ms 21.38 tokens/s 6.7B 9.4 GB 16384 8192 100% GPU
cogito:3b 15 123 1.46703175s 29.09475ms 22 token(s) 175.958875ms 125.03 tokens/s 44 token(s) 1.261389792s 34.88 tokens/s 3.6B 4.0 GB 131072 8192 100% GPU
cogito:8b 6 39 1.212427792s 30.573417ms 22 token(s) 465.8525ms 47.23 tokens/s 14 token(s) 715.328958ms 19.57 tokens/s 8.0B 7.0 GB 131072 8192 100% GPU
deepcoder:1.5b 15 103 8.294132167s 28.792042ms 15 token(s) 108.045333ms 138.83 tokens/s 491 token(s) 8.15673825s 60.20 tokens/s 1.8B 2.1 GB 131072 8192 100% GPU
deepseek-r1:1.5b 111 630 6.41216075s 28.874875ms 15 token(s) 122.494084ms 122.45 tokens/s 380 token(s) 6.260248208s 60.70 tokens/s 1.8B 2.1 GB 131072 8192 100% GPU
deepseek-r1:14b 64 355 28.724929458s 26.45325ms 15 token(s) 2.241454916s 6.69 tokens/s 251 token(s) 26.455664917s 9.49 tokens/s 14.8B 11 GB 131072 8192 6%/94% CPU/GPU
deepseek-r1:8b 60 379 4m53.853726041s 29.164583ms 14 token(s) 359.23225ms 38.97 tokens/s 3920 token(s) 4m53.464724167s 13.36 tokens/s 8.2B 7.6 GB 131072 8192 100% GPU
dolphin-mistral:7b 8 44 1.414776417s 16.384084ms 41 token(s) 654.927333ms 62.60 tokens/s 15 token(s) 742.828ms 20.19 tokens/s 7.2B 6.4 GB 32768 8192 100% GPU
dolphin3:8b 8 43 2.191481042s 30.844042ms 35 token(s) 1.350368375s 25.92 tokens/s 15 token(s) 809.6485ms 18.53 tokens/s 8.0B 7.0 GB 131072 8192 100% GPU
gemma3:1b 8 52 371.714458ms 57.177791ms 20 token(s) 92.540333ms 216.12 tokens/s 16 token(s) 221.4835ms 72.24 tokens/s 999.89M 2.0 GB 32768 8192 100% GPU
gemma3:4b 8 52 827.438917ms 52.855292ms 20 token(s) 264.994708ms 75.47 tokens/s 16 token(s) 509.007917ms 31.43 tokens/s 4.3B 5.9 GB 131072 8192 100% GPU
gemma3n:e2b 23 153 2.048573375s 54.353166ms 22 token(s) 261.032167ms 84.28 tokens/s 56 token(s) 1.732406958s 32.32 tokens/s 4.5B 4.8 GB 32768 8192 100% GPU
gemma3n:e4b 8 50 2.094495167s 53.691917ms 22 token(s) 1.132075583s 19.43 tokens/s 20 token(s) 908.155125ms 22.02 tokens/s 6.9B 6.2 GB 32768 8192 100% GPU
gemma:2b 18 97 852.230625ms 30.939042ms 33 token(s) 162.488709ms 203.09 tokens/s 33 token(s) 658.245375ms 50.13 tokens/s 2.5B 3.0 GB 8192 8192 100% GPU
granite3.3:2b 10 69 618.233ms 19.221875ms 55 token(s) 239.818667ms 229.34 tokens/s 17 token(s) 358.519916ms 47.42 tokens/s 2.5B 3.3 GB 131072 8192 100% GPU
granite3.3:8b 8 48 1.547848375s 22.504583ms 55 token(s) 649.063042ms 84.74 tokens/s 15 token(s) 875.525583ms 17.13 tokens/s 8.2B 7.9 GB 131072 8192 100% GPU
hermes3:8b 8 47 1.097599209s 31.224625ms 21 token(s) 378.580208ms 55.47 tokens/s 14 token(s) 687.235125ms 20.37 tokens/s 8.0B 6.7 GB 131072 8192 100% GPU
llama3.1:8b-instruct-q4_1 8 43 1.988790708s 35.232ms 22 token(s) 1.007706791s 21.83 tokens/s 15 token(s) 945.166334ms 15.87 tokens/s 8.0B 7.2 GB 131072 8192 100% GPU
llama3.2:1b 8 43 432.818084ms 32.411875ms 37 token(s) 140.530792ms 263.29 tokens/s 15 token(s) 259.231542ms 57.86 tokens/s 1.2B 2.8 GB 131072 8192 100% GPU
llama3.2:3b 8 43 722.350291ms 31.565958ms 37 token(s) 293.051667ms 126.26 tokens/s 15 token(s) 397.11175ms 37.77 tokens/s 3.2B 4.0 GB 131072 8192 100% GPU
llava-llama3:8b 8 46 1.335298333s 32.757291ms 24 token(s) 368.959ms 65.05 tokens/s 15 token(s) 932.68975ms 16.08 tokens/s 8.0B 6.8 GB 8192 4096 100% GPU
llava-phi3:3.8b 30 165 2.077716125s 15.39725ms 23 token(s) 349.324875ms 65.84 tokens/s 50 token(s) 1.712277416s 29.20 tokens/s 3.8B 5.4 GB 4096 4096 100% GPU
llava:7b 8 48 1.269078417s 15.465959ms 21 token(s) 472.957917ms 44.40 tokens/s 16 token(s) 779.879875ms 20.52 tokens/s 7.2B 7.0 GB 32768 8192 100% GPU
minicpm-v:8b 8 42 1.505774458s 28.229458ms 20 token(s) 560.415709ms 35.69 tokens/s 16 token(s) 916.602666ms 17.46 tokens/s 7.6B 6.8 GB 32768 8192 100% GPU
mistral:7b 9 53 1.409752291s 17.062708ms 18 token(s) 467.236417ms 38.52 tokens/s 18 token(s) 924.629125ms 19.47 tokens/s 7.2B 6.4 GB 32768 8192 100% GPU
mistral:7b-instruct 9 51 1.121227125s 16.57025ms 17 token(s) 326.959459ms 51.99 tokens/s 16 token(s) 776.979666ms 20.59 tokens/s 7.2B 6.4 GB 32768 8192 100% GPU
qwen2.5-coder:7b 6 40 1.509367209s 27.933959ms 41 token(s) 572.989667ms 71.55 tokens/s 15 token(s) 907.753167ms 16.52 tokens/s 7.6B 6.0 GB 32768 8192 100% GPU
qwen2.5vl:3b 8 43 733.198417ms 30.653208ms 32 token(s) 295.706ms 108.22 tokens/s 15 token(s) 406.281458ms 36.92 tokens/s 3.8B 6.2 GB 128000 8192 100% GPU
qwen2.5vl:7b 6 42 3.011269833s 30.196458ms 32 token(s) 2.238150959s 14.30 tokens/s 14 token(s) 742.409541ms 18.86 tokens/s 8.3B 9.1 GB 128000 8192 100% GPU
qwen3:0.6b 12 81 3.828666625s 30.661708ms 22 token(s) 86.412792ms 254.59 tokens/s 357 token(s) 3.711006375s 96.20 tokens/s 751.63M 2.3 GB 40960 8192 100% GPU
qwen3:1.7b 67 431 12.36374325s 28.635834ms 22 token(s) 127.750667ms 172.21 tokens/s 629 token(s) 12.206742708s 51.53 tokens/s 2.0B 3.0 GB 40960 8192 100% GPU
qwen3:14b 74 442 1m38.397041125s 25.883583ms 22 token(s) 5.190262833s 4.24 tokens/s 780 token(s) 1m33.180101209s 8.37 tokens/s 14.8B 12 GB 40960 8192 5%/95% CPU/GPU
qwen3:4b 42 212 15.839626375s 28.034458ms 22 token(s) 239.629167ms 91.81 tokens/s 388 token(s) 15.571264875s 24.92 tokens/s 4.0B 5.3 GB 40960 8192 100% GPU
qwen3:8b 67 388 1m28.467323916s 28.754375ms 22 token(s) 384.27575ms 57.25 tokens/s 1309 token(s) 1m28.05370325s 14.87 tokens/s 8.2B 7.6 GB 40960 8192 100% GPU
smollm2:1.7b 8 44 611.793ms 21.263167ms 41 token(s) 235.373125ms 174.19 tokens/s 15 token(s) 354.36275ms 42.33 tokens/s 1.7B 4.7 GB 8192 8192 100% GPU
smollm2:135m 44 193 515.689958ms 19.730167ms 42 token(s) 42.182792ms 995.67 tokens/s 74 token(s) 453.1685ms 163.29 tokens/s 134.52M 1.2 GB 8192 8192 100% GPU
smollm2:360m 9 47 438.924458ms 19.5565ms 42 token(s) 72.164167ms 582.01 tokens/s 17 token(s) 346.542666ms 49.06 tokens/s 361.82M 1.9 GB 8192 8192 100% GPU
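The duration columns above mix units (`ms`, `s`, and `1m38.397041125s`-style values), so comparing rows requires converting them to seconds first. The following is a minimal sketch, not part of ollama-multirun itself: the helper names `parse_duration` and `eval_rate` are hypothetical, and it assumes the durations follow the Go-style format seen in the table. It also assumes the reported eval rate is the eval count divided by the eval duration, which the sample rows appear to bear out.

```python
import re

# Go-style duration units, expressed in seconds (assumed format of the table values).
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}

# Multi-character units come first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration such as '748.40125ms' or '4m53.85s' to seconds."""
    parts = _PART.findall(text)
    # Reject input that the pattern does not cover completely.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def eval_rate(token_count: int, duration: str) -> float:
    """Tokens per second, recomputed from a token count and a duration string."""
    return token_count / parse_duration(duration)


if __name__ == "__main__":
    # codellama:7b row: 16 tokens in 748.40125ms, reported as 21.38 tokens/s
    print(f"{eval_rate(16, '748.40125ms'):.2f}")        # 21.38
    # deepseek-r1:8b row: 3920 tokens in 4m53.464724167s, reported as 13.36 tokens/s
    print(f"{eval_rate(3920, '4m53.464724167s'):.2f}")  # 13.36
```

With the same helper, the deepseek-r1:8b total duration of 4m53.853726041s converts to about 293.85 seconds, just under the 300-second multirun timeout listed in the System section.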


System
Ollama proc: 100% GPU
Ollama context: 8192
Ollama version: 0.9.7-rc0
Multirun timeout: 300 seconds
Sys arch: arm64
Sys processor: arm
Sys memory: 12G + 563M
Sys OS: Darwin 24.5.0