Prompt: (raw) (yaml)
words:7 bytes:39
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 8 | 44 | 1.275854708s | 15.987125ms | 33 token(s) | 510.789416ms | 64.61 tokens/s | 16 token(s) | 748.40125ms | 21.38 tokens/s | 6.7B | 9.4 GB | 16384 | 8192 | 100% GPU |
cogito:3b | 15 | 123 | 1.46703175s | 29.09475ms | 22 token(s) | 175.958875ms | 125.03 tokens/s | 44 token(s) | 1.261389792s | 34.88 tokens/s | 3.6B | 4.0 GB | 131072 | 8192 | 100% GPU |
cogito:8b | 6 | 39 | 1.212427792s | 30.573417ms | 22 token(s) | 465.8525ms | 47.23 tokens/s | 14 token(s) | 715.328958ms | 19.57 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
deepcoder:1.5b | 15 | 103 | 8.294132167s | 28.792042ms | 15 token(s) | 108.045333ms | 138.83 tokens/s | 491 token(s) | 8.15673825s | 60.20 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
deepseek-r1:1.5b | 111 | 630 | 6.41216075s | 28.874875ms | 15 token(s) | 122.494084ms | 122.45 tokens/s | 380 token(s) | 6.260248208s | 60.70 tokens/s | 1.8B | 2.1 GB | 131072 | 8192 | 100% GPU |
deepseek-r1:14b | 64 | 355 | 28.724929458s | 26.45325ms | 15 token(s) | 2.241454916s | 6.69 tokens/s | 251 token(s) | 26.455664917s | 9.49 tokens/s | 14.8B | 11 GB | 131072 | 8192 | 6%/94% CPU/GPU |
deepseek-r1:8b | 60 | 379 | 4m53.853726041s | 29.164583ms | 14 token(s) | 359.23225ms | 38.97 tokens/s | 3920 token(s) | 4m53.464724167s | 13.36 tokens/s | 8.2B | 7.6 GB | 131072 | 8192 | 100% GPU |
dolphin-mistral:7b | 8 | 44 | 1.414776417s | 16.384084ms | 41 token(s) | 654.927333ms | 62.60 tokens/s | 15 token(s) | 742.828ms | 20.19 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
dolphin3:8b | 8 | 43 | 2.191481042s | 30.844042ms | 35 token(s) | 1.350368375s | 25.92 tokens/s | 15 token(s) | 809.6485ms | 18.53 tokens/s | 8.0B | 7.0 GB | 131072 | 8192 | 100% GPU |
gemma3:1b | 8 | 52 | 371.714458ms | 57.177791ms | 20 token(s) | 92.540333ms | 216.12 tokens/s | 16 token(s) | 221.4835ms | 72.24 tokens/s | 999.89M | 2.0 GB | 32768 | 8192 | 100% GPU |
gemma3:4b | 8 | 52 | 827.438917ms | 52.855292ms | 20 token(s) | 264.994708ms | 75.47 tokens/s | 16 token(s) | 509.007917ms | 31.43 tokens/s | 4.3B | 5.9 GB | 131072 | 8192 | 100% GPU |
gemma3n:e2b | 23 | 153 | 2.048573375s | 54.353166ms | 22 token(s) | 261.032167ms | 84.28 tokens/s | 56 token(s) | 1.732406958s | 32.32 tokens/s | 4.5B | 4.8 GB | 32768 | 8192 | 100% GPU |
gemma3n:e4b | 8 | 50 | 2.094495167s | 53.691917ms | 22 token(s) | 1.132075583s | 19.43 tokens/s | 20 token(s) | 908.155125ms | 22.02 tokens/s | 6.9B | 6.2 GB | 32768 | 8192 | 100% GPU |
gemma:2b | 18 | 97 | 852.230625ms | 30.939042ms | 33 token(s) | 162.488709ms | 203.09 tokens/s | 33 token(s) | 658.245375ms | 50.13 tokens/s | 2.5B | 3.0 GB | 8192 | 8192 | 100% GPU |
granite3.3:2b | 10 | 69 | 618.233ms | 19.221875ms | 55 token(s) | 239.818667ms | 229.34 tokens/s | 17 token(s) | 358.519916ms | 47.42 tokens/s | 2.5B | 3.3 GB | 131072 | 8192 | 100% GPU |
granite3.3:8b | 8 | 48 | 1.547848375s | 22.504583ms | 55 token(s) | 649.063042ms | 84.74 tokens/s | 15 token(s) | 875.525583ms | 17.13 tokens/s | 8.2B | 7.9 GB | 131072 | 8192 | 100% GPU |
hermes3:8b | 8 | 47 | 1.097599209s | 31.224625ms | 21 token(s) | 378.580208ms | 55.47 tokens/s | 14 token(s) | 687.235125ms | 20.37 tokens/s | 8.0B | 6.7 GB | 131072 | 8192 | 100% GPU |
llama3.1:8b-instruct-q4_1 | 8 | 43 | 1.988790708s | 35.232ms | 22 token(s) | 1.007706791s | 21.83 tokens/s | 15 token(s) | 945.166334ms | 15.87 tokens/s | 8.0B | 7.2 GB | 131072 | 8192 | 100% GPU |
llama3.2:1b | 8 | 43 | 432.818084ms | 32.411875ms | 37 token(s) | 140.530792ms | 263.29 tokens/s | 15 token(s) | 259.231542ms | 57.86 tokens/s | 1.2B | 2.8 GB | 131072 | 8192 | 100% GPU |
llama3.2:3b | 8 | 43 | 722.350291ms | 31.565958ms | 37 token(s) | 293.051667ms | 126.26 tokens/s | 15 token(s) | 397.11175ms | 37.77 tokens/s | 3.2B | 4.0 GB | 131072 | 8192 | 100% GPU |
llava-llama3:8b | 8 | 46 | 1.335298333s | 32.757291ms | 24 token(s) | 368.959ms | 65.05 tokens/s | 15 token(s) | 932.68975ms | 16.08 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 30 | 165 | 2.077716125s | 15.39725ms | 23 token(s) | 349.324875ms | 65.84 tokens/s | 50 token(s) | 1.712277416s | 29.20 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 8 | 48 | 1.269078417s | 15.465959ms | 21 token(s) | 472.957917ms | 44.40 tokens/s | 16 token(s) | 779.879875ms | 20.52 tokens/s | 7.2B | 7.0 GB | 32768 | 8192 | 100% GPU |
minicpm-v:8b | 8 | 42 | 1.505774458s | 28.229458ms | 20 token(s) | 560.415709ms | 35.69 tokens/s | 16 token(s) | 916.602666ms | 17.46 tokens/s | 7.6B | 6.8 GB | 32768 | 8192 | 100% GPU |
mistral:7b | 9 | 53 | 1.409752291s | 17.062708ms | 18 token(s) | 467.236417ms | 38.52 tokens/s | 18 token(s) | 924.629125ms | 19.47 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
mistral:7b-instruct | 9 | 51 | 1.121227125s | 16.57025ms | 17 token(s) | 326.959459ms | 51.99 tokens/s | 16 token(s) | 776.979666ms | 20.59 tokens/s | 7.2B | 6.4 GB | 32768 | 8192 | 100% GPU |
qwen2.5-coder:7b | 6 | 40 | 1.509367209s | 27.933959ms | 41 token(s) | 572.989667ms | 71.55 tokens/s | 15 token(s) | 907.753167ms | 16.52 tokens/s | 7.6B | 6.0 GB | 32768 | 8192 | 100% GPU |
qwen2.5vl:3b | 8 | 43 | 733.198417ms | 30.653208ms | 32 token(s) | 295.706ms | 108.22 tokens/s | 15 token(s) | 406.281458ms | 36.92 tokens/s | 3.8B | 6.2 GB | 128000 | 8192 | 100% GPU |
qwen2.5vl:7b | 6 | 42 | 3.011269833s | 30.196458ms | 32 token(s) | 2.238150959s | 14.30 tokens/s | 14 token(s) | 742.409541ms | 18.86 tokens/s | 8.3B | 9.1 GB | 128000 | 8192 | 100% GPU |
qwen3:0.6b | 12 | 81 | 3.828666625s | 30.661708ms | 22 token(s) | 86.412792ms | 254.59 tokens/s | 357 token(s) | 3.711006375s | 96.20 tokens/s | 751.63M | 2.3 GB | 40960 | 8192 | 100% GPU |
qwen3:1.7b | 67 | 431 | 12.36374325s | 28.635834ms | 22 token(s) | 127.750667ms | 172.21 tokens/s | 629 token(s) | 12.206742708s | 51.53 tokens/s | 2.0B | 3.0 GB | 40960 | 8192 | 100% GPU |
qwen3:14b | 74 | 442 | 1m38.397041125s | 25.883583ms | 22 token(s) | 5.190262833s | 4.24 tokens/s | 780 token(s) | 1m33.180101209s | 8.37 tokens/s | 14.8B | 12 GB | 40960 | 8192 | 5%/95% CPU/GPU |
qwen3:4b | 42 | 212 | 15.839626375s | 28.034458ms | 22 token(s) | 239.629167ms | 91.81 tokens/s | 388 token(s) | 15.571264875s | 24.92 tokens/s | 4.0B | 5.3 GB | 40960 | 8192 | 100% GPU |
qwen3:8b | 67 | 388 | 1m28.467323916s | 28.754375ms | 22 token(s) | 384.27575ms | 57.25 tokens/s | 1309 token(s) | 1m28.05370325s | 14.87 tokens/s | 8.2B | 7.6 GB | 40960 | 8192 | 100% GPU |
smollm2:1.7b | 8 | 44 | 611.793ms | 21.263167ms | 41 token(s) | 235.373125ms | 174.19 tokens/s | 15 token(s) | 354.36275ms | 42.33 tokens/s | 1.7B | 4.7 GB | 8192 | 8192 | 100% GPU |
smollm2:135m | 44 | 193 | 515.689958ms | 19.730167ms | 42 token(s) | 42.182792ms | 995.67 tokens/s | 74 token(s) | 453.1685ms | 163.29 tokens/s | 134.52M | 1.2 GB | 8192 | 8192 | 100% GPU |
smollm2:360m | 9 | 47 | 438.924458ms | 19.5565ms | 42 token(s) | 72.164167ms | 582.01 tokens/s | 17 token(s) | 346.542666ms | 49.06 tokens/s | 361.82M | 1.9 GB | 8192 | 8192 | 100% GPU |
System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 8192 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 12G + 563M |
Sys OS | Darwin 24.5.0 |
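
The rate columns in the results table appear to be the token count divided by the matching duration (for example, eval rate = eval count / eval duration). The following is a minimal Python sketch for re-deriving them from the other columns. It assumes the durations are Go-style strings as printed above (`748.40125ms`, `4m53.464724167s`); the helper names `parse_duration` and `tokens_per_second` are hypothetical and are not part of Ollama.

```python
import re

# Seconds per unit for the Go-style duration strings printed in the table.
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9,
}

# Multi-character units come first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|µs|us|ns|h|m|s)")


def parse_duration(text: str) -> float:
    """Convert a duration such as '4m53.464724167s' to seconds."""
    parts = _PART.findall(text)
    # Reject input that the pattern does not fully cover.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def tokens_per_second(count: int, duration: str) -> float:
    """Token count divided by elapsed seconds (assumed rate definition)."""
    return count / parse_duration(duration)


# codellama:7b row: 16 eval tokens in 748.40125ms
print(f"{tokens_per_second(16, '748.40125ms'):.2f}")        # 21.38
# deepseek-r1:8b row: 3920 eval tokens in 4m53.464724167s
print(f"{tokens_per_second(3920, '4m53.464724167s'):.2f}")  # 13.36
```

Both values match the eval rate reported for those rows, which supports the assumed definition. The same check can be applied to the prompt eval columns.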