Prompt: (raw) (yaml)
words:1 bytes:3
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 7 | 37 | 7.496044041s | 8.195916ms | 21 token(s) | 1.865782666s | 11.26 tokens/s | 11 token(s) | 5.619758875s | 1.96 tokens/s | 6.7B | 14 GB | 16384 | 16384 | 22%/78% CPU/GPU |
cogito:3b | 7 | 34 | 480.737875ms | 30.935584ms | 11 token(s) | 208.224416ms | 52.83 tokens/s | 10 token(s) | 240.957667ms | 41.50 tokens/s | 3.6B | 5.4 GB | 131072 | 16384 | 100% GPU |
cogito:8b | 7 | 34 | 914.160708ms | 32.101375ms | 11 token(s) | 389.816083ms | 28.22 tokens/s | 10 token(s) | 491.644208ms | 20.34 tokens/s | 8.0B | 8.6 GB | 131072 | 16384 | 100% GPU |
deepcoder:1.5b | 8 | 41 | 366.87475ms | 28.482667ms | 4 token(s) | 86.900458ms | 46.03 tokens/s | 16 token(s) | 251.012125ms | 63.74 tokens/s | 1.8B | 2.5 GB | 131072 | 16384 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 361.014625ms | 28.12925ms | 4 token(s) | 85.487167ms | 46.79 tokens/s | 16 token(s) | 246.916542ms | 64.80 tokens/s | 1.8B | 2.5 GB | 131072 | 16384 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 11.714486s | 28.601541ms | 4 token(s) | 5.605166875s | 0.71 tokens/s | 16 token(s) | 6.078884958s | 2.63 tokens/s | 14.8B | 13 GB | 131072 | 16384 | 18%/82% CPU/GPU |
deepseek-r1:8b | 8 | 41 | 12.454995792s | 26.733042ms | 3 token(s) | 275.817666ms | 10.88 tokens/s | 204 token(s) | 12.151929625s | 16.79 tokens/s | 8.2B | 9.6 GB | 131072 | 16384 | 100% GPU |
dolphin-mistral:7b | 19 | 105 | 1.736945125s | 14.50725ms | 29 token(s) | 414.698625ms | 69.93 tokens/s | 25 token(s) | 1.306735625s | 19.13 tokens/s | 7.2B | 8.0 GB | 32768 | 16384 | 100% GPU |
dolphin3:8b | 7 | 34 | 1.044578041s | 30.931125ms | 24 token(s) | 515.473041ms | 46.56 tokens/s | 10 token(s) | 497.539709ms | 20.10 tokens/s | 8.0B | 8.6 GB | 131072 | 16384 | 100% GPU |
gemma3:1b | 21 | 112 | 584.507084ms | 52.481792ms | 10 token(s) | 88.276125ms | 113.28 tokens/s | 31 token(s) | 443.148958ms | 69.95 tokens/s | 999.89M | 2.0 GB | 32768 | 16384 | 100% GPU |
gemma3:4b | 43 | 262 | 2.43751775s | 52.602917ms | 10 token(s) | 263.277208ms | 37.98 tokens/s | 62 token(s) | 2.121030625s | 29.23 tokens/s | 4.3B | 6.1 GB | 131072 | 16384 | 100% GPU |
gemma3n:e2b | 30 | 162 | 1.761467958s | 57.79825ms | 10 token(s) | 324.980625ms | 30.77 tokens/s | 44 token(s) | 1.378155167s | 31.93 tokens/s | 4.5B | 5.3 GB | 32768 | 16384 | 100% GPU |
gemma3n:e4b | 23 | 126 | 2.611281625s | 51.955833ms | 10 token(s) | 1.050937416s | 9.52 tokens/s | 33 token(s) | 1.507523042s | 21.89 tokens/s | 6.9B | 6.8 GB | 32768 | 16384 | 100% GPU |
gemma:2b | 15 | 71 | 574.93525ms | 31.615958ms | 23 token(s) | 124.421208ms | 184.86 tokens/s | 21 token(s) | 418.323375ms | 50.20 tokens/s | 2.5B | 3.0 GB | 8192 | 16384 | 100% GPU |
granite3.3:2b | 7 | 36 | 485.900209ms | 19.335584ms | 44 token(s) | 260.444416ms | 168.94 tokens/s | 10 token(s) | 205.377834ms | 48.69 tokens/s | 2.5B | 4.4 GB | 131072 | 16384 | 100% GPU |
granite3.3:8b | 25 | 140 | 2.725187291s | 17.035416ms | 44 token(s) | 687.988792ms | 63.95 tokens/s | 33 token(s) | 2.019495125s | 16.34 tokens/s | 8.2B | 10 GB | 131072 | 16384 | 100% GPU |
hermes3:8b | 17 | 87 | 1.629426416s | 32.045666ms | 10 token(s) | 421.030334ms | 23.75 tokens/s | 22 token(s) | 1.17577625s | 18.71 tokens/s | 8.0B | 8.4 GB | 131072 | 16384 | 100% GPU |
llama3.1:8b-instruct-q4_1 | 17 | 83 | 1.698175417s | 32.823375ms | 11 token(s) | 415.571292ms | 26.47 tokens/s | 21 token(s) | 1.249219541s | 16.81 tokens/s | 8.0B | 8.8 GB | 131072 | 16384 | 100% GPU |
llama3.2:1b | 7 | 36 | 300.101583ms | 30.591458ms | 26 token(s) | 115.865458ms | 224.40 tokens/s | 10 token(s) | 153.003375ms | 65.36 tokens/s | 1.2B | 3.6 GB | 131072 | 16384 | 100% GPU |
llama3.2:3b | 7 | 36 | 530.554792ms | 31.272459ms | 26 token(s) | 256.816833ms | 101.24 tokens/s | 10 token(s) | 241.907042ms | 41.34 tokens/s | 3.2B | 5.4 GB | 131072 | 16384 | 100% GPU |
llava-llama3:8b | 28 | 149 | 2.314740958s | 32.380958ms | 11 token(s) | 379.988625ms | 28.95 tokens/s | 34 token(s) | 1.901847125s | 17.88 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 62 | 313 | 2.504062208s | 13.997583ms | 11 token(s) | 249.314583ms | 44.12 tokens/s | 73 token(s) | 2.240164042s | 32.59 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 18 | 99 | 1.609110042s | 13.93075ms | 9 token(s) | 447.122625ms | 20.13 tokens/s | 23 token(s) | 1.147424416s | 20.04 tokens/s | 7.2B | 8.7 GB | 32768 | 16384 | 100% GPU |
minicpm-v:8b | 41 | 199 | 2.731726417s | 28.0365ms | 9 token(s) | 369.760208ms | 24.34 tokens/s | 48 token(s) | 2.333350917s | 20.57 tokens/s | 7.6B | 7.8 GB | 32768 | 16384 | 100% GPU |
mistral:7b | 60 | 309 | 4.0915575s | 15.199ms | 6 token(s) | 326.919625ms | 18.35 tokens/s | 76 token(s) | 3.74871625s | 20.27 tokens/s | 7.2B | 8.1 GB | 32768 | 16384 | 100% GPU |
mistral:7b-instruct | 26 | 137 | 2.00504425s | 15.8785ms | 5 token(s) | 246.239125ms | 20.31 tokens/s | 35 token(s) | 1.742315583s | 20.09 tokens/s | 7.2B | 8.1 GB | 32768 | 16384 | 100% GPU |
qwen2.5-coder:7b | 7 | 36 | 938.407625ms | 26.62725ms | 30 token(s) | 388.386792ms | 77.24 tokens/s | 10 token(s) | 522.770666ms | 19.13 tokens/s | 7.6B | 7.0 GB | 32768 | 16384 | 100% GPU |
qwen2.5vl:3b | 7 | 34 | 539.373916ms | 30.924458ms | 21 token(s) | 268.499916ms | 78.21 tokens/s | 10 token(s) | 239.425834ms | 41.77 tokens/s | 3.8B | 6.9 GB | 128000 | 16384 | 100% GPU |
qwen2.5vl:7b | 7 | 36 | 2.811726875s | 30.929542ms | 21 token(s) | 2.309022125s | 9.09 tokens/s | 10 token(s) | 471.186167ms | 21.22 tokens/s | 8.3B | 10 GB | 128000 | 16384 | 100% GPU |
qwen3:0.6b | 8 | 38 | 1.015120333s | 28.303875ms | 11 token(s) | 112.278416ms | 97.97 tokens/s | 96 token(s) | 874.0015ms | 109.84 tokens/s | 751.63M | 3.5 GB | 40960 | 16384 | 100% GPU |
qwen3:1.7b | 8 | 41 | 1.511045375s | 27.688166ms | 11 token(s) | 153.132125ms | 71.83 tokens/s | 72 token(s) | 1.329386625s | 54.16 tokens/s | 2.0B | 4.3 GB | 40960 | 16384 | 100% GPU |
qwen3:14b | 8 | 41 | 19.372382208s | 24.952916ms | 11 token(s) | 3.56774825s | 3.08 tokens/s | 107 token(s) | 15.778352125s | 6.78 tokens/s | 14.8B | 14 GB | 40960 | 16384 | 23%/77% CPU/GPU |
qwen3:4b | 8 | 41 | 6.125246042s | 28.184709ms | 11 token(s) | 263.891583ms | 41.68 tokens/s | 162 token(s) | 5.83259925s | 27.77 tokens/s | 4.0B | 7.3 GB | 40960 | 16384 | 100% GPU |
qwen3:8b | 8 | 41 | 4.819035209s | 28.02775ms | 11 token(s) | 415.964292ms | 26.44 tokens/s | 74 token(s) | 4.374498s | 16.92 tokens/s | 8.2B | 9.6 GB | 40960 | 16384 | 100% GPU |
smollm2:1.7b | 7 | 36 | 421.675333ms | 15.736041ms | 30 token(s) | 191.923458ms | 156.31 tokens/s | 10 token(s) | 213.384417ms | 46.86 tokens/s | 1.7B | 4.7 GB | 8192 | 16384 | 100% GPU |
smollm2:135m | 7 | 34 | 130.9035ms | 18.733708ms | 31 token(s) | 49.476083ms | 626.57 tokens/s | 10 token(s) | 62.066542ms | 161.12 tokens/s | 134.52M | 1.2 GB | 8192 | 16384 | 100% GPU |
smollm2:360m | 140 | 861 | 2.043382625s | 17.929834ms | 31 token(s) | 82.7325ms | 374.70 tokens/s | 159 token(s) | 1.942111458s | 81.87 tokens/s | 361.82M | 1.9 GB | 8192 | 16384 | 100% GPU |

System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 16384 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 10G + 561M |
Sys OS | Darwin 24.5.0 |
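
The rate columns in the results table appear to be the token count divided by the matching duration; for example, the codellama:7b row reports 11 eval tokens over 5.619758875s, which works out to about 1.96 tokens/s. The sketch below reproduces that arithmetic. It assumes the durations are Go-style strings (such as `480.737875ms` or `7.496044041s`); the helper names `parse_duration` and `tokens_per_second` are hypothetical and not part of Ollama or of the tool that produced this report.

```python
import re

# Seconds per unit for Go-style duration strings (assumed format).
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}
# Multi-letter units are listed first so "ms" is not read as "m".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|µs|ns|h|m|s)")


def parse_duration(text):
    """Convert a duration such as '480.737875ms' or '1m30s' to seconds."""
    text = text.strip()
    parts = _PART.findall(text)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def tokens_per_second(token_count, duration):
    """Rate column = token count / duration in seconds."""
    return token_count / parse_duration(duration)


if __name__ == "__main__":
    # Values copied from the codellama:7b and cogito:3b rows.
    print(f"{tokens_per_second(11, '5.619758875s'):.2f}")   # 1.96
    print(f"{tokens_per_second(21, '1.865782666s'):.2f}")   # 11.26
    print(f"{tokens_per_second(10, '240.957667ms'):.2f}")   # 41.50
```

Recomputing a rate this way is a quick consistency check when comparing rows, since runs with very few tokens (3 or 4 prompt tokens, for instance) yield rates that are sensitive to small timing differences.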