Prompt: (raw) (yaml)
words:1 bytes:3
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 1 | 9 | 497.813917ms | 14.659833ms | 21 token(s) | 344.875125ms | 60.89 tokens/s | 4 token(s) | 137.478541ms | 29.10 tokens/s | 6.7B | 6.9 GB | 16384 | 4096 | 100% GPU |
cogito:3b | 7 | 34 | 469.960625ms | 30.296667ms | 11 token(s) | 197.832459ms | 55.60 tokens/s | 10 token(s) | 241.289791ms | 41.44 tokens/s | 3.6B | 3.4 GB | 131072 | 4096 | 100% GPU |
cogito:8b | 7 | 34 | 966.615084ms | 31.19425ms | 11 token(s) | 379.475375ms | 28.99 tokens/s | 10 token(s) | 555.347208ms | 18.01 tokens/s | 8.0B | 6.2 GB | 131072 | 4096 | 100% GPU |
deepcoder:1.5b | 8 | 41 | 374.796833ms | 27.647541ms | 4 token(s) | 99.816875ms | 40.07 tokens/s | 16 token(s) | 246.866541ms | 64.81 tokens/s | 1.8B | 2.0 GB | 131072 | 4096 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 375.294583ms | 27.099417ms | 4 token(s) | 99.749625ms | 40.10 tokens/s | 16 token(s) | 247.906458ms | 64.54 tokens/s | 1.8B | 2.0 GB | 131072 | 4096 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 6.202296791s | 26.651625ms | 4 token(s) | 4.655969417s | 0.86 tokens/s | 16 token(s) | 1.51909775s | 10.53 tokens/s | 14.8B | 10 GB | 131072 | 4096 | 100% GPU |
deepseek-r1:8b | 4 | 141 | 16.99610625s | 28.172667ms | 3 token(s) | 856.395917ms | 3.50 tokens/s | 272 token(s) | 16.110962458s | 16.88 tokens/s | 8.2B | 6.6 GB | 131072 | 4096 | 100% GPU |
dolphin-mistral:7b | 7 | 37 | 797.633042ms | 14.916458ms | 29 token(s) | 352.46375ms | 82.28 tokens/s | 10 token(s) | 429.576958ms | 23.28 tokens/s | 7.2B | 5.6 GB | 32768 | 4096 | 100% GPU |
dolphin3:8b | 7 | 36 | 927.930125ms | 31.250584ms | 24 token(s) | 404.7735ms | 59.29 tokens/s | 10 token(s) | 491.312083ms | 20.35 tokens/s | 8.0B | 6.2 GB | 131072 | 4096 | 100% GPU |
gemma3:1b | 19 | 104 | 654.974167ms | 54.890959ms | 10 token(s) | 96.766292ms | 103.34 tokens/s | 29 token(s) | 502.676667ms | 57.69 tokens/s | 999.89M | 1.9 GB | 32768 | 4096 | 100% GPU |
gemma3:4b | 23 | 124 | 1.544395209s | 52.220334ms | 10 token(s) | 258.3655ms | 38.70 tokens/s | 34 token(s) | 1.233234959s | 27.57 tokens/s | 4.3B | 5.8 GB | 131072 | 4096 | 100% GPU |
gemma3n:e2b | 27 | 148 | 1.58956625s | 51.818792ms | 10 token(s) | 339.500791ms | 29.46 tokens/s | 38 token(s) | 1.197678167s | 31.73 tokens/s | 4.5B | 4.6 GB | 32768 | 4096 | 100% GPU |
gemma3n:e4b | 40 | 215 | 3.511673666s | 63.349875ms | 10 token(s) | 813.537917ms | 12.29 tokens/s | 58 token(s) | 2.634209083s | 22.02 tokens/s | 6.9B | 5.9 GB | 32768 | 4096 | 100% GPU |
gemma:2b | 15 | 72 | 577.13675ms | 30.15225ms | 23 token(s) | 133.882542ms | 171.79 tokens/s | 21 token(s) | 412.570916ms | 50.90 tokens/s | 2.5B | 2.9 GB | 8192 | 4096 | 100% GPU |
granite3.3:2b | 7 | 36 | 439.64875ms | 16.752791ms | 44 token(s) | 219.724584ms | 200.25 tokens/s | 10 token(s) | 202.501916ms | 49.38 tokens/s | 2.5B | 2.7 GB | 131072 | 4096 | 100% GPU |
granite3.3:8b | 20 | 112 | 2.143904s | 18.239125ms | 44 token(s) | 598.803584ms | 73.48 tokens/s | 25 token(s) | 1.526170416s | 16.38 tokens/s | 8.2B | 6.7 GB | 131072 | 4096 | 100% GPU |
hermes3:8b | 27 | 138 | 2.203283041s | 31.189333ms | 10 token(s) | 413.564875ms | 24.18 tokens/s | 34 token(s) | 1.757992083s | 19.34 tokens/s | 8.0B | 5.9 GB | 131072 | 4096 | 100% GPU |
llama3.1:8b-instruct-q4_1 | 17 | 83 | 1.664660958s | 31.537ms | 11 token(s) | 395.692208ms | 27.80 tokens/s | 21 token(s) | 1.236849209s | 16.98 tokens/s | 8.0B | 6.3 GB | 131072 | 4096 | 100% GPU |
llama3.2:1b | 15 | 74 | 436.7785ms | 30.121125ms | 26 token(s) | 116.6285ms | 222.93 tokens/s | 18 token(s) | 289.4215ms | 62.19 tokens/s | 1.2B | 2.4 GB | 131072 | 4096 | 100% GPU |
llama3.2:3b | 6 | 29 | 435.846917ms | 31.99675ms | 26 token(s) | 211.301625ms | 123.05 tokens/s | 8 token(s) | 191.934125ms | 41.68 tokens/s | 3.2B | 3.4 GB | 131072 | 4096 | 100% GPU |
llava-llama3:8b | 19 | 102 | 1.715882s | 33.016416ms | 11 token(s) | 354.78375ms | 31.00 tokens/s | 24 token(s) | 1.327556542s | 18.08 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 35 | 174 | 1.671593292s | 12.948709ms | 11 token(s) | 314.039542ms | 35.03 tokens/s | 42 token(s) | 1.343985083s | 31.25 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 7 | 36 | 813.578792ms | 11.852792ms | 9 token(s) | 315.916ms | 28.49 tokens/s | 11 token(s) | 485.060083ms | 22.68 tokens/s | 7.2B | 6.2 GB | 32768 | 4096 | 100% GPU |
minicpm-v:8b | 19 | 98 | 1.56258875s | 30.092709ms | 9 token(s) | 299.036375ms | 30.10 tokens/s | 24 token(s) | 1.232852083s | 19.47 tokens/s | 7.6B | 6.4 GB | 32768 | 4096 | 100% GPU |
mistral:7b | 26 | 137 | 2.065637708s | 14.618042ms | 6 token(s) | 280.39025ms | 21.40 tokens/s | 35 token(s) | 1.769933584s | 19.77 tokens/s | 7.2B | 5.6 GB | 32768 | 4096 | 100% GPU |
mistral:7b-instruct | 58 | 302 | 3.743591167s | 14.958625ms | 5 token(s) | 215.478833ms | 23.20 tokens/s | 73 token(s) | 3.512449959s | 20.78 tokens/s | 7.2B | 5.6 GB | 32768 | 4096 | 100% GPU |
qwen2.5-coder:7b | 26 | 134 | 2.093362334s | 28.237625ms | 30 token(s) | 349.5695ms | 85.82 tokens/s | 31 token(s) | 1.714970375s | 18.08 tokens/s | 7.6B | 5.6 GB | 32768 | 4096 | 100% GPU |
qwen2.5vl:3b | 7 | 34 | 513.824125ms | 33.121209ms | 21 token(s) | 242.067208ms | 86.75 tokens/s | 10 token(s) | 238.036667ms | 42.01 tokens/s | 3.8B | 5.9 GB | 128000 | 4096 | 100% GPU |
qwen2.5vl:7b | 7 | 36 | 882.426041ms | 30.229041ms | 21 token(s) | 382.767542ms | 54.86 tokens/s | 10 token(s) | 468.84975ms | 21.33 tokens/s | 8.3B | 8.6 GB | 128000 | 4096 | 100% GPU |
qwen3:0.6b | 8 | 41 | 1.006703834s | 28.267709ms | 11 token(s) | 72.004083ms | 152.77 tokens/s | 100 token(s) | 905.897375ms | 110.39 tokens/s | 751.63M | 1.6 GB | 40960 | 4096 | 100% GPU |
qwen3:1.7b | 8 | 41 | 1.6133555s | 28.056458ms | 11 token(s) | 125.134792ms | 87.91 tokens/s | 79 token(s) | 1.459613041s | 54.12 tokens/s | 2.0B | 2.4 GB | 40960 | 4096 | 100% GPU |
qwen3:14b | 8 | 41 | 13.764914708s | 28.407791ms | 11 token(s) | 4.372116292s | 2.52 tokens/s | 92 token(s) | 9.363718833s | 9.83 tokens/s | 14.8B | 10 GB | 40960 | 4096 | 100% GPU |
qwen3:4b | 8 | 41 | 5.392521541s | 26.396375ms | 11 token(s) | 280.079375ms | 39.27 tokens/s | 141 token(s) | 5.085552458s | 27.73 tokens/s | 4.0B | 4.2 GB | 40960 | 4096 | 100% GPU |
qwen3:8b | 8 | 41 | 8.055148792s | 28.073459ms | 11 token(s) | 369.810833ms | 29.74 tokens/s | 131 token(s) | 7.656716708s | 17.11 tokens/s | 8.2B | 6.6 GB | 40960 | 4096 | 100% GPU |
smollm2:1.7b | 18 | 100 | 680.658084ms | 14.208167ms | 30 token(s) | 167.3705ms | 179.24 tokens/s | 22 token(s) | 498.197333ms | 44.16 tokens/s | 1.7B | 3.6 GB | 8192 | 4096 | 100% GPU |
smollm2:135m | 47 | 266 | 435.152458ms | 18.046417ms | 31 token(s) | 38.752958ms | 799.94 tokens/s | 62 token(s) | 377.741875ms | 164.13 tokens/s | 134.52M | 1.0 GB | 8192 | 4096 | 100% GPU |
smollm2:360m | 11 | 57 | 264.989792ms | 18.488167ms | 31 token(s) | 71.037625ms | 436.39 tokens/s | 16 token(s) | 174.809791ms | 91.53 tokens/s | 361.82M | 1.6 GB | 8192 | 4096 | 100% GPU |

System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 4096 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 11G + 887M |
Sys OS | Darwin 24.5.0 |
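
The rate columns in the results table appear to be the token count divided by the matching duration (for example, 21 tokens over 344.875125ms gives 60.89 tokens/s for codellama:7b). The sketch below reproduces that arithmetic so individual rows can be sanity-checked. It assumes the durations are Go-style strings (`ms`, `s`, and so on) and the counts are formatted like `21 token(s)`; the function names `parse_duration` and `tokens_per_second` are hypothetical helpers, not part of Ollama or of the tool that produced this page.

```python
import re

# Multipliers to seconds for Go-style duration units (assumed format).
_UNITS = {
    "ns": 1e-9,
    "us": 1e-6,
    "\u00b5s": 1e-6,  # micro sign, as Go prints it
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}

# Try longer unit names first so "ms" is not read as "m" followed by "s".
_UNIT_PATTERN = "|".join(sorted((re.escape(u) for u in _UNITS), key=len, reverse=True))
_PART = re.compile(r"(\d+(?:\.\d+)?)(" + _UNIT_PATTERN + r")")


def parse_duration(text):
    """Convert a duration string such as '344.875125ms' or '1m30s' to seconds."""
    text = text.strip()
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            raise ValueError(f"unrecognized duration: {text!r}")
        total += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"unrecognized duration: {text!r}")
    return total


def tokens_per_second(count_text, duration_text):
    """Compute a rate from cells like '21 token(s)' and '344.875125ms'."""
    count = int(count_text.split()[0])
    seconds = parse_duration(duration_text)
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return count / seconds


if __name__ == "__main__":
    # codellama:7b prompt eval: 21 token(s) in 344.875125ms
    print(round(tokens_per_second("21 token(s)", "344.875125ms"), 2))  # 60.89
```

Checking a row this way is mainly useful for spotting copy errors in the table. It does not explain why some models show a low prompt eval rate (for example deepseek-r1:14b at 0.86 tokens/s), which the table alone cannot answer.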