Prompt: (raw) (yaml)
words:1 bytes:3
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 7 | 37 | 7.309737541s | 11.341ms | 21 token(s) | 1.921307542s | 10.93 tokens/s | 11 token(s) | 5.373215375s | 2.05 tokens/s | 6.7B | 14 GB | 16384 | 65536 | 22%/78% CPU/GPU |
cogito:3b | 7 | 34 | 1.740782125s | 29.215917ms | 11 token(s) | 1.261909375s | 8.72 tokens/s | 10 token(s) | 448.8695ms | 22.28 tokens/s | 3.6B | 13 GB | 131072 | 65536 | 19%/81% CPU/GPU |
cogito:8b | 7 | 34 | 16.729648166s | 31.211416ms | 11 token(s) | 2.761316084s | 3.98 tokens/s | 10 token(s) | 13.932685833s | 0.72 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
deepcoder:1.5b | 8 | 41 | 530.801ms | 27.455625ms | 4 token(s) | 148.705167ms | 26.90 tokens/s | 16 token(s) | 354.085375ms | 45.19 tokens/s | 1.8B | 5.2 GB | 131072 | 65536 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 407.869709ms | 28.307125ms | 4 token(s) | 138.396958ms | 28.90 tokens/s | 16 token(s) | 240.608083ms | 66.50 tokens/s | 1.8B | 5.2 GB | 131072 | 65536 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 4m47.48131575s | 26.51325ms | 4 token(s) | 14.337323208s | 0.28 tokens/s | 16 token(s) | 4m33.104934875s | 0.06 tokens/s | 14.8B | 27 GB | 131072 | 65536 | 59%/41% CPU/GPU |
deepseek-r1:8b | 4 | 72 | 1m12.251077459s | 32.131542ms | 3 token(s) | 3.165768208s | 0.95 tokens/s | 208 token(s) | 1m9.049341167s | 3.01 tokens/s | 8.2B | 21 GB | 131072 | 65536 | 48%/52% CPU/GPU |
dolphin-mistral:7b | 7 | 37 | 880.136458ms | 13.437042ms | 29 token(s) | 428.888958ms | 67.62 tokens/s | 10 token(s) | 437.092042ms | 22.88 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
dolphin3:8b | 7 | 34 | 13.58731s | 34.307958ms | 24 token(s) | 2.948011917s | 8.14 tokens/s | 10 token(s) | 10.599175333s | 0.94 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
gemma3:1b | 21 | 109 | 744.9395ms | 51.957666ms | 10 token(s) | 90.721ms | 110.23 tokens/s | 31 token(s) | 601.734208ms | 51.52 tokens/s | 999.89M | 2.1 GB | 32768 | 65536 | 100% GPU |
gemma3:4b | 43 | 262 | 2.317934667s | 52.243375ms | 10 token(s) | 282.760542ms | 35.37 tokens/s | 60 token(s) | 1.9823245s | 30.27 tokens/s | 4.3B | 7.8 GB | 131072 | 65536 | 100% GPU |
gemma3n:e2b | 30 | 166 | 1.798621375s | 52.558291ms | 10 token(s) | 299.000709ms | 33.44 tokens/s | 46 token(s) | 1.446518625s | 31.80 tokens/s | 4.5B | 6.7 GB | 32768 | 65536 | 100% GPU |
gemma3n:e4b | 71 | 445 | 6.407183875s | 51.019042ms | 10 token(s) | 773.02575ms | 12.94 tokens/s | 124 token(s) | 5.58261s | 22.21 tokens/s | 6.9B | 8.3 GB | 32768 | 65536 | 100% GPU |
gemma:2b | 15 | 68 | 574.117917ms | 30.510083ms | 23 token(s) | 129.751042ms | 177.26 tokens/s | 21 token(s) | 413.340833ms | 50.81 tokens/s | 2.5B | 3.0 GB | 8192 | 65536 | 100% GPU |
granite3.3:2b | 0 | 0 | | | | | | | | | 2.5B | 11 GB | 131072 | 65536 | 100% GPU |
granite3.3:8b | 16 | 89 | 45.141988708s | 13.080208ms | 44 token(s) | 4.565584834s | 9.64 tokens/s | 21 token(s) | 40.559959875s | 0.52 tokens/s | 8.2B | 23 GB | 131072 | 65536 | 53%/47% CPU/GPU |
hermes3:8b | 27 | 138 | 24.264792708s | 30.54525ms | 10 token(s) | 1.901765875s | 5.26 tokens/s | 34 token(s) | 22.32780225s | 1.52 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 39%/61% CPU/GPU |
llama3.1:8b-instruct-q4_1 | 17 | 83 | 51.654558s | 32.409458ms | 11 token(s) | 2.129417958s | 5.17 tokens/s | 21 token(s) | 49.483365458s | 0.42 tokens/s | 8.0B | 18 GB | 131072 | 65536 | 41%/59% CPU/GPU |
llama3.2:1b | 15 | 74 | 624.788875ms | 29.985041ms | 26 token(s) | 264.679792ms | 98.23 tokens/s | 18 token(s) | 329.4825ms | 54.63 tokens/s | 1.2B | 8.6 GB | 131072 | 65536 | 100% GPU |
llama3.2:3b | 6 | 27 | 1.773693292s | 30.544042ms | 26 token(s) | 1.353082208s | 19.22 tokens/s | 8 token(s) | 389.245667ms | 20.55 tokens/s | 3.2B | 13 GB | 131072 | 65536 | 19%/81% CPU/GPU |
llava-llama3:8b | 7 | 34 | 916.476ms | 30.556125ms | 11 token(s) | 342.036958ms | 32.16 tokens/s | 10 token(s) | 543.397417ms | 18.40 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 59 | 297 | 2.708403542s | 7.662292ms | 11 token(s) | 261.366917ms | 42.09 tokens/s | 72 token(s) | 2.438946416s | 29.52 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 17 | 90 | 2.371292875s | 12.264041ms | 9 token(s) | 1.178164s | 7.64 tokens/s | 22 token(s) | 1.1801555s | 18.64 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 5%/95% CPU/GPU |
minicpm-v:8b | 19 | 98 | 1.744411917s | 27.714083ms | 9 token(s) | 363.045917ms | 24.79 tokens/s | 24 token(s) | 1.35306725s | 17.74 tokens/s | 7.6B | 9.7 GB | 32768 | 65536 | 100% GPU |
mistral:7b | 42 | 229 | 4.747577667s | 15.93275ms | 6 token(s) | 1.907520417s | 3.15 tokens/s | 54 token(s) | 2.82342525s | 19.13 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
mistral:7b-instruct | 63 | 336 | 4.916012458s | 14.876667ms | 5 token(s) | 382.868125ms | 13.06 tokens/s | 83 token(s) | 4.517585s | 18.37 tokens/s | 7.2B | 11 GB | 32768 | 65536 | 100% GPU |
qwen2.5-coder:7b | 7 | 36 | 1.122770584s | 27.414042ms | 30 token(s) | 579.547208ms | 51.76 tokens/s | 10 token(s) | 515.099125ms | 19.41 tokens/s | 7.6B | 9.0 GB | 32768 | 65536 | 100% GPU |
qwen2.5vl:3b | 7 | 34 | 2.142355583s | 29.745ms | 21 token(s) | 1.8427445s | 11.40 tokens/s | 10 token(s) | 269.285458ms | 37.14 tokens/s | 3.8B | 11 GB | 128000 | 65536 | 100% GPU |
qwen2.5vl:7b | 0 | 0 | | | | | | | | | 8.3B | | 128000 | | |
qwen3:0.6b | 8 | 41 | 1.477805459s | 26.461875ms | 11 token(s) | 185.229458ms | 59.39 tokens/s | 132 token(s) | 1.265550084s | 104.30 tokens/s | 751.63M | 7.4 GB | 40960 | 65536 | 100% GPU |
qwen3:1.7b | 8 | 41 | 3.16612525s | 27.420875ms | 11 token(s) | 231.464ms | 47.52 tokens/s | 128 token(s) | 2.906553292s | 44.04 tokens/s | 2.0B | 8.2 GB | 40960 | 65536 | 100% GPU |
qwen3:14b | 6 | 44 | | | | | | | | | 14.8B | 22 GB | 40960 | 65536 | 48%/52% CPU/GPU |
qwen3:4b | 8 | 41 | 5.7548455s | 26.076334ms | 11 token(s) | 887.559625ms | 12.39 tokens/s | 95 token(s) | 4.840550042s | 19.63 tokens/s | 4.0B | 13 GB | 40960 | 65536 | 16%/84% CPU/GPU |
qwen3:8b | 8 | 41 | 40.943485083s | 28.261917ms | 11 token(s) | 2.051716833s | 5.36 tokens/s | 100 token(s) | 38.859804792s | 2.57 tokens/s | 8.2B | 15 GB | 40960 | 65536 | 29%/71% CPU/GPU |
smollm2:1.7b | 7 | 36 | 425.299792ms | 18.228875ms | 30 token(s) | 170.040958ms | 176.43 tokens/s | 10 token(s) | 236.203792ms | 42.34 tokens/s | 1.7B | 4.7 GB | 8192 | 65536 | 100% GPU |
smollm2:135m | 7 | 34 | 136.103542ms | 18.848667ms | 31 token(s) | 50.467084ms | 614.26 tokens/s | 10 token(s) | 66.044416ms | 151.41 tokens/s | 134.52M | 1.2 GB | 8192 | 65536 | 100% GPU |
smollm2:360m | 7 | 34 | 206.924167ms | 16.071584ms | 31 token(s) | 78.222833ms | 396.30 tokens/s | 10 token(s) | 111.967292ms | 89.31 tokens/s | 361.82M | 1.9 GB | 8192 | 65536 | 100% GPU |
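
The two rate columns appear to be the matching token count divided by the matching duration. For codellama:7b, 11 tokens over 5.373215375s gives about 2.05 tokens/s. The sketch below recomputes a rate from the cells as they are printed in the table. The helper names `parse_duration`, `token_count`, and `rate` are hypothetical and not part of Ollama; the parser assumes Go-style duration strings such as `4m47.48131575s` or `530.801ms`.

```python
import re

# One "<number><unit>" component of a Go-style duration string.
# "ms" is listed before "m" so that milliseconds are not read as minutes.
_PART = re.compile(r"(\d+(?:\.\d+)?)(h|ms|µs|us|ns|m|s)")
_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
            "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}


def parse_duration(text: str) -> float:
    """Convert a duration cell such as '4m47.48131575s' to seconds."""
    text = text.strip()
    parts = _PART.findall(text)
    # Reject input that the pattern only partially covers.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(float(num) * _SECONDS[unit] for num, unit in parts)


def token_count(text: str) -> int:
    """Extract the integer from a count cell such as '21 token(s)'."""
    return int(text.split()[0])


def rate(count_cell: str, duration_cell: str) -> float:
    """Tokens per second from a count cell and a duration cell."""
    return token_count(count_cell) / parse_duration(duration_cell)


if __name__ == "__main__":
    # codellama:7b row: the table reports 10.93 and 2.05 tokens/s.
    print(round(rate("21 token(s)", "1.921307542s"), 2))
    print(round(rate("11 token(s)", "5.373215375s"), 2))
    # deepseek-r1:14b row: the table reports 0.06 tokens/s.
    print(round(rate("16 token(s)", "4m33.104934875s"), 2))
```

Rows with empty timing cells have no count or duration to divide, so a caller would need to skip them before calling `rate`.
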
System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 65536 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 8097M + 770M |
Sys OS | Darwin 24.5.0 |