Prompt: (raw) (yaml)
words:1 bytes:3
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 7 | 37 | 7.318774833s | 12.206166ms | 21 token(s) | 1.798892375s | 11.67 tokens/s | 11 token(s) | 5.505033833s | 2.00 tokens/s | 6.7B | 14 GB | 16384 | 32768 | 22%/78% CPU/GPU |
cogito:3b | 13 | 66 | 710.93625ms | 29.87025ms | 11 token(s) | 255.17325ms | 43.11 tokens/s | 17 token(s) | 425.314458ms | 39.97 tokens/s | 3.6B | 8.2 GB | 131072 | 32768 | 100% GPU |
cogito:8b | 7 | 34 | 1.939169875s | 30.937667ms | 11 token(s) | 1.349918167s | 8.15 tokens/s | 10 token(s) | 557.5675ms | 17.94 tokens/s | 8.0B | 11 GB | 131072 | 32768 | 6%/94% CPU/GPU |
deepcoder:1.5b | 8 | 41 | 393.594334ms | 29.569584ms | 4 token(s) | 112.979042ms | 35.40 tokens/s | 16 token(s) | 250.533958ms | 63.86 tokens/s | 1.8B | 3.4 GB | 131072 | 32768 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 379.263542ms | 28.160208ms | 4 token(s) | 108.77325ms | 36.77 tokens/s | 16 token(s) | 241.740209ms | 66.19 tokens/s | 1.8B | 3.4 GB | 131072 | 32768 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 4m16.726814125s | 33.730917ms | 4 token(s) | 22.814853042s | 0.18 tokens/s | 16 token(s) | 3m53.860393708s | 0.07 tokens/s | 14.8B | 18 GB | 131072 | 32768 | 39%/61% CPU/GPU |
deepseek-r1:8b | 8 | 41 | 15.651779667s | 26.174834ms | 3 token(s) | 1.765074459s | 1.70 tokens/s | 193 token(s) | 13.858930708s | 13.93 tokens/s | 8.2B | 13 GB | 131072 | 32768 | 17%/83% CPU/GPU |
dolphin-mistral:7b | 7 | 37 | 877.112333ms | 14.698083ms | 29 token(s) | 432.431417ms | 67.06 tokens/s | 10 token(s) | 429.358791ms | 23.29 tokens/s | 7.2B | 11 GB | 32768 | 32768 | 100% GPU |
dolphin3:8b | 7 | 34 | 2.297492583s | 31.014708ms | 24 token(s) | 1.68611675s | 14.23 tokens/s | 10 token(s) | 579.596125ms | 17.25 tokens/s | 8.0B | 11 GB | 131072 | 32768 | 6%/94% CPU/GPU |
gemma3:1b | 21 | 113 | 580.834667ms | 52.913875ms | 10 token(s) | 83.686958ms | 119.49 tokens/s | 31 token(s) | 443.713417ms | 69.86 tokens/s | 999.89M | 2.1 GB | 32768 | 32768 | 100% GPU |
gemma3:4b | 23 | 124 | 1.510115625s | 51.803583ms | 10 token(s) | 223.249625ms | 44.79 tokens/s | 34 token(s) | 1.234511459s | 27.54 tokens/s | 4.3B | 6.5 GB | 131072 | 32768 | 100% GPU |
gemma3n:e2b | 35 | 179 | 1.981714125s | 54.058167ms | 10 token(s) | 350.426541ms | 28.54 tokens/s | 51 token(s) | 1.576327667s | 32.35 tokens/s | 4.5B | 6.7 GB | 32768 | 32768 | 100% GPU |
gemma3n:e4b | 93 | 591 | 8.058955667s | 67.282792ms | 10 token(s) | 1.138130791s | 8.79 tokens/s | 152 token(s) | 6.852881917s | 22.18 tokens/s | 6.9B | 8.3 GB | 32768 | 32768 | 100% GPU |
gemma:2b | 15 | 71 | 579.041625ms | 28.5505ms | 23 token(s) | 129.086958ms | 178.17 tokens/s | 21 token(s) | 420.783875ms | 49.91 tokens/s | 2.5B | 3.0 GB | 8192 | 32768 | 100% GPU |
granite3.3:2b | 7 | 36 | 541.768083ms | 18.728166ms | 44 token(s) | 318.882042ms | 137.98 tokens/s | 10 token(s) | 203.409292ms | 49.16 tokens/s | 2.5B | 6.7 GB | 131072 | 32768 | 100% GPU |
granite3.3:8b | 32 | 173 | 5.2642965s | 17.91925ms | 44 token(s) | 2.641407625s | 16.66 tokens/s | 38 token(s) | 2.604155083s | 14.59 tokens/s | 8.2B | 14 GB | 131072 | 32768 | 24%/76% CPU/GPU |
hermes3:8b | 31 | 156 | 3.302221209s | 30.288042ms | 10 token(s) | 1.176843042s | 8.50 tokens/s | 38 token(s) | 2.093665708s | 18.15 tokens/s | 8.0B | 11 GB | 131072 | 32768 | 4%/96% CPU/GPU |
llama3.1:8b-instruct-q4_1 | 17 | 83 | 2.422103833s | 31.1585ms | 11 token(s) | 1.144180208s | 9.61 tokens/s | 21 token(s) | 1.245195542s | 16.86 tokens/s | 8.0B | 12 GB | 131072 | 32768 | 6%/94% CPU/GPU |
llama3.2:1b | 15 | 74 | 497.795333ms | 32.086708ms | 26 token(s) | 173.610958ms | 149.76 tokens/s | 18 token(s) | 291.595125ms | 61.73 tokens/s | 1.2B | 5.3 GB | 131072 | 32768 | 100% GPU |
llama3.2:3b | 7 | 36 | 552.623417ms | 31.309417ms | 26 token(s) | 267.727625ms | 97.11 tokens/s | 10 token(s) | 252.955625ms | 39.53 tokens/s | 3.2B | 8.2 GB | 131072 | 32768 | 100% GPU |
llava-llama3:8b | 7 | 34 | 877.102791ms | 33.316083ms | 11 token(s) | 345.162833ms | 31.87 tokens/s | 10 token(s) | 497.937583ms | 20.08 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 44 | 235 | 2.063754542s | 14.184792ms | 11 token(s) | 309.357375ms | 35.56 tokens/s | 55 token(s) | 1.739573s | 31.62 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 27 | 144 | 3.164168s | 12.906375ms | 9 token(s) | 1.229238666s | 7.32 tokens/s | 35 token(s) | 1.92123675s | 18.22 tokens/s | 7.2B | 11 GB | 32768 | 32768 | 5%/95% CPU/GPU |
minicpm-v:8b | 19 | 98 | 1.523135625s | 26.91625ms | 9 token(s) | 339.972208ms | 26.47 tokens/s | 24 token(s) | 1.155715917s | 20.77 tokens/s | 7.6B | 9.7 GB | 32768 | 32768 | 100% GPU |
mistral:7b | 41 | 214 | 4.041069417s | 14.776292ms | 6 token(s) | 1.704108875s | 3.52 tokens/s | 49 token(s) | 2.321473833s | 21.11 tokens/s | 7.2B | 11 GB | 32768 | 32768 | 100% GPU |
mistral:7b-instruct | 46 | 274 | 3.208623625s | 14.369875ms | 5 token(s) | 315.02775ms | 15.87 tokens/s | 59 token(s) | 2.878426166s | 20.50 tokens/s | 7.2B | 11 GB | 32768 | 32768 | 100% GPU |
qwen2.5-coder:7b | 7 | 36 | 936.377958ms | 28.124958ms | 30 token(s) | 432.975625ms | 69.29 tokens/s | 10 token(s) | 474.693334ms | 21.07 tokens/s | 7.6B | 9.0 GB | 32768 | 32768 | 100% GPU |
qwen2.5vl:3b | 7 | 34 | 1.223491625s | 31.164333ms | 21 token(s) | 921.240084ms | 22.80 tokens/s | 10 token(s) | 270.489208ms | 36.97 tokens/s | 3.8B | 8.4 GB | 128000 | 32768 | 100% GPU |
qwen2.5vl:7b | 7 | 36 | 3.07941175s | 28.616542ms | 21 token(s) | 2.54527175s | 8.25 tokens/s | 10 token(s) | 504.85025ms | 19.81 tokens/s | 8.3B | 12 GB | 128000 | 32768 | 28%/72% CPU/GPU |
qwen3:0.6b | 8 | 41 | 1.059536125s | 25.274917ms | 11 token(s) | 139.424916ms | 78.90 tokens/s | 83 token(s) | 894.212334ms | 92.82 tokens/s | 751.63M | 6.1 GB | 40960 | 32768 | 100% GPU |
qwen3:1.7b | 7 | 36 | 1.987120291s | 27.959833ms | 11 token(s) | 187.013792ms | 58.82 tokens/s | 94 token(s) | 1.771676375s | 53.06 tokens/s | 2.0B | 6.9 GB | 40960 | 32768 | 100% GPU |
qwen3:14b | 6 | 44 | | | | | | | | | 14.8B | 19 GB | 40960 | 32768 | 43%/57% CPU/GPU |
qwen3:4b | 8 | 41 | 4.176501792s | 25.886667ms | 11 token(s) | 337.120708ms | 32.63 tokens/s | 107 token(s) | 3.812862209s | 28.06 tokens/s | 4.0B | 11 GB | 40960 | 32768 | 100% GPU |
qwen3:8b | 20 | 102 | 7.803055125s | 27.824584ms | 11 token(s) | 1.71373825s | 6.42 tokens/s | 87 token(s) | 6.060735542s | 14.35 tokens/s | 8.2B | 13 GB | 40960 | 32768 | 17%/83% CPU/GPU |
smollm2:1.7b | 7 | 36 | 398.137042ms | 17.136583ms | 30 token(s) | 164.6305ms | 182.23 tokens/s | 10 token(s) | 215.675042ms | 46.37 tokens/s | 1.7B | 4.7 GB | 8192 | 32768 | 100% GPU |
smollm2:135m | 7 | 34 | 131.097709ms | 18.056209ms | 31 token(s) | 50.118209ms | 618.54 tokens/s | 10 token(s) | 62.263583ms | 160.61 tokens/s | 134.52M | 1.2 GB | 8192 | 32768 | 100% GPU |
smollm2:360m | 7 | 34 | 203.504125ms | 17.938958ms | 31 token(s) | 80.67375ms | 384.26 tokens/s | 10 token(s) | 103.984ms | 96.17 tokens/s | 361.82M | 1.9 GB | 8192 | 32768 | 100% GPU |
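
The two rate columns appear to be the token count divided by the matching duration. The following is a minimal Python sketch for re-checking a row, assuming the durations use Go-style formatting (for example `4m16.726814125s` or `425.314458ms`). The helper names `parse_duration` and `tokens_per_second` are hypothetical and are not part of Ollama.

```python
import re

# Seconds per unit for the Go-style duration strings used in the table.
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}
# Two-letter units are listed first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|µs|ns|h|m|s)")


def parse_duration(text):
    """Convert a string such as '4m16.726814125s' to seconds (hypothetical helper)."""
    parts = _PART.findall(text)
    # Reject input that the pattern does not cover completely.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def tokens_per_second(count, duration):
    """Token count divided by a duration string, as the rate columns appear to be."""
    return count / parse_duration(duration)


# cogito:3b, eval columns: 17 tokens in 425.314458ms
print(round(tokens_per_second(17, "425.314458ms"), 2))
# deepseek-r1:14b, eval columns: 16 tokens in 3m53.860393708s
print(round(tokens_per_second(16, "3m53.860393708s"), 2))
```

The two calls should reproduce the reported eval rates for those rows (39.97 and 0.07 tokens/s). Rows with empty timing cells, such as qwen3:14b, have no duration to parse and cannot be checked this way.
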
System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 32768 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 7638M + 1268M |
Sys OS | Darwin 24.5.0 |