Prompt: 1 word, 3 bytes
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
codellama:7b | 1 | 9 | 3.33205325s | 13.606167ms | 21 token(s) | 1.738607958s | 12.08 tokens/s | 4 token(s) | 1.577982084s | 2.53 tokens/s | 6.7B | 14 GB | 16384 | 131072 | 22%/78% CPU/GPU |
cogito:3b | 7 | 34 | 8.5515675s | 39.751208ms | 11 token(s) | 1.86826875s | 5.89 tokens/s | 10 token(s) | 6.642460458s | 1.51 tokens/s | 3.6B | 24 GB | 131072 | 131072 | 56%/44% CPU/GPU |
cogito:8b | 0 | 0 | | | | | | | | | 8.0B | | 131072 | | |
deepcoder:1.5b | 8 | 41 | 532.917875ms | 28.117875ms | 4 token(s) | 258.6165ms | 15.47 tokens/s | 16 token(s) | 245.670958ms | 65.13 tokens/s | 1.8B | 8.9 GB | 131072 | 131072 | 100% GPU |
deepseek-r1:1.5b | 8 | 41 | 525.881417ms | 26.917375ms | 4 token(s) | 252.666958ms | 15.83 tokens/s | 16 token(s) | 245.607542ms | 65.14 tokens/s | 1.8B | 8.9 GB | 131072 | 131072 | 100% GPU |
deepseek-r1:14b | 8 | 41 | 8.527335208s | 26.985666ms | 4 token(s) | 6.268131875s | 0.64 tokens/s | 16 token(s) | 2.231549792s | 7.17 tokens/s | 14.8B | 34 GB | 131072 | 131072 | 100% CPU |
deepseek-r1:8b | 9 | 43 | 14.040474791s | 27.507083ms | 3 token(s) | 583.717708ms | 5.14 tokens/s | 166 token(s) | 13.428613458s | 12.36 tokens/s | 8.2B | 24 GB | 131072 | 131072 | 100% CPU |
dolphin-mistral:7b | 7 | 37 | 1.650124709s | 15.256292ms | 29 token(s) | 1.13607825s | 25.53 tokens/s | 10 token(s) | 497.67725ms | 20.09 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
dolphin3:8b | 0 | 0 | | | | | | | | | 8.0B | | 131072 | | |
gemma3:1b | 23 | 121 | 599.863292ms | 51.481708ms | 10 token(s) | 85.871ms | 116.45 tokens/s | 33 token(s) | 461.831584ms | 71.45 tokens/s | 999.89M | 2.1 GB | 32768 | 131072 | 100% GPU |
gemma3:4b | 49 | 280 | 3.149748333s | 53.2105ms | 10 token(s) | 595.500333ms | 16.79 tokens/s | 74 token(s) | 2.500543125s | 29.59 tokens/s | 4.3B | 10 GB | 131072 | 131072 | 100% GPU |
gemma3n:e2b | 26 | 140 | 2.080287625s | 53.687709ms | 10 token(s) | 776.510375ms | 12.88 tokens/s | 38 token(s) | 1.248489292s | 30.44 tokens/s | 4.5B | 6.7 GB | 32768 | 131072 | 100% GPU |
gemma3n:e4b | 25 | 136 | 2.96821875s | 54.37525ms | 10 token(s) | 1.244132708s | 8.04 tokens/s | 38 token(s) | 1.667803167s | 22.78 tokens/s | 6.9B | 8.3 GB | 32768 | 131072 | 100% GPU |
gemma:2b | 16 | 72 | 593.817583ms | 30.766708ms | 23 token(s) | 127.855208ms | 179.89 tokens/s | 22 token(s) | 434.647834ms | 50.62 tokens/s | 2.5B | 3.0 GB | 8192 | 131072 | 100% GPU |
granite3.3:2b | 0 | 0 | | | | | | | | | 2.5B | | 131072 | | |
granite3.3:8b | 25 | 140 | 7.559457166s | 18.633ms | 44 token(s) | 4.883007667s | 9.01 tokens/s | 33 token(s) | 2.657159s | 12.42 tokens/s | 8.2B | 26 GB | 131072 | 131072 | 100% CPU |
hermes3:8b | 0 | 0 | | | | | | | | | 8.0B | | 131072 | | |
llama3.1:8b-instruct-q4_1 | 0 | 0 | | | | | | | | | 8.0B | | 131072 | | |
llama3.2:1b | 0 | 0 | | | | | | | | | 1.2B | | 131072 | | |
llama3.2:3b | 7 | 36 | 3.055374s | 31.883333ms | 26 token(s) | 2.293372833s | 11.34 tokens/s | 10 token(s) | 729.4025ms | 13.71 tokens/s | 3.2B | 24 GB | 131072 | 131072 | 56%/44% CPU/GPU |
llava-llama3:8b | 23 | 123 | 1.974579416s | 32.9305ms | 11 token(s) | 345.309958ms | 31.86 tokens/s | 27 token(s) | 1.595738417s | 16.92 tokens/s | 8.0B | 6.8 GB | 8192 | 4096 | 100% GPU |
llava-phi3:3.8b | 42 | 210 | 2.04440925s | 13.461208ms | 11 token(s) | 252.911334ms | 43.49 tokens/s | 52 token(s) | 1.777358791s | 29.26 tokens/s | 3.8B | 5.4 GB | 4096 | 4096 | 100% GPU |
llava:7b | 30 | 163 | 3.305790084s | 12.141084ms | 9 token(s) | 1.177221208s | 7.65 tokens/s | 38 token(s) | 2.115605875s | 17.96 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 5%/95% CPU/GPU |
minicpm-v:8b | 19 | 98 | 1.7125485s | 26.075375ms | 9 token(s) | 362.210208ms | 24.85 tokens/s | 24 token(s) | 1.323743042s | 18.13 tokens/s | 7.6B | 9.7 GB | 32768 | 131072 | 100% GPU |
mistral:7b | 92 | 512 | 9.096838667s | 14.043208ms | 6 token(s) | 1.984108125s | 3.02 tokens/s | 135 token(s) | 7.0979365s | 19.02 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
mistral:7b-instruct | 56 | 300 | 4.37924725s | 13.955667ms | 5 token(s) | 365.501792ms | 13.68 tokens/s | 75 token(s) | 3.999080916s | 18.75 tokens/s | 7.2B | 11 GB | 32768 | 131072 | 100% GPU |
qwen2.5-coder:7b | 7 | 36 | 1.03598525s | 26.605042ms | 30 token(s) | 486.162167ms | 61.71 tokens/s | 10 token(s) | 522.6945ms | 19.13 tokens/s | 7.6B | 9.0 GB | 32768 | 131072 | 100% GPU |
qwen2.5vl:3b | 0 | 0 | | | | | | | | | 3.8B | | 128000 | | |
qwen2.5vl:7b | 0 | 0 | | | | | | | | | 8.3B | | 128000 | | |
qwen3:0.6b | 8 | 38 | 1.367429459s | 28.195209ms | 11 token(s) | 193.224ms | 56.93 tokens/s | 121 token(s) | 1.1454755s | 105.63 tokens/s | 751.63M | 7.4 GB | 40960 | 131072 | 100% GPU |
qwen3:1.7b | 8 | 41 | 1.744778875s | 27.449792ms | 11 token(s) | 283.187041ms | 38.84 tokens/s | 78 token(s) | 1.433540875s | 54.41 tokens/s | 2.0B | 8.2 GB | 40960 | 131072 | 100% GPU |
qwen3:14b | 6 | 44 | | | | | | | | | 14.8B | 22 GB | 40960 | 131072 | 48%/52% CPU/GPU |
qwen3:4b | 8 | 41 | 8.830962417s | 26.88ms | 11 token(s) | 822.504333ms | 13.37 tokens/s | 182 token(s) | 7.980914584s | 22.80 tokens/s | 4.0B | 13 GB | 40960 | 131072 | 16%/84% CPU/GPU |
qwen3:8b | 8 | 41 | 9.142129084s | 26.981667ms | 11 token(s) | 1.740455833s | 6.32 tokens/s | 94 token(s) | 7.374071s | 12.75 tokens/s | 8.2B | 15 GB | 40960 | 131072 | 29%/71% CPU/GPU |
smollm2:1.7b | 7 | 34 | 403.90575ms | 13.537167ms | 30 token(s) | 159.675958ms | 187.88 tokens/s | 10 token(s) | 230.102125ms | 43.46 tokens/s | 1.7B | 4.7 GB | 8192 | 131072 | 100% GPU |
smollm2:135m | 7 | 31 | 120.587209ms | 16.863959ms | 31 token(s) | 45.770541ms | 677.29 tokens/s | 10 token(s) | 57.4175ms | 174.16 tokens/s | 134.52M | 1.2 GB | 8192 | 131072 | 100% GPU |
smollm2:360m | 25 | 146 | 472.410916ms | 16.542291ms | 31 token(s) | 67.566ms | 458.81 tokens/s | 33 token(s) | 387.743292ms | 85.11 tokens/s | 361.82M | 1.9 GB | 8192 | 131072 | 100% GPU |

System | |
---|---|
Ollama proc | 100% GPU |
Ollama context | 131072 |
Ollama version | 0.9.7-rc0 |
Multirun timeout | 300 seconds |
Sys arch | arm64 |
Sys processor | arm |
Sys memory | 6978M + 401M |
Sys OS | Darwin 24.5.0 |
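
The rate columns in the results table appear to be the token count divided by the matching duration (for example, 4 tokens over 1.577982084s gives about 2.53 tokens/s for codellama:7b). The sketch below reproduces that arithmetic from the table cells. It assumes the durations are Go-style strings such as `13.606167ms` and that counts are written as `N token(s)`; the helper names `parse_duration` and `tokens_per_second` are hypothetical and not part of Ollama or of the tool that produced this report.

```python
import re

# Seconds per unit for Go-style duration strings ("245.670958ms", "1m2.5s").
_UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
          "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration cell such as '1.577982084s' to seconds."""
    text = text.strip()
    parts = _PART.findall(text)
    # Reject anything that is not fully covered by number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNITS[unit] for num, unit in parts)


def tokens_per_second(count_cell: str, duration_cell: str) -> float:
    """Recompute a rate column from a count cell and a duration cell."""
    count = int(count_cell.split()[0])  # "4 token(s)" -> 4
    return count / parse_duration(duration_cell)


if __name__ == "__main__":
    # Eval columns of the codellama:7b row; the table reports 2.53 tokens/s.
    rate = tokens_per_second("4 token(s)", "1.577982084s")
    print(f"{rate:.2f} tokens/s")
```

The same helper can be used to cross-check a row: for codellama:7b, load, prompt eval, and eval durations sum to roughly the reported total duration, with a small remainder that is presumably overhead not broken out in the table.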