Prompt: (raw) (yaml)
words:2929 bytes:22955
Model | Response words | Response bytes | Total duration | Load duration | Prompt eval count | Prompt eval duration | Prompt eval rate | Eval count | Eval duration | Eval rate | Model params | Model size | Model context | Ollama context | Ollama proc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
deepseek-r1:8b | 681 | 5861 | 2m12.2151384s | 1.472599s | 7026 token(s) | 7.8699678s | 892.76 tokens/s | 1540 token(s) | 2m2.8720679s | 12.53 tokens/s | 8.2B | 9.6 GB | 131072 | 16384 | 21%/79% CPU/GPU |
dolphin3:8b | 443 | 2935 | 37.8523967s | 2.7623039s | 7010 token(s) | 6.8261909s | 1026.93 tokens/s | 548 token(s) | 28.262826s | 19.39 tokens/s | 8.0B | 8.7 GB | 131072 | 16384 | 15%/85% CPU/GPU |
gemma3n:e2b | 6 | 45 | | | | | | | | | 4.5B | 5.3 GB | 32768 | 16384 | 100% GPU |
gemma3n:e4b | 846 | 6081 | 2m20.7575329s | 5.0137028s | 8832 token(s) | 1m2.1224277s | 142.17 tokens/s | 1439 token(s) | 1m13.620899s | 19.55 tokens/s | 6.9B | 8.5 GB | 32768 | 16384 | 12%/88% CPU/GPU |
hf.co/bartowski/Ministral-8B-Instruct-2410-GGUF:IQ4_XS | 586 | 4265 | 54.5143728s | 2.8870168s | 7399 token(s) | 7.3349557s | 1008.73 tokens/s | 925 token(s) | 44.2907298s | 20.88 tokens/s | 8.0B | 8.4 GB | 32768 | 16384 | 11%/89% CPU/GPU |
hf.co/bartowski/Ministral-8B-Instruct-2410-GGUF:Q4_K_M | 676 | 5077 | 1m10.820365s | 3.0225515s | 7399 token(s) | 7.7777565s | 951.30 tokens/s | 1084 token(s) | 1m0.0185233s | 18.06 tokens/s | 8.0B | 8.9 GB | 32768 | 16384 | 15%/85% CPU/GPU |
hf.co/bartowski/Ministral-8B-Instruct-2410-GGUF:Q5_K_M | 780 | 5944 | 1m46.6483658s | 3.5466358s | 7399 token(s) | 8.9233639s | 829.17 tokens/s | 1287 token(s) | 1m34.1773179s | 13.67 tokens/s | 8.0B | 9.6 GB | 32768 | 16384 | 24%/76% CPU/GPU |
hf.co/bartowski/Ministral-8B-Instruct-2410-GGUF:Q6_K | 660 | 5033 | 2m2.5748092s | 4.0715534s | 7399 token(s) | 9.9001455s | 747.36 tokens/s | 1242 token(s) | 1m48.6026064s | 11.44 tokens/s | 8.0B | 10 GB | 32768 | 16384 | 28%/72% CPU/GPU |
hf.co/bartowski/Ministral-8B-Instruct-2410-GGUF:Q6_K_L | 574 | 4766 | 2m16.7115054s | 4.3825356s | 7399 token(s) | 9.9598706s | 742.88 tokens/s | 1251 token(s) | 2m2.3685919s | 10.22 tokens/s | 8.0B | 10 GB | 32768 | 16384 | 29%/71% CPU/GPU |
mistral:7b | 396 | 2666 | 40.3021153s | 2.3259723s | 8692 token(s) | 9.1775545s | 947.09 tokens/s | 606 token(s) | 28.7965398s | 21.04 tokens/s | 7.2B | 8.3 GB | 32768 | 16384 | 11%/89% CPU/GPU |
qwen2.5vl:7b | 1998 | 14588 | 2m33.3162403s | 2.6380063s | 7044 token(s) | 5.306875s | 1327.33 tokens/s | 3967 token(s) | 2m25.370809s | 27.29 tokens/s | 8.3B | 10 GB | 128000 | 16384 | 34%/66% CPU/GPU |
qwen3:8b | 730 | 5208 | 5m48.561762s | 3.5663221s | 7034 token(s) | 8.2277962s | 854.91 tokens/s | 3674 token(s) | 5m36.766596s | 10.91 tokens/s | 8.2B | 9.6 GB | 40960 | 16384 | 23%/77% CPU/GPU |

System | |
---|---|
Ollama proc | 23%/77% CPU/GPU |
Ollama context | 16384 |
Ollama version | 0.10.1 |
Multirun timeout | 1200 seconds |
Sys arch | x86_64 |
Sys processor | unknown |
Sys memory | 11G + 19G |
Sys OS | CYGWIN_NT-10.0-22631 3.6.4-1.x86_64 |
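
The two rate columns in the results table appear to be each token count divided by its matching duration (for example, the deepseek-r1:8b row lists 1540 tokens over 2m2.8720679s and an eval rate of 12.53 tokens/s). The sketch below reproduces that arithmetic so individual rows can be checked. It assumes the durations are Go-style strings using only `h`, `m`, and `s` units, as they are in this table. `parse_duration` and `tokens_per_second` are hypothetical helper names, not part of Ollama or of the tool that produced this report.

```python
import re

# Assumption: durations look like "1.472599s", "2m2.8720679s" or "1h2m3s".
# Sub-second units such as "ms" do not occur in the table and are rejected.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?")


def parse_duration(text: str) -> float:
    """Convert a Go-style duration string to seconds (hypothetical helper)."""
    match = _DURATION.fullmatch(text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def tokens_per_second(count: int, duration: str) -> float:
    """Token count divided by elapsed seconds (hypothetical helper)."""
    return count / parse_duration(duration)


# deepseek-r1:8b row: eval count and eval duration
print(f"{tokens_per_second(1540, '2m2.8720679s'):.2f}")  # 12.53, as in the table
# deepseek-r1:8b row: prompt eval count and prompt eval duration
print(f"{tokens_per_second(7026, '7.8699678s'):.2f}")    # 892.76, as in the table
```

The gemma3n:e2b row has no timing values, so there is nothing to recompute for it.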