  Model
    architecture        llama
    parameters          8.0B
    context length      8192
    embedding length    4096
    quantization        Q4_K_M

  Capabilities
    completion
    vision

  Projector
    architecture        clip
    parameters          311.89M
    embedding length    1024
    dimensions          768

  Parameters
    num_ctx     4096
    num_keep    4
    stop        "<|start_header_id|>"
    stop        "<|end_header_id|>"
    stop        "<|eot_id|>"
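
The Parameters values above (num_ctx, num_keep, and the three stop strings) can be sent as per-request options. Note that num_ctx is set to 4096, half of the model's 8192 context length. The following is a minimal sketch, assuming the listing comes from an Ollama-style runtime whose generate endpoint accepts these names under an `options` object; the source does not name the runtime or the model, so `example-vision-model` and the helper `build_generate_payload` are hypothetical. It only builds the request body and does not contact a server.

```python
import json


def build_generate_payload(model: str, prompt: str) -> dict:
    """Build a JSON-serializable body for a non-streaming generate request.

    The option values are copied from the Parameters section of the listing.
    The overall body shape (model / prompt / stream / options) is an
    assumption about the runtime's API, not something the listing states.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "num_ctx": 4096,   # context window used at run time
            "num_keep": 4,     # tokens of the initial prompt kept on overflow
            "stop": [          # generation halts when any of these appears
                "<|start_header_id|>",
                "<|end_header_id|>",
                "<|eot_id|>",
            ],
        },
    }


if __name__ == "__main__":
    # "example-vision-model" is a placeholder, not a name from the listing.
    payload = build_generate_payload("example-vision-model", "Describe the scene.")
    print(json.dumps(payload, indent=2))
```

Keeping the options in one function makes it easy to compare what is actually sent against the listing, for example when raising num_ctx toward the 8192 limit.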