Model
  architecture        llama
  parameters          7.2B
  context length      32768
  embedding length    4096
  quantization        Q4_0

Capabilities
  completion
  vision

Projector
  architecture        clip
  parameters          311.89M
  embedding length    1024
  dimensions          768

Parameters
  stop       ""
  stop       "USER:"
  num_ctx    4096
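
As a rough illustration of how the listed parameters could be passed explicitly at request time, the sketch below builds a JSON body for Ollama's `/api/generate` endpoint. The model tag `my-vision-model` and the helper name `build_generate_payload` are hypothetical placeholders that do not come from the listing. The first `stop` value is blank in the listing, so it is left out here rather than guessed. Nothing is sent over the network; the function only assembles the payload.

```python
import base64
import json


def build_generate_payload(model, prompt, image_bytes=None):
    """Assemble a request body for Ollama's /api/generate endpoint.

    Hypothetical helper: mirrors the num_ctx and stop values from the
    listing above instead of relying on the model's stored defaults.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response, not a stream
        "options": {
            "num_ctx": 4096,    # context window from the Parameters block
            "stop": ["USER:"],  # the one non-empty stop sequence listed
        },
    }
    if image_bytes is not None:
        # Images are passed as base64-encoded strings for vision-capable models.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


if __name__ == "__main__":
    # "my-vision-model" is a placeholder tag, not the actual model name.
    body = build_generate_payload("my-vision-model", "Describe this image.", b"abc")
    print(json.dumps(body, indent=2))
```

Note that the model reports a context length of 32768 while `num_ctx` is set to 4096, so the configured default uses only part of the window the architecture supports. Raising `num_ctx` in `options` is possible in principle, at the cost of more memory.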