---
language:
- en
tags:
- llama.cpp
- gguf
- aya
- cohere
- quantized
library_name: llama.cpp
pipeline_tag: text-generation
license: apache-2.0
---
# Aya Sl Biz 8B

This is a GGUF-format quantized version of a fine-tuned CohereForAI/aya-23-8B model.
## Model Details

- **Original Model:** CohereForAI/aya-23-8B
- **Quantization Type:** Q4_K_M
- **Format:** GGUF
- **Conversion Date:** 2024-10-31
- **Framework:** llama.cpp
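If the GGUF file is published on the Hugging Face Hub, it can be fetched with `huggingface-cli`. The repository ID and filename below are placeholders, not the actual ones for this model:

```bash
# Placeholder repo ID and filename; substitute the actual ones for this model.
huggingface-cli download your-username/aya-sl-biz-8b-gguf \
  aya-sl-biz-8b-Q4_K_M.gguf --local-dir ./models
```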
## Usage

This model can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp). Here's how to use it:

```bash
# Basic usage
./llama-cli -m path_to_model.gguf -n 512 --prompt "Your prompt here"

# Chat format
./llama-cli -m path_to_model.gguf --temp 0.7 --repeat-penalty 1.2 -n 512 --prompt "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are Command-R, a helpful AI assistant.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Your prompt here<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
```
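Recent llama.cpp builds also ship `llama-server`, which exposes an OpenAI-compatible HTTP endpoint. A minimal sketch, assuming the binary was built alongside `llama-cli`:

```bash
# Serve the model over HTTP (OpenAI-compatible API plus a simple web UI).
./llama-server -m path_to_model.gguf -c 4096 --host 127.0.0.1 --port 8080

# Query the chat completions endpoint from another terminal.
curl http://127.0.0.1:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Your prompt here"}]}'
```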
## Quantization Details

This model was quantized with llama.cpp's quantization tools using the Q4_K_M format, which offers a good balance between model size and output quality.

- **Original model size:** ~16 GB
- **Quantized model size:** ~4.7 GB
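A rough sketch of how such a conversion is typically done with llama.cpp's tooling; the exact script and binary names vary by llama.cpp version, and the paths and filenames here are placeholders:

```bash
# 1. Convert the fine-tuned HF checkpoint to a full-precision GGUF file.
#    (Script name in recent llama.cpp trees; older trees use convert-hf-to-gguf.py.)
python convert_hf_to_gguf.py /path/to/finetuned-aya-23-8b \
  --outfile aya-sl-biz-8b-f16.gguf

# 2. Quantize the full-precision GGUF file down to Q4_K_M.
./llama-quantize aya-sl-biz-8b-f16.gguf aya-sl-biz-8b-Q4_K_M.gguf Q4_K_M
```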
## License

This model is released under the Apache 2.0 license.