---
license: llama3.1
datasets:
- TheFinAI/Fino1_Reasoning_Path_FinQA
language:
- en
base_model:
- TheFinAI/Fino1-8B
tags:
- Llama
- conversational
- finance
---
# Fino1-8B Quantized Models

This repository contains Q4_KM and Q5_KM quantized versions of [TheFinAI/Fino1-8B](https://huggingface.co/TheFinAI/Fino1-8B), a financial reasoning model based on Llama 3.1 8B Instruct. These quantized variants preserve the model's financial reasoning capabilities while substantially reducing memory footprint and improving inference speed.

Discover our full range of quantized language models on [SandLogic Lexicon HuggingFace](https://huggingface.co/SandLogicTechnologies). To learn more about our company and services, visit [SandLogic](https://www.sandlogic.com/).
## Model Details

### Base Information
- **Original Model**: Fino1-8B
- **Quantized Versions**:
  - Q4_KM (4-bit quantization)
  - Q5_KM (5-bit quantization)
- **Base Architecture**: Llama 3.1 8B Instruct
- **Primary Focus**: Financial reasoning tasks
- **Paper**: [arxiv.org/abs/2502.08127](https://arxiv.org/abs/2502.08127)
## 💰 Financial Capabilities

Both quantized versions maintain the original model's strengths in:
- Financial mathematical reasoning
- Structured financial question answering
- FinQA dataset-based problems
- Step-by-step financial calculations
- Financial document analysis
### Quantization Benefits

#### Q4_KM Version
- Model size: 4.92 GB (75% reduction)
- Optimal for resource-constrained environments
- Faster inference speed
- Suitable for rapid financial calculations

#### Q5_KM Version
- Model size: 5.73 GB (69% reduction)
- Better quality preservation
- Balanced performance-size trade-off
- Recommended for precision-critical financial applications
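As a rough sanity check on the file sizes above, the effective bits stored per weight can be derived from file size and parameter count. This is a sketch only: the ~8.03B parameter count for Llama 3.1 8B is an assumption of this example, not stated in this card.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Effective bits stored per model weight, given a GGUF file size in GB."""
    return file_size_gb * 1e9 * 8 / n_params

N_PARAMS = 8.03e9  # approximate parameter count of Llama 3.1 8B (assumption)

print(round(bits_per_weight(4.92, N_PARAMS), 1))  # Q4_KM -> 4.9
print(round(bits_per_weight(5.73, N_PARAMS), 1))  # Q5_KM -> 5.7
```

The K-quant formats mix quantization levels across tensors, so the effective bits per weight sit slightly above the nominal 4 and 5 bits.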
## 🚀 Usage

Install [llama-cpp-python](https://pypi.org/project/llama-cpp-python/):

```bash
pip install llama-cpp-python
```

Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.
```python
from llama_cpp import Llama

llm = Llama(
    model_path="model/path/",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

# Example of a financial reasoning task
output = llm(
    """Q: A company's revenue grew from $100,000 to $150,000 in one year.
Calculate the percentage growth rate. A: """,
    max_tokens=256,
    stop=["Q:", "\n\n"],
    echo=False,
)

print(output["choices"][0]["text"])
```
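For the prompt above, the expected answer can be checked with plain arithmetic: percentage growth = (new − old) / old × 100, which is 50% here.

```python
def growth_rate(old: float, new: float) -> float:
    """Percentage growth from an old value to a new value."""
    return (new - old) / old * 100

print(growth_rate(100_000, 150_000))  # 50.0
```

This makes a convenient spot check when comparing the Q4_KM and Q5_KM variants on simple numeric questions.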
## Training Details

### Original Model Training
- **Dataset**: TheFinAI/Fino1_Reasoning_Path_FinQA
- **Methods**: SFT (Supervised Fine-Tuning) and RF
- **Hardware**: 4x H100 GPUs
- **Configuration**:
  - Batch Size: 16
  - Learning Rate: 2e-5
  - Epochs: 3
  - Optimizer: AdamW
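For reference, the hyperparameters listed above can be collected into a single configuration mapping. This is a sketch only: the original training scripts are not part of this repository, and whether the batch size is per-device or global is not stated in the card.

```python
# Hyperparameters as listed in the original model card (illustrative grouping)
training_config = {
    "dataset": "TheFinAI/Fino1_Reasoning_Path_FinQA",
    "batch_size": 16,          # per-device vs. global is an assumption
    "learning_rate": 2e-5,
    "num_epochs": 3,
    "optimizer": "AdamW",
}

print(training_config["learning_rate"])  # 2e-05
```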