---
license: llama3.1
datasets:
- TheFinAI/Fino1_Reasoning_Path_FinQA
language:
- en
base_model:
- TheFinAI/Fino1-8B
tags:
- Llama
- conversational
- finance
---
# Fino1-8B Quantized Models

This repository contains Q4_KM and Q5_KM quantized versions of [TheFinAI/Fino1-8B](https://huggingface.co/TheFinAI/Fino1-8B), a financial reasoning model based on Llama 3.1 8B Instruct. These quantized variants preserve the model's financial reasoning capabilities while substantially reducing memory footprint and improving inference speed.

Discover our full range of quantized language models on [SandLogic Lexicon HuggingFace](https://huggingface.co/SandLogicTechnologies). To learn more about our company and services, visit [SandLogic](https://www.sandlogic.com/).
## Model Details

### Base Information
- **Original Model**: Fino1-8B
- **Quantized Versions**:
  - Q4_KM (4-bit quantization)
  - Q5_KM (5-bit quantization)
- **Base Architecture**: Llama 3.1 8B Instruct
- **Primary Focus**: Financial reasoning tasks
- **Paper**: [arxiv.org/abs/2502.08127](https://arxiv.org/abs/2502.08127)
## 💰 Financial Capabilities

Both quantized versions maintain the original model's strengths in:
- Financial mathematical reasoning
- Structured financial question answering
- FinQA dataset-based problems
- Step-by-step financial calculations
- Financial document analysis
### Quantization Benefits

#### Q4_KM Version
- Model size: 4.92 GB (75% reduction)
- Optimal for resource-constrained environments
- Faster inference speed
- Suitable for rapid financial calculations

#### Q5_KM Version
- Model size: 5.73 GB (69% reduction)
- Better quality preservation
- Balanced performance-size trade-off
- Recommended for precision-critical financial applications
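As a rough sanity check on the file sizes above, the effective bits stored per weight can be derived from file size and parameter count. This is a sketch only: the ~8.03B parameter count for Llama 3.1 8B is an assumption of this example, not stated in this card.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Effective bits stored per model weight, given a GGUF file size in GB."""
    return file_size_gb * 1e9 * 8 / n_params

N_PARAMS = 8.03e9  # approximate parameter count of Llama 3.1 8B (assumption)

print(round(bits_per_weight(4.92, N_PARAMS), 1))  # Q4_KM -> 4.9
print(round(bits_per_weight(5.73, N_PARAMS), 1))  # Q5_KM -> 5.7
```

The K-quant formats mix quantization levels across tensors, so the effective bits per weight sit slightly above the nominal 4 and 5 bits.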
## 🚀 Usage

Install [llama-cpp-python](https://pypi.org/project/llama-cpp-python/):

```bash
pip install llama-cpp-python
```

Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.
```python
from llama_cpp import Llama

llm = Llama(
    model_path="model/path/",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

# Example of a financial reasoning task
output = llm(
    """Q: A company's revenue grew from $100,000 to $150,000 in one year.
Calculate the percentage growth rate. A: """,
    max_tokens=256,
    stop=["Q:", "\n\n"],
    echo=False,
)

print(output["choices"][0]["text"])
```
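For the prompt above, the expected answer can be checked with plain arithmetic: percentage growth = (new − old) / old × 100, which is 50% here.

```python
def growth_rate(old: float, new: float) -> float:
    """Percentage growth from an old value to a new value."""
    return (new - old) / old * 100

print(growth_rate(100_000, 150_000))  # 50.0
```

This makes a convenient spot check when comparing the Q4_KM and Q5_KM variants on simple numeric questions.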
## Training Details

### Original Model Training
- **Dataset**: TheFinAI/Fino1_Reasoning_Path_FinQA
- **Methods**: SFT (Supervised Fine-Tuning) and RF
- **Hardware**: 4x H100 GPUs
- **Configuration**:
  - Batch Size: 16
  - Learning Rate: 2e-5
  - Epochs: 3
  - Optimizer: AdamW
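For reference, the hyperparameters listed above can be collected into a single configuration mapping. This is a sketch only: the original training scripts are not part of this repository, and whether the batch size is per-device or global is not stated in the card.

```python
# Hyperparameters as listed in the original model card (illustrative grouping)
training_config = {
    "dataset": "TheFinAI/Fino1_Reasoning_Path_FinQA",
    "batch_size": 16,          # per-device vs. global is an assumption
    "learning_rate": 2e-5,
    "num_epochs": 3,
    "optimizer": "AdamW",
}

print(training_config["learning_rate"])  # 2e-05
```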