---
license: llama3.1
datasets:
- TheFinAI/Fino1_Reasoning_Path_FinQA
language:
- en
base_model:
- TheFinAI/Fino1-8B
tags:
- Llama
- conversational
- finance
---
# Fino1-8B Quantized Models

This repository contains Q4_KM and Q5_KM quantized versions of [TheFinAI/Fino1-8B](https://huggingface.co/TheFinAI/Fino1-8B), a financial reasoning model based on Llama 3.1 8B Instruct. These quantized variants maintain the model's financial reasoning capabilities while providing significant memory and speed improvements.

Discover our full range of quantized language models by visiting our [SandLogic Lexicon HuggingFace](https://huggingface.co/SandLogicTechnologies). To learn more about our company and services, check out our website at [SandLogic](https://www.sandlogic.com/).

## Model Details

### Base Information
- **Original Model**: Fino1-8B
- **Quantized Versions**:
  - Q4_KM (4-bit quantization)
  - Q5_KM (5-bit quantization)
- **Base Architecture**: Llama 3.1 8B Instruct
- **Primary Focus**: Financial reasoning tasks
- **Paper**: [arxiv.org/abs/2502.08127](https://arxiv.org/abs/2502.08127)

## 💰 Financial Capabilities

Both quantized versions maintain the original model's strengths in:
- Financial mathematical reasoning
- Structured financial question answering
- FinQA dataset-based problems
- Step-by-step financial calculations
- Financial document analysis

### Quantization Benefits

#### Q4_KM Version
- Model size: 4.92 GB (75% reduction)
- Optimal for resource-constrained environments
- Faster inference speed
- Suitable for rapid financial calculations

#### Q5_KM Version
- Model size: 5.73 GB (69% reduction)
- Better quality preservation
- Balanced performance-size trade-off
- Recommended for precision-critical financial applications

## 🚀 Usage

```bash
pip install llama-cpp-python
```

Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.
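
You can also fetch one of the GGUF files programmatically with `huggingface_hub` before loading it. This is a minimal sketch; the `repo_id` and `filename` shown are assumptions, so substitute the actual repository id and the quantized file (Q4_KM or Q5_KM) you want.

```python
# Minimal download sketch. The repo_id and filename below are assumptions;
# replace them with this repository's actual id and GGUF file name.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="SandLogicTechnologies/Fino1-8B-GGUF",  # assumed repository id
    filename="Fino1-8B.Q4_K_M.gguf",                # assumed Q4_KM file name
)
print(model_path)  # local path to pass to Llama(model_path=...)
```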

```python
from llama_cpp import Llama

llm = Llama(
    model_path="model/path/",  # path to the downloaded .gguf file
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,  # Uncomment to increase the context window
)

# Example of a reasoning task
output = llm(
    """Q: A company's revenue grew from $100,000 to $150,000 in one year.
Calculate the percentage growth rate. A: """,
    max_tokens=256,
    stop=["Q:", "\n\n"],
    echo=False,
)

print(output["choices"][0]["text"])
```
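
Since the base model is a Llama 3.1 Instruct fine-tune, you can also use llama-cpp-python's chat-completion interface, which applies the chat template embedded in the GGUF file (falling back to a default format if none is present). Below is a minimal sketch reusing the `llm` object from above; the system prompt and sampling parameters are illustrative, not prescribed by the original model card.

```python
# Chat-style usage sketch; prompt wording and parameters are illustrative.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a careful financial reasoning assistant."},
        {
            "role": "user",
            "content": "A company's revenue grew from $100,000 to $150,000 in one year. "
                       "What is the percentage growth rate? Show your steps.",
        },
    ],
    max_tokens=256,
    temperature=0.2,
)

print(response["choices"][0]["message"]["content"])
```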

## Training Details

### Original Model Training
- **Dataset**: TheFinAI/Fino1_Reasoning_Path_FinQA
- **Methods**: SFT (Supervised Fine-Tuning) and RF
- **Hardware**: 4xH100 GPUs
- **Configuration**:
  - Batch Size: 16
  - Learning Rate: 2e-5
  - Epochs: 3
  - Optimizer: AdamW
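
For readers who want a concrete picture of what an SFT run with the configuration above roughly looks like, here is an illustrative sketch using Hugging Face TRL. This is not the authors' training code: the per-device batch split, the dataset column names, and the formatting function are assumptions.

```python
# Illustrative SFT sketch only; not the original Fino1-8B training code.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("TheFinAI/Fino1_Reasoning_Path_FinQA", split="train")

def to_text(example):
    # Hypothetical column names; adjust to the dataset's actual schema.
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

dataset = dataset.map(to_text)

config = SFTConfig(
    output_dir="fino1-sft",
    per_device_train_batch_size=4,  # assumed split: 4 per GPU x 4 H100s = 16 total
    learning_rate=2e-5,
    num_train_epochs=3,
    optim="adamw_torch",
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model reported in the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```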