Virende committed
Commit f06e87f · verified · 1 Parent(s): 85dc928

Update README.md

Files changed (1)
  1. README.md +161 -3
README.md CHANGED
@@ -1,3 +1,161 @@
- ---
- license: agpl-3.0
- ---
+ ---
+ license: agpl-3.0
+ base_model:
+ - meta-llama/Llama-3.1-8B-Instruct
+ ---
+
+ # 🧠 Solphie-1S-Foundation-Model
+
+ ![Virende](https://huggingface.co/datasets/Virende/Solphie-1S-Foundation-Model-DS/resolve/main/logo.png)
+
+ [![License](https://img.shields.io/badge/license-AGPL%20v3-blue?style=flat-square)](https://www.gnu.org/licenses/agpl-3.0.html)
+
+ ## **Overview**
+
+ **Solphie-1S-Foundation-Model** is a LoRA fine-tune of Meta's Llama 3.1 8B Instruct model, built to give developers precise, context-aware help with the **Solana ecosystem**. It is **instruction-tuned** for:
+
+ - ✅ **Answering complex Solana-related queries**
+ - ✅ **Generating Solana-oriented code snippets**
+ - ✅ **Debugging smart contracts and dApps**
+ - ✅ **Explaining technical blockchain concepts clearly and in depth**
+
+ **Solphie-1S** is meant to keep **on-chain knowledge** within reach while developers build, optimize, and scale their applications.
+
+ **(Knowledge cut-off date: 29 January 2025)**
+
+ ### 🎯 **Key Features**
+ - Instruction-tuned on developer-focused Q&A for Solana workflows.
+ - Trained with **LoRA (Low-Rank Adaptation)**: only a small adapter is trained on top of the frozen base model, which keeps fine-tuning lightweight.
+ - **Retains context across multi-turn conversations** within its context window.
+ - Aims to generate **complete, executable code snippets** with practical examples.
+
+ ---
+
+ ## 🚀 **Model Card**
+
+ | **Parameter**             | **Details**                                                       |
+ |---------------------------|-------------------------------------------------------------------|
+ | **Base Model**            | Meta Llama 3.1 8B Instruct (`meta-llama/Llama-3.1-8B-Instruct`)   |
+ | **Fine-Tuning Framework** | Hugging Face Transformers, LoRA                                   |
+ | **Dataset Size**          | 13,593 high-quality Q&A pairs                                     |
+ | **Context Length**        | 4,096 tokens                                                      |
+ | **Training Steps**        | 10,000                                                            |
+ | **Learning Rate**         | 3e-4                                                              |
+ | **Batch Size**            | 1 per GPU with gradient accumulation                              |
+ | **Epochs**                | 2                                                                 |
+ | **Model Size**            | 8 billion parameters (base model); LoRA adapter ~10 MB            |
+ | **Target Tasks**          | Instruction following, code generation, debugging, multi-turn Q&A |
+
+ ---
+
+ ## 📊 **Model Architecture**
+
+ ### **Training Workflow**
+ The model was fine-tuned with **LoRA**, a parameter-efficient method, to adapt the base model to the Solana domain. The diagram below summarizes the process:
+
+ ```
+ +---------------------------+              +-------------------------+
+ |        Base Model         | --- LoRA --> |   Fine-Tuned Adapter    |
+ |       Llama 3.1 8B        |              |   Virende-8B-Instruct   |
+ +---------------------------+              +-------------------------+
+ ```
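
For readers new to LoRA, the adaptation step in the workflow can be written as a single equation. This is the standard LoRA formulation, not something stated in this README: the symbols $W_0$, $A$, $B$, $d$, and $k$ are notation introduced here, while $r = 8$ and $\alpha = 32$ are the rank and alpha values listed under the LoRA configuration.

```latex
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad W_0 \in \mathbb{R}^{d \times k},\quad
B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Only $A$ and $B$ are trained while $W_0$ stays frozen, which is why the adapter is small relative to the base model. With $r = 8$ and $\alpha = 32$, the scaling factor $\alpha / r$ is 4.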
+
+ ### **Dataset Sources**
+ The model is trained on the Virende-Novel-Instruct dataset; see [the dataset page](https://huggingface.co/datasets/Virende/Solphie-1S-Foundation-Model-DS) for details.
+
+ ---
+
+ ## 🛠️ **Installation and Usage**
+
+ ### **1. Installation**
+
+ ```bash
+ pip install torch transformers datasets peft wandb
+ ```
+
+ ### **2. Load the Model**
+
+ ```python
+ from transformers import LlamaForCausalLM, AutoTokenizer
+
+ # Hugging Face Hub repository for this model
+ model_name = "Virende/Solphie-1S-Foundation-Model"
+
+ # Download (or load from cache) the weights and the matching tokenizer
+ model = LlamaForCausalLM.from_pretrained(model_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ ```
+
+ ### **3. Run Inference**
+
+ ```python
+ import torch
+
+ def complete_chat(model, tokenizer, messages, max_new_tokens=128):
+     # Render the chat with the model's template and move the tensors to the model's device
+     inputs = tokenizer.apply_chat_template(
+         messages, return_tensors="pt", return_dict=True, add_generation_prompt=True
+     ).to(model.device)
+     # Generation does not need gradients
+     with torch.no_grad():
+         outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
+     # The decoded text contains the prompt followed by the generated reply
+     return tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+ response = complete_chat(model, tokenizer, [
+     {"role": "system", "content": "You are Virende, a helpful assistant."},
+     {"role": "user", "content": "Explain how to interact with Raydium API for token swaps."}
+ ])
+ print(response)
+ ```
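
Since the model is described as retaining context across turns, a multi-turn session amounts to resending the growing message list on every call. The sketch below is one hypothetical way to manage that history. `ChatSession`, `ask`, and the `generate_reply` callable are illustrative names, not part of this repository or of Transformers; in practice `generate_reply` would wrap a function such as `complete_chat` and return only the assistant's new text. A stand-in `echo_model` is used here so the example runs without loading the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Keeps the running message list for a multi-turn conversation (hypothetical helper)."""

    def __init__(self, system_prompt: str) -> None:
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text: str, generate_reply: Callable[[List[Message]], str]) -> str:
        # Add the new user turn, then let the model see the full history.
        self.messages.append({"role": "user", "content": user_text})
        reply = generate_reply(list(self.messages))
        # Store the assistant turn so the next question is answered in context.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_model(history: List[Message]) -> str:
    # Stand-in for a real model call: reports how many messages it received.
    return f"seen {len(history)} messages"


session = ChatSession("You are Virende, a helpful assistant.")
first = session.ask("What is a Solana program?", echo_model)
second = session.ask("How do I deploy one?", echo_model)
print(first)
print(second)
```

Keeping the history in one place also makes it straightforward to trim old turns if a conversation approaches the 4,096-token context length.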
+
+ ## 📂 **Dataset**
+
+ | Split     | Count | Description            |
+ |-----------|-------|------------------------|
+ | **Train** | 27.1k | High-quality Q&A pairs |
+
+ **Dataset Format (JSONL, one record shown expanded for readability):**
+ ```json
+ {
+   "question": "How to use the Helius API for transaction indexing?",
+   "answer": "To index transactions, use Helius's Webhooks API ...",
+   "chunk": "Helius API allows you to set up ..."
+ }
+ ```
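
To make the record layout concrete, the sketch below parses one JSONL line and turns it into chat-style messages of the kind used in the inference example. The field names come from the format shown above. The function name `record_to_messages`, the system prompt wording, and the choice to pass `chunk` as context in the system message are assumptions for illustration, not the authors' documented preprocessing.

```python
import json
from typing import Dict, List


def record_to_messages(jsonl_line: str) -> List[Dict[str, str]]:
    """Convert one JSONL record into system/user/assistant messages (illustrative only)."""
    record = json.loads(jsonl_line)
    # Assumption: the source chunk is supplied to the model as context.
    system = "You are Virende, a helpful assistant.\n\nContext: " + record["chunk"]
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": record["question"]},
        {"role": "assistant", "content": record["answer"]},
    ]


# One record serialized to a single line, as it would appear in a JSONL file.
line = json.dumps({
    "question": "How to use the Helius API for transaction indexing?",
    "answer": "To index transactions, use Helius's Webhooks API ...",
    "chunk": "Helius API allows you to set up ...",
})

messages = record_to_messages(line)
for message in messages:
    print(message["role"], "->", message["content"][:40])
```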
+
+ ---
+
+ ## 🔍 **Technical Insights**
+
+ ### **LoRA Configuration**
+ - Rank: 8
+ - Alpha: 32
+ - Dropout: 0.01
+ - Adapter Size: ~10 MB
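
As a rough sanity check on the adapter size, the sketch below counts the trainable parameters LoRA adds at rank 8. The README does not say which layers were adapted, so the choice of the `q_proj` and `v_proj` attention projections is an assumption (a common LoRA setup). The layer shapes are those of Llama 3.1 8B: 32 decoder layers, hidden size 4096, and 1024-wide key/value projections. A LoRA pair on a linear layer with input width `d_in` and output width `d_out` adds `r * (d_in + d_out)` parameters.

```python
RANK = 8          # from the LoRA configuration above
ALPHA = 32        # from the LoRA configuration above
NUM_LAYERS = 32   # Llama 3.1 8B decoder layers

# Assumed target modules and their (input, output) widths in Llama 3.1 8B.
TARGET_SHAPES = {
    "q_proj": (4096, 4096),
    "v_proj": (4096, 1024),
}


def lora_param_count(rank: int, shapes: dict, num_layers: int) -> int:
    """Parameters in the A (rank x d_in) and B (d_out x rank) matrices over all layers."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


params = lora_param_count(RANK, TARGET_SHAPES, NUM_LAYERS)
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {params:,}")
print(f"size at 2 bytes/param (FP16): {params * 2 / 1e6:.1f} MB")
print(f"size at 4 bytes/param (FP32): {params * 4 / 1e6:.1f} MB")
print(f"update scaling alpha / r: {scaling}")
```

Under these assumptions the adapter holds about 3.4 million parameters, or roughly 7 MB to 14 MB depending on storage precision, which is the same order of magnitude as the ~10 MB quoted above. Other target modules would give a different figure.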
+
+ ### **Optimization**
+ - Mixed precision (FP16) for faster inference.
+ - Gradient accumulation for memory efficiency.
+ - Parameter-efficient tuning to preserve base model knowledge.
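
Gradient accumulation is what lets a batch size of 1 per GPU behave like a larger batch: gradients from several micro-batches are summed before a single optimizer step. The pure-Python sketch below illustrates the idea on a one-parameter linear model. The data, the accumulation count, the learning rate, and all names are made up for illustration and are unrelated to the actual training run.

```python
def mse_gradient(w: float, batch: list) -> float:
    """Gradient of mean squared error for the model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5
accumulation_steps = len(data)  # micro-batch size 1, one example per step

# Reference: gradient computed on the whole batch at once.
full_batch_grad = mse_gradient(w, data)

# Accumulated: one example at a time, each contribution scaled by 1 / steps.
accumulated_grad = 0.0
for example in data:
    accumulated_grad += mse_gradient(w, [example]) / accumulation_steps

# A single optimizer step is taken only after all micro-batches are processed.
learning_rate = 0.01
w_updated = w - learning_rate * accumulated_grad

print(full_batch_grad, accumulated_grad, w_updated)
```

The two gradients agree up to floating-point rounding, so memory use is that of a single example while the update matches the larger batch.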
+
+ ---
+
+ ## 🙌 **Contributing**
+
+ We welcome contributions that improve the Solphie-1S Foundation Model. Feel free to:
+ - Share your feedback on the Hugging Face Model Hub.
+
+ ---
+
+ ## 📜 **License**
+
+ This model is licensed under the **GNU Affero General Public License v3.0 (AGPLv3)**.
+
+ ---
+
+ ## 📞 **Community**
+
+ For questions or support, reach out via:
+ - **Twitter**: [SolphieAI](https://x.com/SolphieAI)
+
+ ---
+
+ ## 🤝 **Acknowledgments**
+
+ Special thanks to the Solana ecosystem developers and the open-source community for their contributions and support.