🧠 Solphie-1S-Foundation-Model
Overview
The Solphie-1S-Foundation-Model is a fine-tuned adaptation of Meta's Llama 3.1 8B model, purpose-built to deliver precise, context-aware assistance for developers working in the Solana ecosystem. Instruction-tuned on a curated Q&A dataset, the model excels at:
✅ Answering complex Solana-related queries
✅ Generating high-quality, Solana-optimized code snippets
✅ Debugging smart contracts and dApps
✅ Explaining technical blockchain concepts with clarity and depth
Designed to bridge AI intelligence with blockchain development, Solphie-1S empowers developers to build, optimize, and scale with on-chain knowledge at their fingertips.
(Knowledge cutoff: January 29, 2025)
🎯 Key Features
- Fine-tuned with developer-first instruction tuning, optimized for Solana workflows.
- Efficient and lightweight via LoRA (Low-Rank Adaptation), ensuring scalable fine-tuning.
- Retains context across multi-turn conversations, enabling seamless AI-assisted development.
- Generates complete, executable code snippets with practical real-world examples.
🚀 Model Card
| Parameter | Details |
|---|---|
| Base Model | Meta Llama 3.1 8B |
| Fine-Tuning Framework | HuggingFace Transformers, LoRA |
| Dataset Size | 13,593 high-quality Q&A pairs |
| Context Length | 4,096 tokens |
| Training Steps | 10,000 |
| Learning Rate | 3e-4 |
| Batch Size | 1 per GPU with gradient accumulation |
| Epochs | 2 |
| Model Size | 8 billion parameters (adapter size ~10 MB) |
| Fine-Tuned Tasks | Instruction following, code generation, debugging, multi-turn Q&A |
📊 Model Architecture
Training Workflow
The model was fine-tuned using parameter-efficient methods with LoRA to adapt to the Solana-specific domain. Below is a visualization of the training process:
```
+---------------------------+              +-----------------------------+
|        Base Model         | --- LoRA --> |     Fine-Tuned Adapter      |
|       Llama 3.1 8B        |              | Solphie-1S-Foundation-Model |
+---------------------------+              +-----------------------------+
```
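The adapter step pictured above corresponds to wrapping the base model with HuggingFace PEFT. A minimal sketch, assuming the public `meta-llama/Llama-3.1-8B` checkpoint as the base and typical Llama attention projections as target modules (neither is confirmed training configuration):

```python
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

# Base checkpoint id is an assumption for illustration.
base = LlamaForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# LoRA settings taken from the Technical Insights section below;
# target_modules is an assumption (a common choice for Llama-family models).
lora = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.01,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the small adapter is trainable
```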
Dataset Sources
The model is fine-tuned on the Virende-Novel-Instruct dataset; see its dataset page for more details.
🛠️ Installation and Usage
1. Installation
```bash
pip install torch transformers datasets peft wandb
```
2. Load the Model
```python
from transformers import LlamaForCausalLM, AutoTokenizer

model_name = "Virende/Solphie-1S-Foundation-Model"
model = LlamaForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
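For GPU inference, the weights can be loaded in half precision so the 8B model fits in memory. A minimal sketch, assuming the `accelerate` package is installed for `device_map` support:

```python
import torch
from transformers import LlamaForCausalLM

# FP16 matches the mixed-precision setting noted under Optimization below.
model = LlamaForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)
```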
3. Run Inference
```python
import torch

def complete_chat(model, tokenizer, messages, max_new_tokens=128):
    inputs = tokenizer.apply_chat_template(
        messages, return_tensors="pt", return_dict=True, add_generation_prompt=True
    ).to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

response = complete_chat(model, tokenizer, [
    {"role": "system", "content": "You are Virende, a helpful assistant."},
    {"role": "user", "content": "Explain how to interact with Raydium API for token swaps."},
])
print(response)
```
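Since the model retains context across multi-turn conversations, a follow-up question can reuse the same `complete_chat` helper by extending the message list. A sketch (the follow-up prompt is illustrative):

```python
# Append the assistant's reply, then ask a follow-up in the same conversation.
messages = [
    {"role": "system", "content": "You are Virende, a helpful assistant."},
    {"role": "user", "content": "Explain how to interact with Raydium API for token swaps."},
    {"role": "assistant", "content": response},
    {"role": "user", "content": "Show that flow as a code example."},
]
print(complete_chat(model, tokenizer, messages, max_new_tokens=256))
```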
📂 Dataset
| Split | Count | Description |
|---|---|---|
| Train | 13.6k | High-quality Q&A pairs |
Dataset Format (JSONL):
```json
{
  "question": "How to ...",
  "answer": "...",
  "think": "..."
}
```
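Records in this format can be converted into the chat layout used at inference time with the `datasets` library. A minimal sketch, assuming a local file named `train.jsonl` (the released dataset's file layout may differ):

```python
from datasets import load_dataset

# File name is an assumption for illustration.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

def to_messages(example):
    # Map one Q&A record to the chat format expected by apply_chat_template.
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

dataset = dataset.map(to_messages)
```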
🔍 Technical Insights
LoRA Configuration
- Rank: 8
- Alpha: 32
- Dropout: 0.01
- Adapter Size: ~10 MB
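The ~10 MB figure can be sanity-checked from the trainable parameter count. A sketch, assuming the PEFT-wrapped `model` from the training sketch above and FP16 storage:

```python
# Count only LoRA parameters (the base weights are frozen).
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"LoRA parameters: {trainable:,}")
print(f"Approx. adapter size: {trainable * 2 / 1024**2:.1f} MB")  # 2 bytes/param in FP16
```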
Optimization
- Mixed precision (FP16) for faster training and inference.
- Gradient Accumulation for memory efficiency.
- Parameter-efficient tuning to preserve base model knowledge.
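These settings map onto standard `transformers.TrainingArguments`; the sketch below combines them with the hyperparameters from the Model Card (the accumulation step count and logging interval are assumptions, not documented values):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="solphie-1s-lora",
    per_device_train_batch_size=1,   # batch size 1 per GPU (Model Card)
    gradient_accumulation_steps=16,  # assumed; increases the effective batch size
    learning_rate=3e-4,              # Model Card
    num_train_epochs=2,              # Model Card
    fp16=True,                       # mixed-precision training
    logging_steps=50,                # assumed
    report_to="wandb",               # wandb is listed in the install step
)
```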
🙌 Contributing
We welcome contributions to enhance the Solphie-1S Foundation Model. Feel free to:
- Share your feedback on the HuggingFace Model Hub.
📜 License
This model is licensed under the GNU Affero General Public License v3.0 (AGPLv3).
📞 Community
For questions or support, reach out via:
- Twitter: SolphieAI
🤝 Acknowledgments
Special thanks to the Solana ecosystem developers and the open-source community for their invaluable contributions and support.