---
license: apache-2.0
language:
- en
tags:
- llm
- fine-tune
- qlora
- llama
- bitcoin
- finance
pipeline_tag: text-generation
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tahamajs/bitcoin-llm-finetuning-dataset
---

### Overview

This model, `llama-3.2-3b-instruct-bitcoin-analyst_best`, is a fine-tuned version of the **Llama-3.2-3B-Instruct** large language model, specialized for **Bitcoin and cryptocurrency analysis**. The goal of the fine-tuning was to enhance the model's ability to provide detailed, accurate, and contextually relevant information about Bitcoin, blockchain technology, market trends, and related topics, acting as a virtual Bitcoin analyst.

The fine-tuning was performed using **QLoRA** on the `tahamajs/bitcoin-llm-finetuning-dataset` dataset.
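
For reference, the training corpus can be pulled directly from the Hugging Face Hub with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption, so inspect the loaded dataset to confirm its splits and fields.

```python
from datasets import load_dataset

# Load the fine-tuning corpus from the Hub (a "train" split is assumed here)
dataset = load_dataset("tahamajs/bitcoin-llm-finetuning-dataset", split="train")

# Inspect the first record to see which fields the examples contain
print(dataset[0])
```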

### Usage

You can use this model with the `transformers` and `peft` libraries. The fine-tuned weights are stored as a PEFT (LoRA) adapter that is loaded on top of the base model.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the fine-tuned adapter on top of the base model
peft_model_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best"
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Example inference using the Llama 3.2 chat template
prompt = "What are the key differences between Bitcoin and Ethereum?"
messages = [
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids=input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
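
If a standalone checkpoint is more convenient (for example, for deployment or conversion to another format), the adapter can be merged into the base weights with PEFT's `merge_and_unload`. A minimal sketch, assuming `model` and `tokenizer` from the snippet above; the output directory name is hypothetical:

```python
# Fold the LoRA adapter into the base weights so PEFT is no longer needed at
# inference time, then save the merged model and tokenizer to a local folder.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama-3.2-3b-bitcoin-analyst-merged")  # hypothetical path
tokenizer.save_pretrained("llama-3.2-3b-bitcoin-analyst-merged")
```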

### Training Details

This section provides an overview of the fine-tuning process.

* **Base Model:** `meta-llama/Llama-3.2-3B-Instruct`
* **Dataset:** `tahamajs/bitcoin-llm-finetuning-dataset`
* **Fine-Tuning Method:** QLoRA (Quantized Low-Rank Adaptation); see the loading sketch after this list
* **Training Framework:** `trl.SFTTrainer`
* **Hardware:** [E.g., NVIDIA RTX 4070, 16GB VRAM]
* **Software Stack:** PyTorch, Transformers, TRL, PEFT, BitsAndBytes
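
To illustrate the QLoRA setup, the frozen base model would be loaded in 4-bit precision with `bitsandbytes` before the LoRA adapter is attached. This is a hedged sketch rather than the exact training script: the NF4 quantization type, double quantization, and bfloat16 compute dtype are common defaults assumed here, not values reported in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization of the frozen base weights (the "Q" in QLoRA);
# NF4 with double quantization is an assumed, commonly used default.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```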

#### Hyperparameters

The following hyperparameters were used for fine-tuning:

| Hyperparameter                  | Value               |
| :------------------------------ | :------------------ |
| `num_train_epochs`              | 1                   |
| `per_device_train_batch_size`   | 1                   |
| `gradient_accumulation_steps`   | 2                   |
| `learning_rate`                 | 2e-4                |
| `optim`                         | `paged_adamw_32bit` |
| `bf16`                          | `True`              |
| `max_grad_norm`                 | 0.3                 |
| `r` (LoRA rank)                 | 16                  |
| `lora_alpha`                    | 16                  |
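
The values above map onto `peft.LoraConfig` and `trl.SFTConfig` roughly as in the sketch below. This is an illustrative reconstruction, not the original training script: `target_modules`, `lora_dropout`, the dataset split, the text column handling, and the `output_dir` are assumptions, and the exact `SFTTrainer` arguments vary between TRL versions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit base model, as in the QLoRA loading sketch above
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# LoRA settings from the table above; target modules and dropout are assumptions
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Trainer arguments from the table above; output_dir is a hypothetical path
training_args = SFTConfig(
    output_dir="llama-3.2-3b-bitcoin-analyst",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",
    bf16=True,
    max_grad_norm=0.3,
)

# Assumes a "train" split with a text or chat-style column that SFTTrainer can format
dataset = load_dataset("tahamajs/bitcoin-llm-finetuning-dataset", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```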

### Limitations and Biases

As a model fine-tuned on a specific dataset, it may have the following limitations:

* **Domain Specificity:** The model's knowledge is primarily focused on Bitcoin and cryptocurrency. It may perform less effectively on general knowledge tasks.
* **Data Cutoff:** The model's knowledge is limited to the data it was trained on. It may not be aware of events, market changes, or new developments that occurred after the dataset's creation.
* **Potential Biases:** The model's responses may reflect biases present in the training data.

### License

The fine-tuned adapter weights are released under the Apache 2.0 license. The base model, `meta-llama/Llama-3.2-3B-Instruct`, is distributed under the Llama 3.2 Community License, which continues to govern use of the base model weights.