# Gemma-2B Fine-tuned on Atlaset
This model is a fine-tuned version of google/gemma-2b on the Atlaset dataset. Fine-tuning adapts the base model to the domain-specific knowledge covered by the Atlaset corpus.
## Model Details
- Base Model: google/gemma-2b
- Fine-tuning Method: Low-Rank Adaptation (LoRA)
- Training Hardware: 2x T4 GPUs on Kaggle
- Context Length: 256 tokens
- Parameters: 2B (base model) + LoRA parameters
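
Because the fine-tuning used LoRA, a repository like this can ship either merged weights or a standalone adapter. The usage example below loads the repo directly, which assumes merged weights; if the repo instead contains only a peft adapter checkpoint (an assumption, not confirmed by this card), a minimal sketch of loading it on top of the base model:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter from the fine-tuned repo
base = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
model = PeftModel.from_pretrained(base, "Yamemaru/gemma-2b-finetuned-atlaset")

# Optional: fold the adapter into the base weights for plain inference
model = model.merge_and_unload()
```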
## Training Details
- LoRA Configuration:
  - Rank: 16
  - Alpha: 32
  - Dropout: 0.05
  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Steps: 5000
- Batch Size: 4 per device
- Learning Rate: 3e-4
- Weight Decay: 0.01
- Optimizer: AdamW
- Precision: bfloat16
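
A minimal sketch of how these hyperparameters map onto a peft/transformers training setup. The dataset handling and trainer wiring are omitted, and this is an illustrative reconstruction, not the author's actual training script:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Base model to be adapted
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# LoRA configuration mirroring the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA parameters are trainable

# Optimizer and schedule settings listed above (AdamW, bfloat16)
training_args = TrainingArguments(
    output_dir="gemma-2b-finetuned-atlaset",  # hypothetical output path
    max_steps=5000,
    per_device_train_batch_size=4,
    learning_rate=3e-4,
    weight_decay=0.01,
    optim="adamw_torch",
    bf16=True,
)
```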
## Performance
This model shows improved performance on tasks related to the domains covered in the Atlaset dataset, with particular strength in:
- Knowledge-intensive tasks
- Context-aware reasoning
- Structured response generation
## Usage Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Yamemaru/gemma-2b-finetuned-atlaset")
model = AutoModelForCausalLM.from_pretrained("Yamemaru/gemma-2b-finetuned-atlaset")

# Tokenize input
input_text = "Write a summary about machine learning"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate text (pass the full encoding so the attention mask is used)
outputs = model.generate(
    **inputs,
    max_length=512,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    do_sample=True,
)

# Decode and print the response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
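
Since training used bfloat16, loading the weights in the same precision on a GPU can reduce memory use. A minimal sketch, assuming a CUDA-capable device (`torch_dtype` and `device_map` are standard `from_pretrained` arguments):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Yamemaru/gemma-2b-finetuned-atlaset")
model = AutoModelForCausalLM.from_pretrained(
    "Yamemaru/gemma-2b-finetuned-atlaset",
    torch_dtype=torch.bfloat16,  # matches the bfloat16 training precision
    device_map="auto",           # place weights on the available GPU(s)
)

inputs = tokenizer("Write a summary about machine learning",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                         temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```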