Sasak-translite-v1
🌟 Model Description
- Developed by: Tanwir
- Languages: Indonesian and Sasak
Sasak is a regional language spoken by the Sasak people on the island of Lombok, West Nusa Tenggara. It has several main dialects, such as Ngeno-Ngene, Meno-Mene, and Ngeto-Ngete, which reflect the cultural and geographic diversity of its speakers. The structure of Sasak is influenced by Balinese and Malay, but its vocabulary, pronunciation, and grammar are distinctive. The language is used in everyday communication, traditional ceremonies, and oral literature such as tembang and pepaosan, which makes it an important part of the cultural identity of the people of Lombok.
- Sasak to Indonesian translation: translates Sasak text into Indonesian
- Indonesian to Sasak translation: translates Indonesian text into Sasak
📊 Training
📊 Model Specifications
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 8 |
| LoRA Alpha | 16 |
| LoRA+ LR Ratio | 8 |
| Sequence Length | 512 tokens |
| Training Epochs | 3 |
| Learning Rate | 5e-5 |
| Batch Size | 2 (micro) × 4 (gradient accumulation) |
| Optimizer | AdamW 8-bit |
| Precision | bfloat16 |
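
The table values imply a few derived quantities that are not listed explicitly. The following is a minimal sketch of that arithmetic, not the author's training script. It assumes a single training device (the card does not state the device count) and assumes the listed learning rate applies to the LoRA A matrices, with LoRA+ multiplying it by the ratio for the B matrices. The variable names are illustrative.

```python
# Values copied from the specification table.
lora_rank = 8
lora_alpha = 16
loraplus_lr_ratio = 8
learning_rate = 5e-5
micro_batch_size = 2
gradient_accumulation_steps = 4
num_devices = 1  # assumption: the card does not state the device count

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# LoRA+ trains the B matrices with a larger learning rate than the A matrices.
# Assumption: the table's learning rate is the one applied to the A matrices.
lr_b = learning_rate * loraplus_lr_ratio

# Number of sequences consumed per optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices

print(f"LoRA scaling factor: {lora_scaling}")
print(f"LoRA+ learning rate for B matrices: {lr_b:.1e}")
print(f"Effective batch size: {effective_batch_size}")
```

Under these assumptions the update is scaled by 2, the B matrices train at 4e-4, and each optimizer step sees 8 sequences.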
🛠️ Usage Examples
1. Indonesian to Sasak Translation
Make sure your transformers installation is up to date: pip install --upgrade transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
# NOTE: this ID names a Sanskrit translation model; replace it with the Hub ID of Sasak-translite-v1.
model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Prepare the conversation
messages = [
    {
        "content": "You are a Sasak language translation expert. Translate the given Indonesian text into the Sasak language.",
        "role": "system"
    },
    {
        "content": "Translate this Indonesian text to Sasak: aturan",
        "role": "user"
    }
]
# Apply the chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Prints the model's Sasak translation of the input.
2. Sasak to Indonesian Translation
Make sure your transformers installation is up to date: pip install --upgrade transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
# NOTE: this ID names a Sanskrit translation model; replace it with the Hub ID of Sasak-translite-v1.
model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Prepare the conversation
messages = [
    {
        "content": "You are an Indonesian translation expert. Translate the given Sasak text into Indonesian language.",
        "role": "system"
    },
    {
        "content": "Translate this Sasak text to Indonesian: titi tate",
        "role": "user"
    }
]
# Apply the chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Prints the model's Indonesian translation of the input.
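
The two examples above differ only in their system and user prompts, so the message construction can be shared. The following is a minimal sketch of one way to do that. The helper `build_messages` and the direction keys `"id-sas"` and `"sas-id"` are hypothetical names introduced for illustration and are not part of the model or of transformers. The prompt strings are copied verbatim from the examples.

```python
# Hypothetical helper (not part of the model's API): builds the chat messages
# for either translation direction using the prompts from the usage examples.
SYSTEM_PROMPTS = {
    "id-sas": "You are a Sasak language translation expert. Translate the given Indonesian text into the Sasak language.",
    "sas-id": "You are an Indonesian translation expert. Translate the given Sasak text into Indonesian language.",
}
USER_TEMPLATES = {
    "id-sas": "Translate this Indonesian text to Sasak: {text}",
    "sas-id": "Translate this Sasak text to Indonesian: {text}",
}


def build_messages(text, direction):
    """Return the system and user messages for the requested direction."""
    if direction not in SYSTEM_PROMPTS:
        raise ValueError(f"Unknown direction: {direction!r}")
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[direction]},
        {"role": "user", "content": USER_TEMPLATES[direction].format(text=text)},
    ]


messages = build_messages("aturan", "id-sas")
print(messages[1]["content"])
# Prints: Translate this Indonesian text to Sasak: aturan
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as in the examples, which keeps the prompts identical across calls.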
