Bangla-Llama-3.2-3B-Instruct-QA-v2

Bengali Question-Answering Model | Fine-tuned on Llama-3.2 Architecture | Version 2

Model Description

This model is optimized for question answering in Bengali. It is fine-tuned from the Llama-3.2-3B Instruct model using Unsloth. It was trained on a context-aware instruction dataset and is intended to answer a question accurately from a supplied context passage.

How to Use

Required Libraries

pip install transformers torch accelerate

Code Example

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Kowshik24/Bangla-llama-3.2-3B-Instruct-QA-v2"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

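# Set up the prompts: the system message carries the context passage,
# the user message carries the question (both in Bengali).
# Context (English): "On 21 February 1952, students of Dhaka University protested,
# demanding that Bangla be recognized as a state language of Pakistan. Rafiq, Salam,
# Barkat, and many others were martyred by police gunfire. As a result of this
# movement, Bangla gained state-language status in 1956, and UNESCO later declared
# 21 February International Mother Language Day in 1999."
# Question (English): "On which date is Language Movement Day observed?"
messages = [
    {
        "role": "system",
        "content": "১৯৫২ সালের ২১ ফেব্রুয়ারি বাংলা ভাষাকে পাকিস্তানের রাষ্ট্রভাষা হিসেবে স্বীকৃতি দেওয়ার দাবিতে ঢাকা বিশ্ববিদ্যালয়ের ছাত্ররা বিক্ষোভ করে। পুলিশের গুলিতে শহিদ হন রফিক, সালাম, বরকতসহ অনেকে। এই আন্দোলনের ফলস্বরূপ ১৯৫৬ সালে বাংলা রাষ্ট্রভাষার মর্যাদা পায় এবং পরবর্তীতে UNESCO ১৯৯৯ সালে ২১ ফেব্রুয়ারিকে আন্তর্জাতিক মাতৃভাষা দিবস ঘোষণা করে।"
    },
    {
        "role": "user",
        "content": "ভাষা আন্দোলনের দিনটি কোন তারিখে পালিত হয়?"
    },
]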

# Processing chat template
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generating the answer
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    temperature=0.01,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

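# Decode only the newly generated tokens (everything after the prompt),
# which avoids splitting the decoded text on the "assistant" header
generated_ids = outputs[0][input_ids.shape[-1]:]
answer = tokenizer.decode(generated_ids, skip_special_tokens=True).strip()
print("Answer:", answer)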

Output

Answer: ২১ ফেব্রুয়ারি
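
The message layout used in the example (context passage in the system role, question in the user role) can be wrapped in a small helper so that every context/question pair is built the same way. This is a minimal sketch; `build_messages` is a hypothetical name and is not part of the model repository or of transformers.

```python
def build_messages(context: str, question: str) -> list[dict[str, str]]:
    """Return chat messages in the layout used by the example above.

    Hypothetical helper: the context passage goes in the system role and
    the question in the user role.
    """
    context = context.strip()
    question = question.strip()
    if not context or not question:
        raise ValueError("both context and question must be non-empty")
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ]


# Strings reused from the example above.
messages = build_messages(
    "UNESCO ১৯৯৯ সালে ২১ ফেব্রুয়ারিকে আন্তর্জাতিক মাতৃভাষা দিবস ঘোষণা করে।",
    "ভাষা আন্দোলনের দিনটি কোন তারিখে পালিত হয়?",
)
print([m["role"] for m in messages])  # ['system', 'user']
```

The returned list can be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the example.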

Hyperparameters

Parameter	Value	Explanation
`temperature`	0.01	Near-deterministic output (sampling is still enabled via `do_sample=True`)
`max_new_tokens`	256	Maximum number of generated tokens
`torch_dtype`	bfloat16	16-bit weights, roughly half the memory of float32
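
To illustrate why a temperature of 0.01 behaves almost deterministically: sampling divides the logits by the temperature before the softmax, so a small temperature pushes nearly all probability onto the highest-scoring token. The sketch below uses made-up logits (not values from this model) and a hypothetical `softmax_with_temperature` function.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Softmax over logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for three candidate tokens (assumed values).
logits = [2.0, 1.5, 0.5]

for t in (1.0, 0.01):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

At temperature 1.0 the probability is spread across the candidates, while at 0.01 the top token receives essentially all of it. For fully deterministic output, `generate` also accepts `do_sample=False` (greedy decoding), in which case `temperature` is not used.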

Training Details

Architecture: Llama-3.2-3B Instruct
Fine-tuning: Unsloth (4-bit QLoRA)
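
As a rough illustration of why 4-bit quantization is attractive for fine-tuning, weight memory can be estimated as parameter count times bytes per parameter. The sketch assumes about 3.2 billion parameters for a 3B-class model (an assumption, not a figure from this card) and counts weights only, ignoring activations, KV cache, LoRA adapters, optimizer state, and quantization overhead.

```python
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "4-bit": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


ASSUMED_PARAMS = 3.2e9  # assumed parameter count for a 3B-class model

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Under these assumptions, 4-bit weights take about a quarter of the memory of bfloat16 weights and an eighth of float32.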

Use Cases

Educational tools
Bengali chatbots
Documentation Q&A
Journalism research

Limitations

Does not support long contexts (more than 4K tokens)
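
One way to guard against this limit is to count prompt tokens before calling `generate` and reserve room for the answer. The sketch below rests on several assumptions: `fits_context` is a hypothetical helper, the 4096-token limit is one reading of "4K", and the token counter is passed in as a function so the check does not depend on a specific tokenizer.

```python
from typing import Callable

CONTEXT_LIMIT = 4096  # assumed reading of "4K tokens"


def fits_context(
    texts: list[str],
    count_tokens: Callable[[str], int],
    max_new_tokens: int = 256,
    limit: int = CONTEXT_LIMIT,
) -> bool:
    """Return True if the prompt plus the reserved answer length fits the limit."""
    prompt_tokens = sum(count_tokens(t) for t in texts)
    return prompt_tokens + max_new_tokens <= limit


# Stand-in counter for demonstration only: one token per whitespace-separated word.
def whitespace_count(text: str) -> int:
    return len(text.split())


short_context = "word " * 100
long_context = "word " * 5000
print(fits_context([short_context, "question?"], whitespace_count))  # True
print(fits_context([long_context, "question?"], whitespace_count))   # False
```

With the real tokenizer, a counter such as `lambda t: len(tokenizer.encode(t))` could be supplied instead. The chat template adds a few tokens of its own, so leaving some margin below the limit is advisable.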

Ethical AI

This model was developed in line with ethical guidelines. It must not be used to generate harmful content.

Citation

If this model helps you in your work, please cite it as follows:

@software{BanglaLlama3QA,
  author = {Kowshik},
  title = {Bangla-Llama-3.2-3B-Instruct-QA-v2},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Kowshik24/Bangla-llama-3.2-3B-Instruct-QA-v2}
}

Contact

For questions or suggestions, email: [email protected]