CengBot - Çukurova University Computer Engineering Assistant
Author: naholav
Date: August 2025
A bilingual (Turkish/English) AI assistant specifically trained for Çukurova University Computer Engineering Department students using LLaMA 3.2 3B with LoRA fine-tuning.
Model Details
Developed by: naholav
Model type: Causal Language Model (LLaMA 3.2 3B + LoRA)
Language(s): Turkish, English
License: Apache-2.0
Finetuned from: meta-llama/Llama-3.2-3B
Library: PEFT 0.15.2
Dataset: Naholav/cukurova_university_chatbot
Project Links
- GitHub Repository: https://github.com/naholav/cengbot
- Model on HuggingFace: https://huggingface.co/Naholav/cengbot-lora-tr-en-cukurova
- Dataset on HuggingFace: https://huggingface.co/datasets/Naholav/cukurova_university_chatbot
Advanced Inference Script
We provide a carefully optimized standalone inference script (cengbot_inference.py) that incorporates extensive prompt engineering and parameter tuning developed through iterative testing.
Script Features
Dual Prompt Engineering:
- Turkish Prompt: Specifically crafted system prompt with strict response guidelines (max 4 sentences)
- English Prompt: Parallel optimized prompt maintaining consistency across languages
- Language Detection: Multi-layer detection using Turkish characters, common words, and langdetect library fallback
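For reference, a minimal sketch of this multi-layer detection logic (the character and word lists here are illustrative; cengbot_inference.py may implement the layers differently):

import re
from langdetect import detect

TURKISH_CHARS = set("çğıöşüÇĞİÖŞÜ")
TURKISH_WORDS = {"merhaba", "nasıl", "nedir", "hangi", "ders", "için"}  # illustrative list

def detect_language(text: str) -> str:
    # Layer 1: Turkish-specific characters are a strong signal
    if any(ch in TURKISH_CHARS for ch in text):
        return "tr"
    # Layer 2: common Turkish words
    if TURKISH_WORDS & set(re.findall(r"\w+", text.lower())):
        return "tr"
    # Layer 3: fall back to the langdetect library
    try:
        return "tr" if detect(text) == "tr" else "en"
    except Exception:  # langdetect raises on empty or ambiguous input
        return "en"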
Optimized Generation Parameters:
# Carefully tuned through extensive testing
temperature = 0.7 # Balanced creativity vs consistency
max_new_tokens = 200 # Sufficient for detailed answers
repetition_penalty = 1.1 # Prevents repetitive responses
top_p = 0.95 # Nucleus sampling for quality
top_k = 50 # Top-k filtering for coherence
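As a rough sketch, these values map onto a standard Hugging Face generate() call like the one below (the actual call in cengbot_inference.py may include additional arguments):

outputs = model.generate(
    **inputs,
    do_sample=True,              # sampling must be enabled for temperature/top_p/top_k to apply
    temperature=0.7,
    top_p=0.95,
    top_k=50,
    repetition_penalty=1.1,
    max_new_tokens=200,
    pad_token_id=tokenizer.eos_token_id,
)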
Smart Response Control:
- Greeting Detection: Automatically reduces token usage for simple greetings (30 tokens vs 100)
- Response Limiting: Enforces maximum 4 sentences with intelligent sentence splitting
- Question Limiting: Restricts model to ask maximum 1 question per response
- Language-Specific Processing: Different response extraction logic for Turkish vs English
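A minimal sketch of this response control logic (the greeting list and token budgets are illustrative; see cengbot_inference.py for the actual implementation):

import re

GREETINGS = {"merhaba", "selam", "hello", "hi", "hey"}  # illustrative list

def pick_token_budget(question: str) -> int:
    # Simple greetings get a much smaller generation budget than real questions
    return 30 if question.strip().lower() in GREETINGS else 100

def limit_sentences(text: str, max_sentences: int = 4) -> str:
    # Split on sentence-ending punctuation and keep at most max_sentences
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])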
Prompt Engineering Process: The system prompts were developed through extensive trial-and-error testing to:
- Maintain university-specific context
- Ensure concise but helpful responses
- Balance friendliness with professionalism
- Optimize for educational support scenarios
- Handle both casual questions and technical inquiries
System Prompt Examples:
Turkish System Prompt (Optimized):
Sen Çukurova Üniversitesi Bilgisayar Mühendisliği bölümünün deneyimli dijital asistanısın.
Öğrencilere samimi, yardımsever ve doğru bilgiler vererek destek oluyorsun.
Önemli kurallar:
- Her soruya MAKSİMUM 4 CÜMLE ile yanıt ver
- Cevapların çok kısa, net ve anlaşılır olmalı
- Gereksiz açıklamalardan ve tekrarlardan kesinlikle kaçın
- Selamlama mesajlarına tek cümlelik karşılık ver
- Sadece sorulan soruya odaklan, ekstra bilgi verme
- Cevaplarında MAKSİMUM 1 adet soru sorabilirsin
English System Prompt (Parallel):
You are an experienced digital assistant for Çukurova University Computer Engineering Department.
You help students by providing friendly, helpful and accurate information.
Important rules:
- Answer each question with MAXIMUM 4 SENTENCES
- Keep your answers very short, clear and understandable
- Absolutely avoid unnecessary explanations and repetitions
- Reply to greeting messages with a single sentence
- Focus only on the asked question, don't provide extra information
- You can ask MAXIMUM 1 question in your answers
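At inference time the detected language selects one of these system prompts, which is then combined with the user question. A minimal sketch is shown below; the Soru/Cevap and Question/Answer labels are assumptions for illustration, and the exact template lives in cengbot_inference.py:

TR_SYSTEM = "..."  # full Turkish system prompt shown above
EN_SYSTEM = "..."  # full English system prompt shown above

def build_prompt(question: str, lang: str) -> str:
    # Language-specific prompt assembly; labels here are placeholders
    if lang == "tr":
        return f"{TR_SYSTEM}\n\nSoru: {question}\nCevap:"
    return f"{EN_SYSTEM}\n\nQuestion: {question}\nAnswer:"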
How to Get Started
Direct Use:
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/cengbot-lora-tr-en-cukurova")
# Generate response
inputs = tokenizer("Merhaba, bilgisayar mühendisliği hakkında bilgi verebilir misin?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
Simple Setup:
- Install dependencies:
pip install torch transformers peft langdetect
- Download and run:
wget https://raw.githubusercontent.com/naholav/cengbot/main/cengbot_inference.py
python cengbot_inference.py
That's it! The model will be downloaded automatically from HuggingFace.
Features
- Dual Language Support: Automatically detects Turkish or English input using langdetect
- Dual Prompt System: Separate optimized prompts for Turkish and English responses
- Dual Language Validation: Separate TR/EN validation during training (TR Loss: 0.6750, EN Loss: 0.7387)
- University-Specific: Trained on Çukurova University Computer Engineering data
- Concise Responses: Maximum 4 sentences per response with smart length control
- Smart Greeting Detection: Optimized token usage for greetings vs detailed questions
- Language Detection: Advanced Turkish character and word detection fallbacks
- Easy Setup: Just copy and run the inference script
Model Details
Architecture:
- Base Model: meta-llama/Llama-3.2-3B (3.2B parameters, 28 layers, 3072 hidden size)
- Model Type: LLaMA (torch.bfloat16)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Config: r=16, alpha=32, dropout=0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
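For reference, this setup corresponds to a PEFT LoraConfig roughly like the following (a sketch; the training script in the GitHub repository is authoritative):

from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                  # LoRA rank
    lora_alpha=32,         # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)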
Training Specifications:
- Languages: Turkish, English (near-balanced bilingual training)
- Domain: Computer Engineering Education (Çukurova University specific)
- Response Style: Concise, helpful, friendly (maximum 4 sentences)
- Prompt Engineering: Dual system prompts with language-specific optimization
- Language Detection: Advanced Turkish character + word detection with langdetect fallback
- Generation Parameters: temp=0.7, top_p=0.95, top_k=50, rep_penalty=1.1
Hardware & Performance:
- Training Hardware: RTX 5090 32GB (90% utilization), 128GB RAM, AMD Ryzen 9 9950X
- Best Validation: 0.6920 overall, 0.6636 TR, 0.7268 EN
- Inference Speed: Optimized for real-time chat with greeting detection
System Requirements
For Inference:
- GPU: CUDA-compatible (8GB+ VRAM recommended)
- RAM: 8GB minimum, 16GB recommended
- Python: 3.8+
- Internet: For initial model download
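On a machine that meets these requirements, the adapter can be loaded in bfloat16 on the GPU roughly as follows (a sketch; cengbot_inference.py handles model loading for you, and device_map="auto" needs the accelerate package):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    torch_dtype=torch.bfloat16,   # roughly 6.5 GB of VRAM for the 3B model
    device_map="auto",            # requires the accelerate package
)
model = PeftModel.from_pretrained(base_model, "Naholav/cengbot-lora-tr-en-cukurova")
model.eval()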
Usage Examples
Turkish:
You: Merhaba, bilgisayar mühendisliği hakkında bilgi verebilir misin?
CengBot: Merhaba! Bilgisayar mühendisliği, yazılım geliştirme, sistem tasarımı ve teknoloji çözümleri üzerine odaklanan bir disiplindir. Çukurova Üniversitesi'nde güçlü bir müfredatımız var. Hangi konuda detay istiyorsun?
English:
You: What programming languages should I learn first?
CengBot: I recommend starting with Python for its simplicity and versatility, then moving to Java or C++ for fundamental concepts. These languages provide a solid foundation for computer engineering. Which area interests you most - web development, systems programming, or data science?
Training Performance
Dataset Information:
- Total Examples: 22,524 (11,187 Turkish, 11,337 English)
- Training Set: 20,948 examples
- Validation Sets: 783 Turkish, 793 English (7% split each)
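The dataset itself can be pulled straight from the Hub, for example as below (split and column names follow the dataset card, so inspect the object before use):

from datasets import load_dataset

dataset = load_dataset("Naholav/cukurova_university_chatbot")
print(dataset)  # inspect available splits and columns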
Training Progress:
- Total Steps: 1170/1312 (Early stopping activated at step 1170)
- Best Eval Loss: 0.6920 (Step 975); model saved from best checkpoint
- Final Train Loss: 0.5104 (continuous improvement)
- Early Stopping: Patience=3, triggered due to eval loss increase
- Training Time: ~3 hours on RTX 5090
Best Model Performance (Step 975):
- Overall Eval Loss: 0.6920
- Turkish Validation Loss: 0.6636
- English Validation Loss: 0.7268
- Average Loss: 0.6952
Training Features:
- Dual Language Evaluation: Separate TR/EN validation tracking every step
- Early Stopping: Patience=3 prevents overfitting, saves best model
- Advanced Optimization: Mixed precision (BF16), gradient checkpointing
- LoRA Configuration: r=16, alpha=32, dropout=0.1, targets all linear layers
- Hardware Optimization: RTX 5090 32GB (90% VRAM usage), 128GB RAM, 16-core CPU
- Memory Management: Pre-cached datasets in RAM, persistent workers
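A sketch of how the early stopping and mixed-precision settings above map onto the Hugging Face Trainer API; values not listed in this card (output paths, step intervals, learning rate) are omitted or illustrative:

from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="cengbot-lora",
    bf16=True,                       # mixed precision (BF16)
    gradient_checkpointing=True,
    eval_strategy="steps",           # "evaluation_strategy" on older transformers versions
    save_strategy="steps",
    load_best_model_at_end=True,     # keep the best checkpoint (step 975 here)
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)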
Prompt Engineering & Data Processing:
- Dual Template System: Separate training templates for Turkish and English
- Response Masking: Only LoRA adapter learns from assistant responses, not user questions
- Template Consistency: Identical structure between training and inference prompts
- Balanced Dataset: Near-balanced Turkish (11,187) and English (11,337) examples
- Response Quality Control: Training data filtered for concise, educational responses
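A minimal sketch of the response-masking idea: label tokens that belong to the question are set to -100 so the loss is computed only on the assistant response (tokenization details here are simplified relative to the actual training script):

def mask_question_tokens(tokenizer, prompt: str, response: str) -> dict:
    # Tokenize the prompt alone to find where the response begins
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    full_ids = tokenizer(prompt + response, add_special_tokens=False)["input_ids"]
    labels = list(full_ids)
    labels[:len(prompt_ids)] = [-100] * len(prompt_ids)  # -100 is ignored by the loss
    return {"input_ids": full_ids, "labels": labels}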
Limitations and Bias
- Domain Specific: Optimized for computer engineering education
- Response Length: Limited to maximum 4 sentences for conciseness
- University Context: Trained specifically on Çukurova University data
- Language Balance: Balanced TR/EN training but may show slight performance differences
Technical Specifications
Hardware: RTX 5090 32GB, 128GB RAM, AMD Ryzen 9 9950X
Training Time: ~3 hours
Framework: PEFT 0.15.2, Transformers, PyTorch
Optimization: Mixed precision (BF16), gradient checkpointing
Environmental Impact
Hardware Type: NVIDIA RTX 5090
Hours Used: ~3 hours
Compute Region: Local training setup
Carbon Emissions: Low; roughly 3 GPU-hours of LoRA fine-tuning on a single consumer GPU
Contributing
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
Contact
- Issues: https://github.com/naholav/cengbot/issues
- Author: naholav
Citation
@misc{cengbot2025,
title={CengBot: A Bilingual AI Assistant for Computer Engineering Education},
author={naholav},
year={2025},
publisher={HuggingFace},
url={https://huggingface.co/Naholav/cengbot-lora-tr-en-cukurova}
}
Happy learning!