---
|
library_name: transformers |
|
tags: |
|
- unsloth |
|
- qlora |
|
- lora |
|
- llama-3.2 |
|
- instruction-tuned |
|
- bf16 |
|
- 4bit |
|
--- |
|
|
|
# Model Card: UAB-NLP/ProjGen_Finetuned_llama |
|
|
|
An instruction-tuned **LLaMA-3.2** model fine-tuned with **Unsloth + QLoRA** using 🤗 **Transformers**.
|
This model is part of the **ProjGen project**, aimed at enhancing developer productivity through automated project generation and structured code scaffolding. |
|
|
|
--- |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
- **Base model:** `meta-llama/Llama-3.2-<SIZE>-Instruct` <!-- replace SIZE with e.g. 1B/3B -->
|
- **Finetuning method:** Unsloth + QLoRA (LoRA adapters) |
|
- **Precision (train):** 4-bit NF4 quantization (bitsandbytes) + bf16 compute |
|
- **Context length:** 4096 |
|
- **Task(s):** Instruction following & project/code generation |
|
- **License:** Llama 3.2 Community License (inherited from the base model)
|
- **Developed by:** UAB-NLP Group (Sai Praneeth Kumar, University of Alabama at Birmingham) |
|
- **Finetuned from:** `meta-llama/Llama-3.2-<SIZE>-Instruct` |
|
- **Shared by:** [UAB-NLP](https://huggingface.co/UAB-NLP) |
|
|
|
### Model Sources |
|
- **Repository:** [UAB-NLP/ProjGen_Finetuned_llama](https://huggingface.co/UAB-NLP/ProjGen_Finetuned_llama) |
|
- **Project Paper:** ProjGen – Enhanced Developer Productivity for Flask Project Generation with a RAG-Enhanced Fine-Tuned Local LLM |
|
|
|
--- |
|
|
|
## Intended Uses & Limitations |
|
|
|
### Direct Use |
|
- Generating Flask/Django/Streamlit project structures automatically. |
|
- Instruction-following tasks related to software engineering and code generation. |
|
|
|
### Downstream Use |
|
- Further fine-tuning on domain-specific datasets (e.g., medical imaging, finance, etc.). |
|
- Integration into developer assistants and productivity tools. |
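
A minimal sketch for continued fine-tuning with 🤗 PEFT, assuming the repo hosts a LoRA adapter and 4-bit loading as in training (`<SIZE>` as elsewhere in this card):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-<SIZE>-Instruct",
    quantization_config=bnb,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)  # enables input grads etc. for k-bit training

# is_trainable=True keeps the LoRA weights unfrozen for continued training
model = PeftModel.from_pretrained(base, "UAB-NLP/ProjGen_Finetuned_llama", is_trainable=True)
```

From here, `model` drops into a standard `Trainer`/TRL `SFTTrainer` loop on the new domain data.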
|
|
|
### Out-of-Scope / Limitations |
|
- Not suitable for medical, legal, or financial decision-making without human review. |
|
- May hallucinate or produce insecure/inefficient code if not monitored. |
|
|
|
--- |
|
|
|
## Bias, Risks, and Limitations |
|
The model inherits risks from the base **LLaMA-3.2** model: |
|
- Possible hallucinations and factual inaccuracies. |
|
- Dataset/domain biases reflected in responses. |
|
- Outputs should be validated before production deployment. |
|
|
|
**Recommendation:** Always pair outputs with testing, validation, and human oversight. |
|
|
|
--- |
|
|
|
## Getting Started |
|
|
|
### Inference (PEFT adapter form) |
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "UAB-NLP/ProjGen_Finetuned_llama"
tok = AutoTokenizer.from_pretrained(model_id)

# 4-bit NF4 loading, matching the training setup (bf16 compute)
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",
)

# The model was tuned on chat-style (system/user/assistant) data,
# so format prompts with the chat template.
messages = [
    {"role": "user", "content": "Generate a Flask project with login, dashboard, and reports."}
]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tok.decode(outputs[0], skip_special_tokens=True))
```
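
To serve the model without the bitsandbytes/PEFT runtime, the adapter can be merged into full-precision base weights. A minimal sketch with 🤗 PEFT, assuming access to the gated base checkpoint (`<SIZE>` as above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-<SIZE>-Instruct"   # same base used for finetuning
adapter_id = "UAB-NLP/ProjGen_Finetuned_llama"

# Merging requires unquantized weights, so load the base in bf16
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")

# Attach the LoRA adapter, then fold its weights into the base
merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

# Save a standalone checkpoint that loads with plain from_pretrained
merged.save_pretrained("projgen-merged")
AutoTokenizer.from_pretrained(adapter_id).save_pretrained("projgen-merged")
```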
|
|
|
--- |
|
|
|
## Training Details |
|
|
|
### Data |
|
- **Dataset:** Custom **ProjGen dataset** built from structured Flask/Django/Streamlit projects and instructions. |
|
- **Size:** [Fill in #samples / tokens] |
|
- **Preprocessing:** Chat-style instruction formatting (system/user/assistant), deduplication, truncation at 4096 tokens. |
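
For illustration, one way to render a sample in that chat format with the tokenizer's template; the `instruction`/`output` field names and the system prompt below are hypothetical, not the actual ProjGen schema:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-<SIZE>-Instruct")

# Hypothetical raw sample; the real dataset fields may differ
sample = {"instruction": "Generate a Flask project with login.",
          "output": "flask_app/\n  app.py\n  templates/\n  ..."}

messages = [
    {"role": "system", "content": "You are ProjGen, a project-scaffolding assistant."},  # hypothetical
    {"role": "user", "content": sample["instruction"]},
    {"role": "assistant", "content": sample["output"]},
]

# Render to one training string, then tokenize with truncation at the 4096-token context limit
text = tok.apply_chat_template(messages, tokenize=False)
ids = tok(text, truncation=True, max_length=4096)["input_ids"]
```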
|
|
|
### Training Procedure |
|
- **Quantization:** 4-bit NF4 + double quantization (bitsandbytes) |
|
- **LoRA Config:**

  - `r`: 16

  - `alpha`: 32

  - `dropout`: 0.05

  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
|
- **Optimizer:** Paged AdamW (32-bit) |
|
- **LR / Schedule:** 2e-4 with cosine decay + warmup |
|
- **Batch size:** [fill in effective batch size] |
|
- **Epochs/Steps:** 60 steps |
|
- **Precision:** bf16 mixed precision |
|
- **Grad checkpointing:** Enabled |
|
- **Flash attention:** Enabled (Unsloth optimization) |
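
Putting the settings above together, a minimal Unsloth sketch; values marked `# illustrative` (warmup, batch/accumulation split) are assumptions rather than the recorded run configuration:

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.2-<SIZE>-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,                  # NF4 via bitsandbytes
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

args = TrainingArguments(
    output_dir="projgen-qlora",
    max_steps=60,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=5,                     # illustrative
    optim="paged_adamw_32bit",
    bf16=True,
    per_device_train_batch_size=2,      # illustrative
    gradient_accumulation_steps=4,      # illustrative
    logging_steps=1,
)
```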
|
|
|
### Training Hardware |
|
- **GPU:** RTX 4070 (12GB VRAM) [replace with actual if different] |
|
- **Training runtime:** 331.47 seconds |
|
- **Steps per second:** 0.181 |
|
- **Samples per second:** 1.448 |
|
- **Training loss (run average):** 0.4899
|
- **Total FLOPs:** ≈3.8674 × 10¹⁵
|
- **Checkpoint size:** adapter ≈200 MB; a merged checkpoint's size depends on the base LLaMA size
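
As a consistency check on the numbers above: 60 steps / 331.47 s ≈ 0.181 steps/s, and 1.448 samples/s ÷ 0.181 steps/s ≈ 8 samples per optimizer step, i.e. roughly 480 samples processed over the run.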
|
|
|
--- |
|
|
|
## Evaluation |
|
|
|
### Training Loss Curve |
|
 |
|
|
|
### Training Loss Table (per step) |
|
| Step | Training Loss | |
|
|------|---------------| |
|
| 1 | 0.9276 | |
|
| 2 | 1.0030 | |
|
| 3 | 1.0463 | |
|
| 4 | 0.9592 | |
|
| 5 | 0.9903 | |
|
| 6 | 0.9239 | |
|
| 7 | 0.7762 | |
|
| 8 | 0.6905 | |
|
| 9 | 0.6130 | |
|
| 10 | 0.5687 | |
|
| 11 | 0.6292 | |
|
| 12 | 0.5927 | |
|
| 13 | 0.5880 | |
|
| 14 | 0.5021 | |
|
| 15 | 0.5303 | |
|
| 16 | 0.4216 | |
|
| 17 | 0.4692 | |
|
| 18 | 0.5330 | |
|
| 19 | 0.4350 | |
|
| 20 | 0.4003 | |
|
| 21 | 0.3515 | |
|
| 22 | 0.4201 | |
|
| 23 | 0.4200 | |
|
| 24 | 0.3666 | |
|
| 25 | 0.4260 | |
|
| 26 | 0.4261 | |
|
| 27 | 0.3206 | |
|
| 28 | 0.4385 | |
|
| 29 | 0.3475 | |
|
| 30 | 0.4438 | |
|
| 31 | 0.4648 | |
|
| 32 | 0.4088 | |
|
| 33 | 0.4422 | |
|
| 34 | 0.4209 | |
|
| 35 | 0.3593 | |
|
| 36 | 0.3433 | |
|
| 37 | 0.3874 | |
|
| 38 | 0.3604 | |
|
| 39 | 0.4374 | |
|
| 40 | 0.4048 | |
|
| 41 | 0.3604 | |
|
| 42 | 0.4087 | |
|
| 43 | 0.3240 | |
|
| 44 | 0.4375 | |
|
| 45 | 0.4195 | |
|
| 46 | 0.3881 | |
|
| 47 | 0.4383 | |
|
| 48 | 0.3506 | |
|
| 49 | 0.4687 | |
|
| 50 | 0.3709 | |
|
| 51 | 0.3951 | |
|
| 52 | 0.4012 | |
|
| 53 | 0.4020 | |
|
| 54 | 0.3977 | |
|
| 55 | 0.2816 | |
|
| 56 | 0.4136 | |
|
| 57 | 0.4400 | |
|
| 58 | 0.3268 | |
|
| 59 | 0.4218 | |
|
| 60 | 0.3629 | |
|
|
|
**Average training loss over all 60 steps:** 0.4899
|
|
|
--- |
|
|
|
## Environmental Impact (estimate) |
|
- **Hardware:** RTX 4070 (12GB VRAM) [replace with actual] |
|
- **Hours:** ~0.09 h (331 seconds) |
|
- **Region/Provider:** [cloud/on-prem] |
|
- **Estimated CO₂e:** Use [ML CO₂ Impact](https://mlco2.github.io/impact#compute) |
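
As a rough worked estimate: 331 s ≈ 0.092 h; at the RTX 4070's ~200 W board power that is ≈0.018 kWh, which at a typical grid intensity of ~0.4 kg CO₂e/kWh comes to under 10 g CO₂e (assuming full-power draw for the whole run).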
|
|
|
--- |
|
|
|
## Citation |
|
|
|
If you use this model, please cite the base model and this project: |
|
|
|
**BibTeX (base, example):** |
|
```bibtex |
|
@article{touvron2023llama, |
|
title={LLaMA: Open and Efficient Foundation Language Models}, |
|
author={Touvron, Hugo and others}, |
|
journal={arXiv preprint arXiv:2302.13971},
|
year={2023} |
|
} |
|
``` |
|
|
|
**This project:**
|
```bibtex |
|
@misc{projgen2025, |
|
title = {ProjGen: Enhanced Developer Productivity for Flask Project Generation with a RAG-Enhanced Fine-Tuned Local LLM}, |
|
author = {Renduchinthala, Sai Praneeth and {UAB-NLP Group}},
|
year = {2025}, |
|
howpublished = {\url{https://huggingface.co/UAB-NLP/ProjGen_Finetuned_llama}} |
|
} |
|
``` |
|
|
|
--- |
|
|
|
## Contact |
|
- **Author:** Sai Praneeth Kumar (UAB, UAB-NLP Group) |
|
- **HF Profile:** [UAB-NLP](https://huggingface.co/UAB-NLP) |
|
|