Model Card: UAB-NLP/ProjGen_Finetuned_llama

An instruction-following model based on LLaMA-3.2, fine-tuned with Unsloth + QLoRA using 🤗 Transformers.
This model is part of the ProjGen project, aimed at enhancing developer productivity through automated project generation and structured code scaffolding.


Model Details

Model Description

  • Base model: meta-llama/Llama-3.2-<SIZE>-Instruct
  • Finetuning method: Unsloth + QLoRA (LoRA adapters)
  • Precision (train): 4-bit NF4 quantization (bitsandbytes) + bf16 compute
  • Context length: 4096
  • Task(s): Instruction following & project/code generation
  • License: Inherits from Meta’s LLaMA-3.2 license
  • Developed by: UAB-NLP Group (Sai Praneeth Kumar, University of Alabama at Birmingham)
  • Finetuned from: meta-llama/Llama-3.2-<SIZE>-Instruct
  • Shared by: UAB-NLP

Model Sources

  • Repository: UAB-NLP/ProjGen_Finetuned_llama
  • Project Paper: ProjGen – Enhanced Developer Productivity for Flask Project Generation with a RAG-Enhanced Fine-Tuned Local LLM

Intended Uses & Limitations

Direct Use

  • Generating Flask/Django/Streamlit project structures automatically.
  • Instruction-following tasks related to software engineering and code generation.

Downstream Use

  • Further fine-tuning on domain-specific datasets (e.g., medical imaging or finance); a minimal sketch follows this list.
  • Integration into developer assistants and productivity tools.
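
A minimal, hedged sketch of attaching a fresh LoRA adapter to this checkpoint for continued fine-tuning on a new domain. The hyperparameters and target modules below are illustrative assumptions, not the original ProjGen settings.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "UAB-NLP/ProjGen_Finetuned_llama"

# Load the model in 4-bit so a new adapter can be trained on modest hardware.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb, device_map="auto")

# Illustrative LoRA settings for a new domain-specific adapter.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable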

Out-of-Scope / Limitations

  • Not suitable for medical, legal, or financial decision-making without human review.
  • May hallucinate or produce insecure/inefficient code if not monitored.

Bias, Risks, and Limitations

The model inherits risks from the base LLaMA-3.2 model:

  • Possible hallucinations and factual inaccuracies.
  • Dataset/domain biases reflected in responses.
  • Outputs should be validated before production deployment.

Recommendation: Always pair outputs with testing, validation, and human oversight.


Getting Started

Inference (PEFT adapter form)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "UAB-NLP/ProjGen_Finetuned_llama"

tok = AutoTokenizer.from_pretrained(model_id)

# 4-bit NF4 quantization with bf16 compute, matching the training precision.
# If this repository hosts only the LoRA adapter, install `peft` so that
# transformers can load the base model and apply the adapter automatically.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",
    torch_dtype="auto",
)

prompt = "Generate a Flask project with login, dashboard, and reports."
inputs = tok(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(outputs[0], skip_special_tokens=True))
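
Because the base model is an Instruct variant, wrapping the request in the tokenizer's chat template usually produces cleaner, better-structured scaffolds than a raw prompt. A minimal sketch, assuming the tokenizer ships the LLaMA-3.2 chat template:

messages = [
    {"role": "system", "content": "You are a project scaffolding assistant."},
    {"role": "user", "content": "Generate a Flask project with login, dashboard, and reports."},
]

# Render the conversation with the model's chat template and generate.
input_ids = tok.apply_chat_template(messages, add_generation_prompt=True,
                                    return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens.
print(tok.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))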

Training Details

Data

  • Dataset: Custom ProjGen dataset built from structured Flask/Django/Streamlit projects and instructions.
  • Size: [Fill in #samples / tokens]
  • Preprocessing: Chat-style instruction formatting (system/user/assistant), deduplication, and truncation at 4096 tokens (a formatting sketch follows below).
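
A hedged sketch of the formatting step described above, rendering one record with the base model's chat template and truncating to the 4,096-token context window. The record fields are illustrative; the actual ProjGen dataset schema is not published here.

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-<SIZE>-Instruct")

# Illustrative record (field names assumed, not the real dataset schema).
record = {
    "instruction": "Create a Flask app with a login page and a /reports endpoint.",
    "response": "flask_app/\n  app.py\n  templates/\n    login.html\n    reports.html",
}

messages = [
    {"role": "system", "content": "You are a project scaffolding assistant."},
    {"role": "user", "content": record["instruction"]},
    {"role": "assistant", "content": record["response"]},
]

# Render with the chat template, then truncate to the 4096-token context length.
text = tok.apply_chat_template(messages, tokenize=False)
input_ids = tok(text, truncation=True, max_length=4096)["input_ids"]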

Training Procedure

  • Quantization: 4-bit NF4 + double quantization (bitsandbytes)
  • LoRA Config (mirrored in the training sketch after this list):
    • r: 16
    • alpha: 32
    • dropout: 0.05
    • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Optimizer: Paged AdamW (32-bit)
  • LR / Schedule: 2e-4 with cosine decay + warmup
  • Batch size: [fill in effective batch size]
  • Epochs/Steps: 60 steps
  • Precision: bf16 mixed precision
  • Grad checkpointing: Enabled
  • Flash attention: Enabled (Unsloth optimization)
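
A minimal training sketch assembling the settings above with Unsloth and TRL. It is a hedged reconstruction rather than the exact ProjGen script: the dataset path, batch size, warmup steps, and text column name are illustrative assumptions, and the Unsloth/TRL APIs shown may differ slightly across versions.

import torch
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Base model in 4-bit NF4 with bf16 compute and a 4096-token context.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.2-<SIZE>-Instruct",
    max_seq_length=4096,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)

# LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

dataset = load_dataset("json", data_files="projgen_train.jsonl", split="train")  # illustrative path

args = TrainingArguments(
    output_dir="projgen-qlora",
    max_steps=60,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=5,                     # illustrative
    per_device_train_batch_size=2,      # illustrative; effective batch size not published
    gradient_accumulation_steps=4,      # illustrative
    optim="paged_adamw_32bit",
    bf16=True,
    logging_steps=1,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",          # assumes a pre-rendered "text" column
    max_seq_length=4096,
    args=args,
)
trainer.train()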

Training Hardware

  • GPU: RTX 4070 (12GB VRAM) [replace with actual if different]
  • Training runtime: 331.47 seconds
  • Steps per second: 0.181
  • Samples per second: 1.448
  • Final training loss: 0.4899
  • Total FLOPs: 3.8674e15
  • Checkpoint size: LoRA adapter ≈ 200 MB; merged model size depends on the base LLaMA size

Evaluation

Training Loss Curve

(figure: training loss per step; values listed in the table below)

Training Loss Table (per step)

Step Training Loss
1 0.9276
2 1.0030
3 1.0463
4 0.9592
5 0.9903
6 0.9239
7 0.7762
8 0.6905
9 0.6130
10 0.5687
11 0.6292
12 0.5927
13 0.5880
14 0.5021
15 0.5303
16 0.4216
17 0.4692
18 0.5330
19 0.4350
20 0.4003
21 0.3515
22 0.4201
23 0.4200
24 0.3666
25 0.4260
26 0.4261
27 0.3206
28 0.4385
29 0.3475
30 0.4438
31 0.4648
32 0.4088
33 0.4422
34 0.4209
35 0.3593
36 0.3433
37 0.3874
38 0.3604
39 0.4374
40 0.4048
41 0.3604
42 0.4087
43 0.3240
44 0.4375
45 0.4195
46 0.3881
47 0.4383
48 0.3506
49 0.4687
50 0.3709
51 0.3951
52 0.4012
53 0.4020
54 0.3977
55 0.2816
56 0.4136
57 0.4400
58 0.3268
59 0.4218
60 0.3629

Final averaged train loss: 0.4899


Environmental Impact (estimate)

  • Hardware: RTX 4070 (12GB VRAM) [replace with actual]
  • Hours: ~0.09 h (331 seconds)
  • Region/Provider: [cloud/on-prem]
  • Estimated CO₂e: not measured; it can be approximated with the ML CO₂ Impact calculator (a rough estimate is sketched below)
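
A rough back-of-the-envelope estimate from the runtime above; the GPU board power and grid carbon intensity are assumptions for illustration, not measured values.

# Illustrative CO2e estimate (assumed board power and grid intensity, not measured).
hours = 331.47 / 3600                    # ≈ 0.092 h of training
gpu_power_kw = 0.20                      # assumed RTX 4070 board power (~200 W)
grid_kg_per_kwh = 0.4                    # assumed grid carbon intensity
energy_kwh = hours * gpu_power_kw        # ≈ 0.018 kWh
co2e_kg = energy_kwh * grid_kg_per_kwh   # ≈ 0.007 kg CO2e
print(f"~{co2e_kg:.3f} kg CO2e")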

Citation

If you use this model, please cite the base model and this project:

BibTeX (base, example):

@article{touvron2023llama,
  title={LLaMA: Open and Efficient Foundation Language Models},
  author={Touvron, Hugo and others},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2023}
}

This project:

@misc{projgen2025,
  title = {ProjGen: Enhanced Developer Productivity for Flask Project Generation with a RAG-Enhanced Fine-Tuned Local LLM},
  author = {Sai Praneeth, Renduchinthala and UAB-NLP Group},
  year = {2025},
  howpublished = {\url{https://huggingface.co/UAB-NLP/ProjGen_Finetuned_llama}}
}

Contact

  • Author: Sai Praneeth Kumar (UAB, UAB-NLP Group)
  • HF Profile: UAB-NLP