Jithendra-k committed · Commit a2eaa61 · verified · 1 Parent(s): 1a83a32

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -5,11 +5,18 @@ This model is a part of Project InterACT (Multi model AI system) involving an ob
 
 This is a model built by finetuning the Llama-2-7b-chat model on the custom dataset Jithendra-k/InterACT_LLM.
 
-Points to consider for finetuning the Llama-2-7B-chat model:
-=> Free Google Colab offers a 15 GB GPU (limited resources; barely enough to store Llama-2-7B's weights)
-=> We also accounted for the overhead from optimizer states, gradients, and forward activations
-=> Full fine-tuning was not feasible with our compute, so we used parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.
-=> To drastically reduce VRAM usage, we fine-tuned the model in 4-bit precision, which is why we used the QLoRA technique.
-=> We trained for only 5 epochs, considering our compute budget, time, and early stopping.
+Points to consider for finetuning the Llama-2-7B-chat model:<br>
+=> Free Google Colab offers a 15 GB GPU (limited resources; barely enough to store Llama-2-7B's weights)<br>
+=> We also accounted for the overhead from optimizer states, gradients, and forward activations<br>
+=> Full fine-tuning was not feasible with our compute, so we used parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.<br>
+=> To drastically reduce VRAM usage, we fine-tuned the model in 4-bit precision, which is why we used the QLoRA technique.<br>
+=> We trained for only 5 epochs, considering our compute budget, time, and early stopping.<br>
+
+Here are some plots of model performance during training:<br>
+
+
+
+Here is an example input/output:<br>
+<img src="https://drive.google.com/file/d/1E0z3MAlJXu05bc8E9yDID0CVEbhowuca/view?usp=sharing"><br>
 
 Code to finetune a Llama-2-7b-chat model: https://colab.research.google.com/drive/1ZTdSKu2mgvQ1uNs0Wl7T7gniuoZJWs24?usp=sharing
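For context, the points added above describe the standard QLoRA recipe on the Hugging Face stack (transformers + bitsandbytes + peft + trl), which is presumably what the linked Colab implements. The sketch below is a minimal illustration of that recipe, not the exact notebook: the base checkpoint, the hyperparameters, the dataset split, the assumption that the dataset exposes a `text` column, and the older-trl `SFTTrainer` keyword arguments are all assumptions.

```python
# Minimal QLoRA fine-tuning sketch (illustrative values, not the exact ones used in the commit).
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

# Assumed checkpoint: an ungated mirror of Llama-2-7b-chat
# (meta-llama/Llama-2-7b-chat-hf requires gated access).
base_model = "NousResearch/Llama-2-7b-chat-hf"
# Assumes a "train" split with a "text" column.
dataset = load_dataset("Jithendra-k/InterACT_LLM", split="train")

# 4-bit (QLoRA) quantization: NF4 weights with fp16 compute keeps the 7B base
# model small enough for a free-Colab 15 GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model.config.use_cache = False

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# LoRA adapters: only small low-rank matrices are trained, so optimizer states
# and gradients stay far below what full fine-tuning would require.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=5,               # 5 epochs, as noted in the README
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",        # paged optimizer to avoid OOM spikes
    learning_rate=2e-4,
    fp16=True,
    logging_steps=25,
    save_strategy="epoch",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",        # assumption about the dataset schema
    tokenizer=tokenizer,
    max_seq_length=None,
)
trainer.train()
trainer.model.save_pretrained("InterACT_LLM-qlora-adapter")
```

Because only the LoRA adapter weights are trained while the 4-bit base weights stay frozen, the optimizer-state, gradient, and activation overhead mentioned above stays small enough to fit alongside the quantized 7B weights on a single 15 GB GPU.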