TinyLlama Project Guide: Running TinyLlama-1.1B-Chat-v1.0 Locally

This document provides a step-by-step guide to running the TinyLlama-1.1B-Chat-v1.0 model locally on a laptop with 16GB RAM, an Intel i5 processor, and Windows. It covers setting up the environment, downloading the model, fine-tuning, and running a Flask-based chat UI.

---

System Requirements

- Operating System: Windows
- RAM: 16GB
- Processor: Intel i5 or equivalent
- Python Version: 3.10.9
- IDE: Visual Studio Code (VS Code)
- Internet: Required for downloading the model and libraries

---

Step-by-Step Setup

1. Install Python 3.10.9

- Download and install Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
- During installation, ensure Python and pip are added to your system PATH.

2. Set Up a Virtual Environment

- Open the VS Code terminal in your project directory (e.g., C:\path\to\TinyLlama-1.1B).
- Run:

```
python -m venv venv
.\venv\Scripts\activate
```

3. Install Required Libraries

- In the activated virtual environment, run:

```
pip install transformers torch huggingface_hub datasets peft trl accelerate flask matplotlib
```

- This installs the libraries for model handling, fine-tuning, the Flask app, and plotting.

4. Download the TinyLlama Model

- Create a file `download_model.py` with the following code:

```python
from huggingface_hub import login, snapshot_download

# Authenticate with the Hugging Face Hub
login(token="YOUR_ACCESS_TOKEN_HERE")

# Download the model snapshot into ./tinyllama_model
snapshot_download(repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0", local_dir="./tinyllama_model")
```

- Replace `YOUR_ACCESS_TOKEN_HERE` with your Hugging Face access token (create one at https://huggingface.co/settings/tokens).
- Run: `python download_model.py`
- The model weights will be saved in the `tinyllama_model` folder.

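After the download finishes, a quick stdlib-only check can confirm the key files arrived before you spend time debugging inference. This is a sketch; the file names checked are the ones a Llama-style snapshot such as this repo typically contains.

```python
from pathlib import Path

model_dir = Path("./tinyllama_model")

# Files typically present in the TinyLlama-1.1B-Chat-v1.0 snapshot;
# their absence usually means the download was interrupted.
required = ["config.json", "tokenizer_config.json", "tokenizer.model"]
missing = [name for name in required if not (model_dir / name).exists()]

if missing:
    print(f"Download looks incomplete; missing: {missing}")
else:
    print("Model files look complete.")
```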
5. Run Inference with Flask UI

- Create a `finetune` folder in your project directory.
- Copy `app.py` and `templates/index.html` from the repository into the `finetune` folder.
- Run: `python app.py`
- Open http://127.0.0.1:5000 in your browser to access the chat UI.
- Enter prompts to interact with the model.

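The repository's `app.py` is not reproduced here, but its general shape is roughly the sketch below: a page route that serves the chat UI, plus a POST route that passes the prompt to the model and returns the reply as JSON. The route names, JSON fields, and the `generate_reply` placeholder are illustrative assumptions, not the actual file.

```python
from flask import Flask, request, jsonify, render_template

app = Flask(__name__)

def generate_reply(prompt: str) -> str:
    # Placeholder: the real app.py would tokenize the prompt and call
    # model.generate() on the TinyLlama weights loaded at startup.
    return f"(model reply to: {prompt})"

@app.route("/")
def index():
    # Serves templates/index.html, the chat UI
    return render_template("index.html")

@app.route("/chat", methods=["POST"])
def chat():
    data = request.get_json(force=True)
    return jsonify({"reply": generate_reply(data.get("prompt", ""))})

# app.run(host="127.0.0.1", port=5000)  # uncomment to serve the UI locally
```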
6. Fine-Tune the Model (Optional)

- In the `finetune` folder, ensure `dataset.json` and `finetune.py` are present.
- Run: `python finetune.py`
- The fine-tuned weights will be saved in `finetune/finetuned_weights`.
- Update `app.py` to point to `./finetuned_weights` to run inference with the fine-tuned model.
- Check `loss_plot.png` for a visualization of the training loss.

7. View Training Metrics

- After fine-tuning, check the console for the final train loss and learning rate.
- Open `loss_plot.png` in the `finetune` folder for a graphical view of the training loss.

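The plot `finetune.py` writes can be reproduced independently; the sketch below shows the general pattern with made-up loss values (the real script plots the losses logged during training):

```python
import matplotlib
matplotlib.use("Agg")  # render to a file, no display needed
import matplotlib.pyplot as plt

# Hypothetical per-step losses; finetune.py records the real values.
losses = [2.9, 2.4, 2.1, 1.9, 1.8]

plt.plot(range(1, len(losses) + 1), losses, marker="o")
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.title("TinyLlama fine-tuning loss")
plt.savefig("loss_plot.png")
```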
---

Project Structure

- `tinyllama_model/`: Model weights downloaded from Hugging Face.
- `finetune/`: Contains fine-tuning scripts and fine-tuned weights.
  - `dataset.json`: Small dataset for fine-tuning.
  - `finetune.py`: Fine-tuning script with LoRA.
  - `app.py`: Flask app for inference.
  - `templates/index.html`: Chat UI.
  - `loss_plot.png`: Training loss plot.
- `requirements.txt`: List of required libraries.
- `document.txt`: This guide.
- `README.md`: Project overview.

---

Attribution

- **Model**: TinyLlama-1.1B-Chat-v1.0
- **Source**: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- **Organization**: TinyLlama
- **License**: Check the model's Hugging Face page for licensing details.

---

Notes

- Model weights are not included in this repository to respect licensing terms.
- Download the model directly from Hugging Face using your access token.
- Ensure sufficient disk space (~2-3GB) for model weights and fine-tuned weights.
- For support, refer to the TinyLlama Hugging Face page or community forums.