Tiny-GPT2 Text Generation Project
This repository provides resources for running and fine-tuning the sshleifer/tiny-gpt2 model locally on a CPU, on laptops with 8GB or 16GB of RAM. The goal is to let students learn how AI models work, experiment, and conduct research.
Prerequisites
Python: Version 3.10.9 recommended (3.9.10 also works).
Hardware: Minimum 8GB RAM, CPU-only (GPU optional but not required).
Hugging Face Account: Required for downloading model weights (create at huggingface.co).
Setup Instructions
Create a Virtual Environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
Install Libraries:
    pip install torch==2.3.0 transformers==4.38.2 huggingface_hub==0.22.2 datasets==2.21.0 numpy==1.26.4
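To confirm that the pinned versions from the install command are the ones actually active in the virtual environment, a small check along these lines may help. It is a sketch using only the standard library; the helper name find_mismatches is made up for this example and is not part of the repository.

```python
from importlib import metadata

# Pins copied from the install command above.
PINNED = {
    "torch": "2.3.0",
    "transformers": "4.38.2",
    "huggingface_hub": "0.22.2",
    "datasets": "2.21.0",
    "numpy": "1.26.4",
}


def find_mismatches(pins):
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        # Local build tags such as "+cpu" or "+cu121" are ignored.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Print one line per package that is missing or at a different version.
for name, (expected, found) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If nothing is printed, every package matches its pin.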
Download Model Weights:
Copy download_model.py from the repository to your project folder.
Replace YOUR_HUGGINGFACE_API_TOKEN with your Hugging Face token (from huggingface.co/settings/tokens).
Run:
    python download_model.py
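The contents of download_model.py are not shown here, so the following is only a guess at what such a script could look like, based on the folder layout mentioned under Troubleshooting. It uses snapshot_download from huggingface_hub. Reading the token from an HF_TOKEN environment variable, and the helper names cache_folder and download, are assumptions made for this sketch.

```python
import os
from pathlib import Path

REPO_ID = "sshleifer/tiny-gpt2"
CACHE_DIR = "tiny-gpt2-model"  # folder name taken from the Troubleshooting path


def cache_folder(repo_id, cache_dir=CACHE_DIR):
    """Folder the Hub cache uses for a model repo: models--<org>--<name>."""
    return Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))


def download(repo_id=REPO_ID, cache_dir=CACHE_DIR):
    """Fetch all files of the repo and return the local snapshot path."""
    # Imported here so the path helper also works without the library installed.
    from huggingface_hub import snapshot_download

    # Assumption: the token is read from the environment rather than
    # pasted into the script. None falls back to any saved login.
    token = os.environ.get("HF_TOKEN")
    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir, token=token)
```

Calling download() should return a path inside the folder that cache_folder(REPO_ID) points to, ending in a snapshots/<revision> directory.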
Test the Model:
Copy test_model.py to your project folder.
Run:
    python test_model.py
Expected output: Generated text starting with "Once upon a time".
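For readers who want to see what a generation test of this kind involves, here is a minimal sketch. It is not the repository's test_model.py. The cache folder name, the sampling settings, and the function names generate_text and looks_as_expected are assumptions for illustration.

```python
PROMPT = "Once upon a time"
MODEL_ID = "sshleifer/tiny-gpt2"
CACHE_DIR = "tiny-gpt2-model"  # assumed to match the download step


def generate_text(prompt=PROMPT, max_new_tokens=30):
    """Load the model on CPU and return the prompt plus a sampled continuation."""
    # Heavy imports are kept inside the function so the module loads quickly.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, cache_dir=CACHE_DIR)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, cache_dir=CACHE_DIR)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_k=50,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    # generate() returns the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def looks_as_expected(text, prompt=PROMPT):
    """Check the one property the README promises: output starts with the prompt."""
    return text.startswith(prompt)
```

A script would call print(generate_text()) and could pass the result to looks_as_expected. Because tiny-gpt2 is a very small model, the continuation after the prompt is likely to be incoherent; that is normal for this setup.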
Fine-Tune the Model:
Navigate to the fine_tune folder.
Add your dataset as sample_data.txt (or use the provided example).
Run:
    python fine_tune_model.py
The fine-tuned model will be saved in fine_tuned_model.
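The fine-tuning script itself is not reproduced here. As a rough illustration of the steps such a script typically performs (tokenize the text file, cut it into fixed-length blocks, train with Trainer on CPU, save the result), consider the sketch below. The block size, batch size, epoch count, and the function names are assumptions, and the file is read directly rather than through the datasets library to keep the example short. Note that Trainer in transformers 4.38 appears to require the accelerate package, which the install command does not list, so it may need to be installed separately.

```python
def group_into_blocks(token_ids, block_size):
    """Split a flat list of token ids into equal blocks, dropping the remainder."""
    usable = len(token_ids) - len(token_ids) % block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


def fine_tune(data_file="sample_data.txt", output_dir="fine_tuned_model", block_size=64):
    """Fine-tune tiny-gpt2 on a plain-text file using the CPU only."""
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
    model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

    with open(data_file, encoding="utf-8") as handle:
        token_ids = tokenizer(handle.read())["input_ids"]

    # For causal language modelling the labels equal the inputs;
    # the model shifts them by one position internally.
    train_data = [
        {"input_ids": torch.tensor(block), "labels": torch.tensor(block)}
        for block in group_into_blocks(token_ids, block_size)
    ]

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=1,             # assumed value
        per_device_train_batch_size=4,  # assumed value
        no_cuda=True,                   # CPU only, as described in the GPU notes
        save_strategy="no",
        report_to="none",
    )
    trainer = Trainer(model=model, args=args, train_dataset=train_data)
    trainer.train()
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
```

The dataset must contain at least block_size tokens, otherwise group_into_blocks returns an empty list and there is nothing to train on.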
Notes for GPU Users
The scripts are configured to run on CPU (CUDA_VISIBLE_DEVICES="" in fine_tune_model.py). To use a GPU (if available), remove os.environ["CUDA_VISIBLE_DEVICES"] = "" and no_cuda=True from fine_tune_model.py. Ensure your PyTorch installation supports CUDA. For a CUDA 12.1 build, install from the PyTorch wheel index:
    pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
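The CPU/GPU switch described above can be sketched as a single helper. The function name select_device and its force_cpu parameter are hypothetical; the repository's scripts set the environment variable directly instead.

```python
import os


def select_device(force_cpu=True):
    """Return "cpu" or "cuda", hiding all GPUs first when force_cpu is set."""
    if force_cpu:
        # This only has an effect if it runs before CUDA is first
        # initialised in the current process.
        os.environ["CUDA_VISIBLE_DEVICES"] = ""
        return "cpu"
    # torch is imported only on the GPU path.
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"
```

With force_cpu=False the helper falls back to "cpu" when PyTorch reports no usable CUDA device, which is also what happens with a CPU-only PyTorch build.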
Troubleshooting
Memory Issues: If you have 8GB RAM, ensure no other heavy applications are running.
Library Conflicts: Use the exact versions listed above to avoid compatibility issues.
File Not Found: Verify the model files are in tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be.
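For the File Not Found case, a quick way to see which snapshot folders actually exist is to list them. This sketch uses only the standard library; the function name find_snapshots is made up, and treating the presence of config.json as a sign of a usable snapshot is an assumption.

```python
from pathlib import Path


def find_snapshots(cache_dir="tiny-gpt2-model", repo_id="sshleifer/tiny-gpt2"):
    """Return the snapshot folders under the cache that contain a config.json."""
    root = Path(cache_dir) / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    if not root.is_dir():
        return []  # nothing has been downloaded into this cache folder
    return sorted(p for p in root.iterdir() if (p / "config.json").is_file())


# Print every snapshot found, or a hint when there is none.
snapshots = find_snapshots()
if snapshots:
    for path in snapshots:
        print("found snapshot:", path)
else:
    print("no snapshot found; run the download step first")
```

If the printed folder name differs from the revision hash listed above, the scripts may need to point at the folder that is actually present.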