
This is the model checkpoint release for *Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models*.

All fine-tuned model checkpoints are released in this repository. The naming convention of the revisions is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`. To load a specific model checkpoint, use the following code.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",  # replace with a revision name following the convention above
)
```
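
To see which revisions are available, the `huggingface_hub` client can list the repository's branches. This is a minimal sketch and assumes the `huggingface_hub` package is installed; each fine-tuned checkpoint is published as a separate revision (branch) of this repository.

```python
from huggingface_hub import list_repo_refs

# List all revisions (branches) of the checkpoint repository.
refs = list_repo_refs("KaiserWhoLearns/PTvsSFT_OLMo1b")
for branch in refs.branches:
    print(branch.name)
```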

All checkpoints are fine-tuned from pre-training checkpoints of OLMo1b-HF.
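
As a quick sanity check after loading a revision, you can run a short generation. This sketch assumes the tokenizer can be loaded from this repository with the same `revision`; if a revision does not ship tokenizer files, loading the tokenizer from the base OLMo-1B HF model should work instead.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "KaiserWhoLearns/PTvsSFT_OLMo1b"
revision = "your revision"  # replace with an actual revision name

tokenizer = AutoTokenizer.from_pretrained(repo, revision=revision, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, revision=revision, trust_remote_code=True)

# Generate a short continuation to confirm the checkpoint loaded correctly.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```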

Citation:

```bibtex
@article{sun2024amuro,
  title={Amuro \& char: Analyzing the relationship between pre-training and fine-tuning of large language models},
  author={Sun, Kaiser and Dredze, Mark},
  journal={arXiv preprint arXiv:2408.06663},
  year={2024}
}
```

License: Apache-2.0