Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273) 8430db2 unverified jinwonkim93 committed on Feb 13, 2024
Cosine learning rate schedule - minimum learning rate (#1062) 04b978b unverified ricdomolm winglian committed on Jan 9, 2024
support for multi line inference input, log sweep over learning rates 9105935 winglian committed on May 3, 2023