ahmedselhady 's Collections

The role of English Data in Continued Pretraining

Models trained using exponential moving average of weights (EMA), work done in Paper (todo: link the paper)