Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo • Paper • arXiv:2503.09799 • Published 2 days ago