workaround so training doesn't hang when packed dataloader batches aren't even (#461) c69faee unverified winglian committed on Aug 23, 2023
Attention mask and position id fixes for packing (#285) 2bb0b78 unverified winglian committed on Aug 12, 2023