babylm-seqlen's Collections
Double Shuffled Data

updated Apr 8

Data shuffled twice: once at the document level, and again at the tokenized level.
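
The two shuffling passes described above can be sketched as follows. This is only an illustration under stated assumptions, not the collection authors' pipeline: the function name `double_shuffle` is hypothetical, whitespace splitting stands in for a real tokenizer, and "tokenized level" is assumed to mean shuffling the fixed-length token sequences after packing.

```python
import random


def double_shuffle(documents, seq_len, seed=0):
    """Shuffle documents, pack their tokens into fixed-length
    sequences, then shuffle those sequences (hypothetical helper)."""
    rng = random.Random(seed)

    # Pass 1: shuffle at the document level.
    docs = list(documents)
    rng.shuffle(docs)

    # Stand-in tokenizer (assumption): split on whitespace.
    tokens = [tok for doc in docs for tok in doc.split()]

    # Pack into sequences of exactly seq_len tokens, dropping the ragged tail.
    n_full = len(tokens) // seq_len
    sequences = [tokens[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]

    # Pass 2: shuffle at the tokenized (sequence) level.
    rng.shuffle(sequences)
    return sequences


if __name__ == "__main__":
    corpus = ["the cat sat down", "a dog ran far away", "birds sing"]
    for seq in double_shuffle(corpus, seq_len=4):
        print(seq)
```

Passing the same `seed` reproduces the same ordering, which makes it easier to compare runs that differ only in `seq_len`.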

  • babylm-seqlen/train_100M_256

    Viewer • Updated Apr 7 • 639k • 5

  • babylm-seqlen/train_100M_1024

    Viewer • Updated Apr 7 • 160k • 8

  • babylm-seqlen/train_100M_16384

    Viewer • Updated Apr 7 • 9.86k • 8

  • babylm-seqlen/train_100M_4096

    Viewer • Updated Apr 7 • 39.8k • 9

  • babylm-seqlen/train_100M_512

    Viewer • Updated Apr 7 • 319k

  • babylm-seqlen/train_100M_8192

    Viewer • Updated Apr 7 • 19.8k • 4

  • babylm-seqlen/train_100M_2048

    Viewer • Updated Apr 7 • 79.8k • 3

  • babylm-seqlen/train_100M_128

    Viewer • Updated Apr 8 • 1.28M • 1

  • babylm-seqlen/train_100M_64

    Viewer • Updated Apr 8 • 2.56M • 1
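
A quick consistency check on the listing above, assuming the numeric suffix of each dataset name is the sequence length in tokens and the first figure after the update date is the row count (both are assumptions; the page does not label them). If so, rows times sequence length should be roughly the same for every dataset. The counts below are the rounded values shown on the page, so the products are only approximate.

```python
# Sequence length (from the dataset name) -> row count as displayed (rounded).
# Interpreting these fields this way is an assumption, not stated on the page.
listed_rows = {
    64: 2.56e6,
    128: 1.28e6,
    256: 639e3,
    512: 319e3,
    1024: 160e3,
    2048: 79.8e3,
    4096: 39.8e3,
    8192: 19.8e3,
    16384: 9.86e3,
}

# Approximate total tokens per dataset = sequence length * number of rows.
approx_tokens = {seq_len: seq_len * rows for seq_len, rows in listed_rows.items()}

if __name__ == "__main__":
    for seq_len, total in sorted(approx_tokens.items()):
        print(f"train_100M_{seq_len}: ~{total / 1e6:.1f}M tokens")
```

Under these assumptions every product falls between about 161M and 164M, which is consistent with each dataset holding the same token stream cut into sequences of a different length.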