Chenyan Xiong Research Group at CMU

university
https://www.cs.cmu.edu/~cx/

Recent Activity

SingularityHJY  updated a dataset about 1 month ago
cx-cmu/ClueWeb-Reco
yuzc19  updated a dataset about 1 month ago
cx-cmu/repro-organic-data-72B
yuzc19  updated a collection about 1 month ago
RePro

Papers

RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

Team members: Chenyan Xiong, Cassandra Cohen, Zichun Yu, Jingyuan He, Mahima Jagadeesh Patel, zhihan zhang, Kira Jones

cx-cmu's collections (1)

RePro
Space for RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
  • cx-cmu/repro-rephraser-4B

    Text Generation • 196k • Updated Oct 18 • 27 • 1
  • cx-cmu/repro-rl-data

    Viewer • Updated Oct 18 • 41k • 44
  • RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

    Paper • 2510.10681 • Published Oct 12 • 5
  • cx-cmu/repro-rephrased-data-72B

    Viewer • Updated Oct 18 • 39M • 921