---
title: README
emoji: 🌍
colorFrom: indigo
colorTo: purple
sdk: static
pinned: true
short_description: Explore Common Crawl's metadata and experimental datasets
---

# Common Crawl

Welcome to the Common Crawl Foundation's Hugging Face page! We aim to provide metadata and experimental versions of our latest data products here.

### Useful Links

- [Common Crawl's official website](https://commoncrawl.org/)
- [Our existing statistics webpages](https://commoncrawl.github.io/cc-crawl-statistics/) ([GitHub repo](https://github.com/commoncrawl/cc-crawl-statistics))
- [AWS infrastructure status page](https://status.commoncrawl.org/)

### Datasets

Explore our datasets hosted on Hugging Face:

- [Common Crawl Citations](https://huggingface.co/datasets/commoncrawl/citations)
- [Common Crawl Citations, Annotated](https://huggingface.co/datasets/commoncrawl/citations-annotated)
- [Common Crawl Statistics](https://huggingface.co/datasets/commoncrawl/statistics)
- [EOT 2024 Host-Level Logs](https://huggingface.co/datasets/commoncrawl/eot2024_hostlevel_logs) (only available to EOT collaborators)

We look forward to supporting the research and development community with these resources.