Update README.md
README.md
CHANGED
@@ -1,19 +1,31 @@
 ---
 title: README
-emoji:
+emoji: π
-colorFrom:
+colorFrom: indigo
 colorTo: purple
 sdk: static
-pinned:
+pinned: true
+short_description: Explore Common Crawl's metadata and experimental datasets
 ---
 
 # Common Crawl
 
 Welcome to the Common Crawl Foundation's Hugging Face page!
 
-We
+We aim to provide metadata and experimental versions of our latest data products here.
 
-
+### Useful Links
 
-
-
+- [Common Crawl's official website](https://commoncrawl.org/)
+- [Our existing statistics webpages](https://commoncrawl.github.io/cc-crawl-statistics/) ([GitHub repo](https://github.com/commoncrawl/cc-crawl-statistics))
+- [AWS infrastructure status page](https://status.commoncrawl.org/)
+
+### Datasets
+
+Explore our datasets hosted on Hugging Face:
+
+- [Common Crawl Statistics](https://huggingface.co/datasets/commoncrawl/statistics)
+- [EOT 2024 Host-Level Logs](https://huggingface.co/datasets/commoncrawl/eot2024_hostlevel_logs)
+- [Common Crawl Citations](https://huggingface.co/datasets/commoncrawl/citations)
+
+We look forward to supporting the research and development community with these resources.