Commit 4180298 by jpalomar · verified · 1 Parent(s): f6687a3

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -371,6 +371,8 @@ Feel free to click the expand button below to see the full list of sources.
 | The Swedish Culturomics Gigaword Corpus | sv | Rødven-Eide, 2016 |
 | Corpus of laws and legal acts of Ukraine | uk | [Link](https://lang.org.ua/en/corpora/#anchor7) |
 
+To consult the data summary document with the respective licences, please send an e-mail to [email protected].
+
 <details>
 <summary>References</summary>
 
@@ -565,7 +567,7 @@ especially if the content originates from less-regulated sources or user-generat
 
 This dataset is constituted by combining several sources, whose acquisition methods can be classified into three groups:
 - Web-sourced datasets with some preprocessing available under permissive license (p.e. Common Crawl).
-- Domain-specific or language-specific raw crawls (p.e. Spanish Crawling).
+- Domain-specific or language-specific raw crawls, always respecting robots.txt (p.e. Spanish Crawling).
 - Manually curated data obtained through collaborators, data providers (by means of legal assignment agreements) or open source projects
 (p.e. CATalog).
 
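
For readers unfamiliar with what "respecting robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the crawler used for this dataset, whose implementation the README does not describe. The robots.txt content, the `example-crawler` user agent, the URLs, and the helper names `build_parser` and `allowed_urls` are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that the robots.txt rules permit for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.org/articles/1",      # not covered by any Disallow rule
    "https://example.org/private/report",  # falls under Disallow: /private/
]
print(allowed_urls(parser, "example-crawler", candidates))
```

A real crawler would typically fetch each site's own robots.txt (for instance with `RobotFileParser.set_url` followed by `read`) before requesting pages from that site, rather than parsing an inlined string.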