Cicciokr committed on
Commit ecec181 · verified · 1 Parent(s): f614dbe

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -6,8 +6,7 @@ This model is fine tuned with:
   - Perseus Project - 15M Token
 
   The dataset was cleaned:
-
- Removal of all "pseudo-Latin" text ("Lorem ipsum ...").
- Use of CLTK for sentence splitting and normalisation.
- deduplication of the corpus
- lowercase all text
+ - Removal of all "pseudo-Latin" text ("Lorem ipsum ...").
+ - Use of CLTK for sentence splitting and normalisation.
+ - deduplication of the corpus
+ - lowercase all text
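
The four cleaning steps listed in the README could be sketched roughly as follows. This is an illustration only, not the author's actual pipeline: the README says CLTK was used for sentence splitting and normalisation, but this sketch does not import CLTK, and a naive regex splitter and Unicode NFC normalisation stand in for it so the example stays self-contained. The names `PSEUDO_LATIN_MARKERS`, `split_sentences`, `normalise` and `clean_corpus` are hypothetical. Applying the steps per sentence, and lowercasing before deduplication, are also assumptions.

```python
import re
import unicodedata

# Hypothetical marker list: the README only names "Lorem ipsum" as an example.
PSEUDO_LATIN_MARKERS = ("lorem ipsum",)


def split_sentences(text):
    """Naive stand-in for CLTK sentence splitting: break after ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalise(sentence):
    """Stand-in normalisation: Unicode NFC plus collapsed whitespace."""
    sentence = unicodedata.normalize("NFC", sentence)
    return re.sub(r"\s+", " ", sentence).strip()


def clean_corpus(documents):
    """Split, normalise, lowercase, drop pseudo-Latin, and deduplicate."""
    seen = set()
    cleaned = []
    for doc in documents:
        for sentence in split_sentences(doc):
            sentence = normalise(sentence).lower()
            if not sentence:
                continue
            # Drop sentences that contain a known pseudo-Latin marker.
            if any(marker in sentence for marker in PSEUDO_LATIN_MARKERS):
                continue
            # Exact-match deduplication on the normalised, lowercased form.
            if sentence in seen:
                continue
            seen.add(sentence)
            cleaned.append(sentence)
    return cleaned
```

Note that this filter only drops sentences containing the marker itself. Later filler sentences from the same placeholder passage would need a broader heuristic, which the README does not describe.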