crossroderick/dalat5
doi:10.57967/hf/5255
License: mit
dalat5/src/data/generate_clean_corpus.sh
crossroderick
Pre-v4 readme and support files update
252a85f
5 months ago
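# Shuffle the corpus lines in place.
shuf kazakh_latin_corpus.jsonl -o kazakh_latin_corpus.jsonl
# Keep only lines that contain at least one non-whitespace character.
grep '\S' kazakh_latin_corpus.jsonl > clean_corpus.jsonl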
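
The two commands above overwrite the input corpus and rely on `\S`, which is a GNU grep extension. As a minimal sketch, and not the repository's own method, the same shuffle-then-filter steps could be wrapped in a function that leaves the input untouched. The function name `clean_corpus`, the temporary file, and the POSIX bracket expression `[^[:space:]]` are assumptions introduced here for illustration; `shuf` is still assumed to come from GNU coreutils.

```shell
# Hypothetical helper (not part of the original script): shuffle a JSONL
# corpus and drop blank or whitespace-only lines without modifying the input.
# The body runs in a subshell, so its variables do not leak to the caller.
clean_corpus() (
    input="$1"
    output="$2"
    status=0

    tmp="$(mktemp)" || return 1

    # Shuffle into a temporary file instead of overwriting the input.
    shuf "$input" -o "$tmp" || { rm -f "$tmp"; return 1; }

    # [^[:space:]] is the POSIX spelling of GNU grep's \S:
    # it matches any line with at least one non-whitespace character.
    grep '[^[:space:]]' "$tmp" > "$output" || status=$?

    rm -f "$tmp"

    # grep exits 1 when no line matches (an empty result, not an error);
    # only exit codes above 1 are reported as failures.
    [ "$status" -le 1 ]
)
```

With this sketch, `clean_corpus kazakh_latin_corpus.jsonl clean_corpus.jsonl` would produce the same kind of output file as the original script while keeping the source corpus in its original order.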