Hugging Face
Abror Shopulatov
PRO
murodbek
13 followers · 7 following
https://murodbek.substack.com/
murodbeck
shopulatov
AI & ML interests
Machine Learning, NLP, Grammatical Error Correction
Recent Activity
reacted to hexgrad's post with 🔥 4 days ago
Wanted: Peak Data. I'm collecting audio data to train another TTS model:
+ AVM data: ChatGPT Advanced Voice Mode audio & text from source
+ Professional audio: Permissive (CC0, Apache, MIT, CC-BY)

This audio should *impress* most native speakers, not just barely pass their audio Turing tests. Professional-caliber means S or A-tier, not your average bloke off the street. Traditional TTS may not make the cut. Absolutely no low-fi microphone recordings like Common Voice.

The bar is much higher than last time, so there are no timelines yet and I expect it may take longer to collect such mythical data. Raising the bar means evicting quite a bit of old data, and voice/language availability may decrease. The theme is *quality* over quantity. I would rather have 1 hour of A/S-tier than 100 hours of mid data.

I have nothing to offer but the north star of a future Apache 2.0 TTS model, so prefer data that you *already have* and costs you *nothing extra* to send. Additionally, *all* the new data may be used to construct public, Apache 2.0 voicepacks, and if that arrangement doesn't work for you, no need to send any audio.

Last time I asked for horses; now I'm asking for unicorns. As of writing this post, I've currently got a few English & Chinese unicorns, but there is plenty of room in the stable. Find me over on Discord at `rzvzn`: https://discord.gg/QuGxSWBfQy
liked a dataset 11 days ago: DavronSherbaev/uzbekvoice-filtered
liked a model 11 days ago: deepseek-ai/DeepSeek-R1
Organizations
Papers
1
arxiv:2409.04269
spaces
1
Mmlu Lite Uz ✍ • pinned • Sleeping • Review of MMLU-Lite-uz
models
3
murodbek/uzroberta-panx-uz • Token Classification • Updated Aug 9, 2023 • 170
murodbek/xlm-roberta-panx-uz • Token Classification • Updated Apr 13, 2023 • 114
murodbek/uzroberta-sentiment-analysis • Text Classification • Updated Apr 11, 2023 • 29
datasets
5
murodbek/uzlib • Viewer • Updated 16 days ago • 1.86k • 94
murodbek/uzbek-speech-corpus • Viewer • Updated 19 days ago • 108k • 113 • 1
murodbek/Global-MMLU-uz • Viewer • Updated Dec 21, 2024 • 14.3k • 49
murodbek/Global-MMLU-Lite-uz • Viewer • Updated Dec 15, 2024 • 685 • 50
murodbek/uz-text-classification • Viewer • Updated Oct 31, 2023 • 513k • 106 • 2