Stefan Schweter PRO

stefan-it

AI & ML interests

Flair Library 💕, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models

Recent Activity

Organizations

Bayerische Staatsbibliothek; flair; Flax Community; dumitrescustefan-org; GermanT5; BigScience: LMs for Historical Texts; BigLAM: BigScience Libraries, Archives and Museums; Universal NER; Libre Euro Lingua-Alliance; Lang UK; BabyLM Challenge; hmByT5; hmByT5 Preliminary; Blog-explorers; German Wikipedia LMs; hmBERT; hmTEAMS; HIPE; hmBERT Tiny; hmBERT 64k; LSV @ Saarland University; GERMATRON; PleIAs; German LLM Tokenizers; Social Post Explorers; Occiglot; GERTuraX; Stefmal; Hugging Face Discord Community; ScaDS.AI German LLM; ENGEBA; Nerdy Face; TensorFlow Model Garden LMs

stefan-it's activity

reacted to clem's post with 🚀 4 days ago
Post
4377
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
New activity in stefan-it/neobert-ner-conll03 12 days ago

Update model.py

3
#1 opened 15 days ago by
KoichiYasuoka
reacted to csabakecskemeti's post with 🔥 15 days ago
Post
1942
-UPDATED-
4bit inference is working! The blogpost is updated with code snippet and requirements.txt
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
-UPDATED-
I've played around with an MI100 and ROCm and collected my experience in a blogpost:
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
Unfortunately, I could not get inference or training to work with a model loaded in 8-bit, or use BnB, but I did everything else and documented my findings.
  • 4 replies
replied to their post 15 days ago

Let's see if BERT5urk can make it into @merve's weekly recap of open AI 🤗

posted an update 15 days ago
Post
866
🇹🇷 I'm very happy to finally announce my new Turkish LM called "BERT5urk":

stefan-it/bert5urk

It is a 1.42B-parameter T5-based model, trained with the UL2 pretraining objective on the Turkish part of the awesome HuggingFaceFW/fineweb-2 dataset.

Feel free to check it out!
  • 1 reply
New activity in stefan-it/bert5urk 15 days ago