xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference Paper • 2503.13427 • Published 1 day ago • 1
UniBERTs: Adversarial Training for Language-Universal Representations Paper • 2503.12608 • Published 2 days ago • 1
Do Construction Distributions Shape Formal Language Learning In German BabyLMs? Paper • 2503.11593 • Published 4 days ago • 1
Post • We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
HyperZ·Z·W Operator Connects Slow-Fast Networks for Full Context Interaction Paper • 2401.17948 • Published Jan 31, 2024 • 4
Modern Models, Medieval Texts: A POS Tagging Study of Old Occitan Paper • 2503.07827 • Published 8 days ago • 1
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published 11 days ago • 72 • 9
Post • UPDATE: 4-bit inference is working! The blog post now includes a code snippet and requirements.txt: https://devquasar.com/uncategorized/all-about-amd-and-rocm/ I've played around with an MI100 and ROCm and collected my experience in a blog post: https://devquasar.com/uncategorized/all-about-amd-and-rocm/ Unfortunately, I could not get inference or training working with a model loaded in 8-bit or with BnB, but I did everything else and documented my findings.
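The author's exact snippet lives in the linked blog post; as a rough illustration only, here is a minimal sketch of the usual transformers + bitsandbytes 4-bit loading path. The model ID is a placeholder, and whether a stock bitsandbytes build runs on a given ROCm setup is exactly the kind of thing the post documents.

```python
# Minimal sketch: 4-bit inference with transformers + bitsandbytes.
# Placeholder model ID; see the linked blog post for the author's actual
# snippet and requirements.txt. A ROCm-enabled bitsandbytes build may be
# required on AMD hardware. Needs: torch, transformers, accelerate, bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder, not from the post

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits at load time
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                     # place layers on the available GPU
)

inputs = tokenizer("Hello from an MI100:", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```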
Post • 🇹🇷 I'm very happy to finally announce my new Turkish LM called "BERT5urk": stefan-it/bert5urk. It is a 1.42B-parameter T5-based model, trained with the UL2 pretraining objective on the Turkish part of the awesome HuggingFaceFW/fineweb-2 dataset. Feel free to check it out!
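As a hedged sketch of how one might try the model, assuming the repo exposes a standard T5/UL2-style seq2seq checkpoint loadable via transformers (the prompt below is illustrative; UL2 checkpoints sometimes expect mode prefixes such as "[NLU]", so the model card is authoritative):

```python
# Minimal sketch: loading BERT5urk for T5-style span infilling.
# Assumes a standard seq2seq checkpoint; check the model card for the
# expected prompt format (e.g. UL2 mode tokens).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "stefan-it/bert5urk"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative Turkish prompt: the model predicts the masked <extra_id_0> span.
text = "Ankara, Türkiye'nin <extra_id_0> şehridir."
inputs = tokenizer(text, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```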