SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 203
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 56
Post: Everchanging Quest is out! It is an LLM-controlled roguelike in which the LLM receives a markdown representation of the map and must generate a JSON containing the objective to fulfill on the map, along with the necessary objects and their placements. Come test it on the Space: Jofthomas/Everchanging-Quest
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 93
A Dataset and Strong Baselines for Classification of Czech News Texts Paper • 2307.10666 • Published Jul 20, 2023