Space: Mixture Of Diffusers SDXL Tiling - Mixture of Diffusers implementation for Stable Diffusion XL
Space: QR Code AI Art Generator - Blend QR codes with AI art
Post: New Research Alert: Making Language Models Smaller & Smarter!
Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.
Key findings:
• 77% parameter reduction
• Maintained model capabilities
• Improved generalization
Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
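The details are in the linked report; purely as an illustration of why grouping saves parameters, here is a minimal PyTorch sketch (generic, with arbitrary dimensions and group count, not the paper's architecture) comparing a dense pointwise (1x1) convolution with a grouped one:

```python
import torch
import torch.nn as nn

d_model, groups = 768, 16  # arbitrary illustrative sizes, not the paper's config

# Dense pointwise projection: roughly d_model * d_model weights
dense = nn.Conv1d(d_model, d_model, kernel_size=1)

# Grouped pointwise projection: each group only mixes d_model/groups channels,
# so the weight count drops by a factor of `groups`
grouped = nn.Conv1d(d_model, d_model, kernel_size=1, groups=groups)

x = torch.randn(1, d_model, 128)            # (batch, channels, sequence)
assert dense(x).shape == grouped(x).shape   # same output shape either way

n_dense = sum(p.numel() for p in dense.parameters())
n_grouped = sum(p.numel() for p in grouped.parameters())
print(f"dense: {n_dense:,}  grouped: {n_grouped:,}  "
      f"saving: {1 - n_grouped / n_dense:.1%}")
```

Note that a grouped pointwise convolution only mixes channels within each group, so designs built on it typically add some interleaving or shuffle step to restore cross-group information flow; see the linked report for the authors' actual arrangement.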
Paper: SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (arXiv 2502.02737)
Post: hexgrad/Kokoro-82M got an upgrade! More voices, more languages, pip install kokoro, and still 82M parameters.
GitHub: https://github.com/hexgrad/kokoro
PyPI: https://pypi.org/project/kokoro/
Space: hexgrad/Kokoro-TTS
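Since the package is on PyPI, a minimal usage sketch based on the project's published examples at the time of writing (the KPipeline API, the lang_code value, and the voice name are taken from those examples and may change between releases):

```python
# pip install kokoro soundfile
from kokoro import KPipeline
import soundfile as sf

# 'a' selects American English in the repo's examples
pipeline = KPipeline(lang_code='a')

# voice name from the project's examples; available voices may differ by version
generator = pipeline("Kokoro is an 82M-parameter TTS model.", voice='af_heart')

# the pipeline yields (graphemes, phonemes, audio) chunks
for i, (graphemes, phonemes, audio) in enumerate(generator):
    sf.write(f'kokoro_{i}.wav', audio, 24000)  # Kokoro outputs 24 kHz audio
```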
Article: FineWeb2-C: Help Build Better Language Models in Your Language, by davanstrien and 5 others (Dec 23, 2024)
Post: llama.cpp is 26.8% faster than ollama. I have upgraded both, and using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an apples-to-apples comparison.
Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec
Breakdown by phase:
Model loading:
llama.cpp 241 ms <- 2x faster
ollama 553 ms
Prompt processing:
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms
Token generation:
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time of 7.64 sec
llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing. Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
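The headline percentages follow directly from the ratios in the breakdown; a quick arithmetic check, with all figures copied from the post:

```python
# Figures from the post, as (llama.cpp, ollama) pairs
total_s    = (6.85, 8.69)       # total duration, seconds
load_ms    = (241, 553)         # model loading, milliseconds
prompt_tps = (416.04, 42.17)    # prompt processing, tokens/s
gen_tps    = (137.79, 122.07)   # token generation, tokens/s

# "faster" = how much longer ollama took relative to llama.cpp,
# or the throughput ratio where tokens/s is the metric
print(f"total duration:    {total_s[1] / total_s[0] - 1:.1%}")    # ~26.9% (post rounds to 26.8%)
print(f"model loading:     {load_ms[1] / load_ms[0]:.1f}x")       # ~2.3x  (post says 2x)
print(f"prompt processing: {prompt_tps[0] / prompt_tps[1]:.1f}x") # ~9.9x  (post says 10x)
print(f"token generation:  {gen_tps[0] / gen_tps[1] - 1:.1%}")    # ~12.9% (post says 13%)
```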