Gemma 3 INT4 Collection These are converted from the official QAT INT4 Flax checkpoints on Kaggle. Supported formats: AutoAWQ, GGUF • 12 items • Updated 3 days ago • 2
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Paper • 2406.09279 • Published Jun 13, 2024 • 3
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published Mar 3 • 35
Slam Collection All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokeniser, LM, and datasets • 6 items • Updated Feb 25 • 13
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 89
Llasa Collection TTS foundation model compatible with the Llama framework (160k hours of tokenized speech data released) • 11 items • Updated Feb 21 • 16
Facilitating large language model Russian adaptation with Learned Embedding Propagation Paper • 2412.21140 • Published Dec 30, 2024 • 18
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis Paper • 2412.01819 • Published Dec 2, 2024 • 35
Multi-Granularity Prediction for Scene Text Recognition Paper • 2209.03592 • Published Sep 8, 2022 • 2
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 300
Constraint Back-translation Improves Complex Instruction Following of Large Language Models Paper • 2410.24175 • Published Oct 31, 2024 • 18
Language Models can Self-Lengthen to Generate Long Texts Paper • 2410.23933 • Published Oct 31, 2024 • 18
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Feb 26 • 579