Article: KV Caching Explained: Optimizing Transformer Inference Efficiency. By not-lain, published 1 day ago.

Papers:
- TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models (arXiv:2501.16937, published 3 days ago)
- OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas (arXiv:2501.15427, published 5 days ago)
- DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation (arXiv:2501.16764, published 3 days ago)
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arXiv:2501.17161, published 2 days ago)
- Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation (arXiv:2501.15907, published 4 days ago)