13 7

haiy bai

Warsun

AI & ML interests

AGI

Recent Activity

upvoted a paper 26 days ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

liked a model 26 days ago

BAAI/bge-m3

liked a model 26 days ago

simplescaling/s1-32B

Organizations

None yet

Warsun's activity

upvoted a paper 26 days ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published 29 days ago • 184

liked 7 models 26 days ago

BAAI/bge-m3

Sentence Similarity • Updated Jul 3, 2024 • 2.91M • 1.84k

simplescaling/s1-32B

Text Generation • Updated 16 days ago • 14.3k • 288

Alpha-VLLM/Lumina-Image-2.0

Text-to-Image • Updated about 7 hours ago • 37.7k • 282

ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4

Reinforcement Learning • Updated 29 days ago • 85.9k • 763

nomic-ai/nomic-embed-text-v2-moe

Sentence Similarity • Updated 3 days ago • 172k • 296

microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 24 days ago • 9.58k • 1.16k

Zyphra/Zonos-v0.1-hybrid

Text-to-Speech • Updated 27 days ago • 58.4k • 1.04k

upvoted 12 papers 26 days ago

TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

Paper • 2501.16937 • Published Jan 28 • 5

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

Paper • 2501.15747 • Published Jan 27 • 7

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Paper • 2501.16764 • Published Jan 28 • 22

Open Problems in Mechanistic Interpretability

Paper • 2501.16496 • Published Jan 27 • 19

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Paper • 2501.16372 • Published Jan 23 • 9

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published Jan 28 • 26

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

Paper • 2501.17433 • Published Jan 29 • 9

Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks

Paper • 2501.15891 • Published Jan 27 • 14

Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

Paper • 2501.17749 • Published Jan 29 • 13

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Paper • 2501.15654 • Published Jan 26 • 13

Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts

Paper • 2501.14334 • Published Jan 24 • 20

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 33