Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published 2 days ago • 47
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 4 days ago • 89
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Published 11 days ago • 332
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 9 days ago • 11
Hallucination detection Collection Trained ModernBERT (base and large) for detecting hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated 17 days ago • 15
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published 25 days ago • 25
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 26 days ago • 28
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 25 days ago • 70
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published Feb 10 • 126
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback Paper • 2502.15027 • Published 30 days ago • 7
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published about 1 month ago • 30
Sky-T1-7B Collection A series of 7B models trained with different recipes and the corresponding training data. • 8 items • Updated Feb 14 • 6
Process Reward Models Collection Model and Datasets for Qwen 2.5 Math PRM 7B • 6 items • Updated Feb 18 • 2
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Paper • 2502.10391 • Published Feb 14 • 32
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning Paper • 2502.04689 • Published Feb 7 • 7
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published Feb 6 • 24