Peiyong Wang PRO

Addwater

AI & ML interests

Quantum Computing, AI

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published 5 days ago • 81

upvoted a paper 9 days ago

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published 11 days ago • 122

upvoted a paper 14 days ago

Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model

Paper • 2508.13009 • Published 15 days ago • 22

upvoted a paper 15 days ago

Next Visual Granularity Generation

Paper • 2508.12811 • Published 15 days ago • 47

upvoted 4 papers 21 days ago

Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models

Paper • 2508.09138 • Published 21 days ago • 36

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Paper • 2508.06426 • Published 25 days ago • 10

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published 22 days ago • 43

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 23 days ago • 88

upvoted 2 papers 23 days ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published 27 days ago • 123

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 26 days ago • 168

upvoted a paper 29 days ago

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1 • 62

upvoted 9 papers about 1 month ago

On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective

Paper • 2507.23632 • Published Jul 31 • 6

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 124

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 147

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published Jul 28 • 31

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21 • 61

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 294

Pixels, Patterns, but No Poetry: To See The World like Humans

Paper • 2507.16863 • Published Jul 21 • 68

Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory

Paper • 2507.16713 • Published Jul 22 • 21

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 62