VGS

company

https://github.com/kaiwenw/value-guided-search

Recent Activity

jpzhou01

authored a paper about 1 month ago

Value-Guided Search for Efficient Chain-of-Thought Reasoning

Paper • 2505.17373 • Published May 23 • 5

kaiwenw

updated a dataset 2 months ago

VGS-AI/OpenR1-VM

Viewer • Updated Jun 25 • 44.5k • 269

kaiwenw

in VGS-AI/OpenR1-Cleaned 3 months ago

Add link to GitHub repository and example usage

#2 opened 3 months ago by

kaiwenw

in VGS-AI/OpenR1-VM 3 months ago

Add link to codebase

#3 opened 3 months ago by

kaiwenw

in VGS-AI/DeepSeek-VM-1.5B 3 months ago

Improve model card for Value-Guided Search

#2 opened 3 months ago by

Improve model card: add pipeline tag, link to paper and code

#1 opened 3 months ago by

kaiwenw

in VGS-AI/OpenR1-VM 3 months ago

Add paper link, task category and license

#2 opened 3 months ago by

kaiwenw

in VGS-AI/OpenR1-Cleaned 3 months ago

Add paper link, task category, and related resources

#1 opened 3 months ago by

kaiwenw

updated a model 3 months ago

VGS-AI/DeepSeek-VM-1.5B

Text Generation • 2B • Updated Jun 4 • 496

kaiwenw

published a model 3 months ago

VGS-AI/DeepSeek-VM-1.5B

Text Generation • 2B • Updated Jun 4 • 496

kaiwenw

published a dataset 3 months ago

VGS-AI/OpenR1-VM

Viewer • Updated Jun 25 • 44.5k • 269

kaiwenw

updated a dataset 3 months ago

VGS-AI/OpenR1-Cleaned

Viewer • Updated Jun 5 • 49.4k • 63

kaiwenw

published a dataset 3 months ago

VGS-AI/OpenR1-Cleaned

Viewer • Updated Jun 5 • 49.4k • 63

kaiwenw

authored 3 papers about 1 year ago

Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR

Paper • 2302.03201 • Published Feb 7, 2023

Switching the Loss Reduces the Cost in Batch Reinforcement Learning

Paper • 2403.05385 • Published Mar 8, 2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Paper • 2407.15762 • Published Jul 22, 2024 • 10

jpzhou01

authored a paper about 2 years ago

Unsupervised Out-of-Distribution Detection with Diffusion Inpainting

Paper • 2302.10326 • Published Feb 20, 2023