- Natural Language Reinforcement Learning
  Paper • 2411.14251 • Published • 31
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Value
  Feature Extraction • 8B • Updated • 2
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Policy
  Feature Extraction • 8B • Updated
- Waterhorse/Llama-3.1-8B-Instruct-NLRL-Breakthrough-Value
  Feature Extraction • 8B • Updated • 3
Bo Liu
Benjamin-eecs
AI & ML interests
Reinforcement Learning, Reasoning, Machine Learning Systems
Recent Activity
authored a paper 2 days ago
SPICE: Self-Play In Corpus Environments Improves Reasoning
upvoted a paper 3 days ago
SPICE: Self-Play In Corpus Environments Improves Reasoning
authored a paper 19 days ago
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution