RLHF

Secrets of RLHF in Large Language Models Part I: PPO
Paper • 2307.04964 • Published • 29

Safe RLHF: Safe Reinforcement Learning from Human Feedback
Paper • 2310.12773 • Published • 28

Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 11

Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
Paper • 2310.00212 • Published • 2