bikalnetomi
AI & ML interests
None yet
Organizations
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.4 • Text Generation • 1B • Updated • 3
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.3 • Text Generation • 1B • Updated • 3
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.1 • Text Generation • 1B • Updated • 5
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.0 • Text Generation • 1B • Updated • 4
bikalnetomi/RLHF-PPO-RewardModel-LLama3-1B-v1 • Text Classification • 1B • Updated • 1
bikalnetomi/RLHF-PPO-RewardModel-LLama3-1B-v2 • Updated
bikalnetomi/rlhf-ppo-llama3-1B-Reward-model-lora-bikal • Updated
bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v2 • Text Classification • 3B • Updated • 1
bikalnetomi/RLHF-PPO-RewardModel-LLama3-1B-v1.1 • Text Classification • 1B • Updated • 1
bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1 • Text Generation • Updated • 1
bikalnetomi/rlhf-reward-model-ppo-llama32-3B-lora-r64-val-loss-0.45-bikal • Updated
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r128-bikal-merged • Text Generation • 8B • Updated • 3
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r8-bikal • Updated
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r16-bikal • Updated
bikalnetomi/rlhf-ppo-llama32-3B-Reward-model-lora-r64-bikal • Updated
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r256-bikal • Updated
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r128-bikal • Updated
bikalnetomi/rlhf-ppo-llama31-8B-Reward-model-lora-r64-bikal • Updated