
John

tachyon-beep

AI & ML interests

Deep Reinforcement Learning
Emergent Behaviour from Complex LLM Orchestration
Emergent Cooperation in multi-agent Deep Reinforcement Learning

Recent Activity

updated a model about 17 hours ago
tachyon-beep/mixtral-8x7b-r1-20250313
published a model about 17 hours ago
tachyon-beep/mixtral-8x7b-r1-20250313
updated a model 1 day ago
tachyon-beep/mixtral-8x7b-r1-20250312

Organizations

None yet

tachyon-beep's activity

reacted to grimjim's post with 👀 6 months ago
I found this paper to be thought-provoking: "Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling" by Bansal, Hosseini, Agarwal, Tran, and Kazemi.
https://arxiv.org/abs/2408.16737
The direct implication is that smaller models could be used to create cost-effective synthetic datasets. On that note, the Gemma terms of use explicitly claim no rights over outputs generated from those models, so one is free to generate synthetic data from the Gemma line. Meta's Llama 3 license forbids using generated outputs to improve other models. The relevant Mistral, Qwen, and Yi models, released under the Apache 2.0 license, are unrestricted for this purpose.
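
To make that implication concrete, here is a minimal sketch of synthetic-data generation from a smaller model, assuming the Hugging Face transformers library (with torch and accelerate installed). The model id, prompts, and sampling parameters are illustrative assumptions, not taken from the post or the paper, and the paper's filtering/verification of sampled solutions is omitted.

```python
# Minimal sketch: sample many cheap candidate solutions from a small,
# permissively licensed model to build a synthetic dataset. The paper's
# compute-optimal argument is that many samples from a weaker model can
# beat fewer samples from a stronger one at equal cost.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # illustrative choice from the Gemma line
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical seed prompts; a real pipeline would draw these from a task dataset.
prompts = [
    "Solve step by step: a train travels 60 km in 45 minutes. What is its speed in km/h?",
    "Solve step by step: what is the sum of the first 20 positive even numbers?",
]

synthetic_dataset = []
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,          # stochastic sampling, so candidates differ
        temperature=0.8,
        num_return_sequences=4,  # several candidate solutions per prompt
    )
    prompt_len = inputs["input_ids"].shape[1]
    for out in outputs:
        # Keep only the generated continuation, not the echoed prompt.
        completion = tokenizer.decode(out[prompt_len:], skip_special_tokens=True)
        synthetic_dataset.append({"prompt": prompt, "completion": completion})

print(f"Collected {len(synthetic_dataset)} synthetic examples")
```

In the paper's setup, sampled solutions are then filtered (for example, by checking final answers) before being used to train a reasoner; this sketch stops at collection.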