Manish Kumar Pandey

Manish-GenAI

AI & ML interests

#GraphML, #GeometricDL, #3DComputerVision, #DiffusionModels, #GANs, #GenerativeAI, #ComputerVision, #ML, #RL, #LLM, #MultiModalFusion, #GenerativeFlowNetworks

Manish-GenAI's activity

liked a model about 14 hours ago

rasbt/llama-3.2-from-scratch

Updated about 10 hours ago • 75

upvoted a collection 3 days ago

March 2025 - Open releases from the Chinese community

Collection

30 items • Updated 4 days ago • 11

upvoted a paper 18 days ago

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

Paper • 2503.01935 • Published 29 days ago • 25

liked a Space 21 days ago

smolagents LLM leaderboard 🏆

Space • A leaderboard for LLMs powering smolagents • 116

upvoted a paper 21 days ago

TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Paper • 2503.04872 • Published 26 days ago • 14

liked a model about 2 months ago

allenai/Llama-3.1-Tulu-3.1-8B

Text Generation • Updated Feb 10 • 3.48k • 30

upvoted a paper about 2 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 215

upvoted an article about 2 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.2k

reacted to Kseniase's post with ❤️ about 2 months ago

Post • 4936

8 Free Sources on Reinforcement Learning

With DeepSeek-R1's top reasoning capabilities, we all saw the true power of RL. At its core, RL is a type of machine learning in which a model or agent learns to make decisions by interacting with an environment so as to maximize a reward. The agent learns through trial and error, receiving feedback in the form of rewards or penalties.
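To make the trial-and-error loop concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, one of the simplest RL settings. It is an illustration only, not taken from any of the sources listed in this post: the function name `run_bandit`, the reward probabilities in `true_means`, and the values of `steps`, `epsilon`, and `seed` are all assumptions chosen for the example.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical per-arm probabilities of receiving reward 1.
    Returns the agent's reward estimates, pull counts, and total reward.
    """
    rng = random.Random(seed)          # local RNG so runs are reproducible
    n_arms = len(true_means)
    estimates = [0.0] * n_arms         # running average reward per arm
    counts = [0] * n_arms              # how often each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        # Trial: explore a random arm with probability epsilon,
        # otherwise exploit the arm with the highest current estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Feedback: the environment returns a reward of 1 or 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Learning: incremental update of the sample mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
    print("total reward:", total)
```

The agent is never told which arm is best. It only sees rewards, yet over many steps its estimates should drift toward the true probabilities and most pulls should go to the best arm. Full RL problems add states and delayed rewards on top of this same interact, observe, update loop.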

Here's a list of free sources that will help you dive into RL and how to use it:

1. "Reinforcement Learning: An Introduction" book by Richard S. Sutton and Andrew G. Barto -> https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

2. Hugging Face Deep Reinforcement Learning Course -> https://huggingface.co/learn/deep-rl-course/unit0/introduction
You'll learn how to train agents in unique environments using the best libraries, share your results, compete in challenges, and earn a certificate.

3. OpenAI Spinning Up in Deep RL -> https://spinningup.openai.com/en/latest/index.html
A comprehensive overview of RL with many useful resources.

4. "Reinforcement Learning and Optimal Control" books, video lectures and course material by Dimitri P. Bertsekas from ASU -> https://web.mit.edu/dimitrib/www/RLbook.html
Explores approximate Dynamic Programming (DP) and RL, covering key concepts and methods such as rollout, tree search, and neural network training for RL.

5. RL Course by David Silver (Google DeepMind) -> https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTraZDM-OYHWgPeb
Many recommend these video lectures as a good foundation.

6. RL theory seminars -> https://sites.google.com/view/rltheoryseminars/home?authuser=0
Provides virtual seminars by different experts on RL advancements.

7. "Reinforcement Learning Specialization" (a 4-course series on Coursera) -> https://www.coursera.org/learn/fundament

8. Concepts: RLHF, RLAIF, RLEF, RLCF -> https://www.turingpost.com/p/rl-f
Our flashcards explain these four RL approaches, which differ in the kind of feedback they use.