Simon Holm

simonholm

AI & ML interests

None yet

Organizations

None yet

simonholm's activity

upvoted an article about 16 hours ago

Article

🦸🏻#9: Does AI Remember? The Role of Memory in Agentic Workflows

5 days ago • 7

liked a model about 20 hours ago

deepseek-ai/deepseek-vl2-small

Image-Text-to-Text • Updated Dec 18, 2024 • 14.1k • 102

liked a Space 1 day ago

189

Chat with DeepSeek-VL2-small

🌍

upvoted an article 2 days ago

Article

Open-source DeepResearch – Freeing our search agents

3 days ago • 701

reacted to m-ric's post with 🚀🔥 2 days ago

Post

7270

Introducing 𝗼𝗽𝗲𝗻 𝗗𝗲𝗲𝗽-𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 by Hugging Face! 💥

OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.

⏱️ So with a team of cracked colleagues, we set ourselves a 24-hour deadline to replicate and open-source Deep Research! ⏱️

➡️ We built open-Deep-Research, an entirely open agent that can: navigate the web autonomously, scroll and search through pages, download and manipulate files, run calculations on data...

We aimed for the best performance: are the agent's answers really rigorous?

On the GAIA benchmark, Deep Research had 67% accuracy on the validation set.
➡️ open Deep Research is at 55% (powered by o1). It is:
- the best pass@1 solution submitted
- the best open solution 💪💪

And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!

Read the blog post 👉 https://huggingface.co/blog/open-deep-research

liked a dataset 2 days ago

cognitivecomputations/dolphin-r1

Viewer • Updated 8 days ago • 814k • 1.79k • 201

upvoted an article 4 days ago

Article

Welcome to Inference Providers on the Hub 🔥

10 days ago • 266

liked a Space 4 days ago

GOT OCR Transformers

📷

Demo of GOT-OCR 2.0's Transformers implementation

upvoted an article 4 days ago

Article

Open-R1: Update #1

and 7 others • 5 days ago • 239

reacted to Kseniase's post with 🤗 5 days ago

Post

4702

8 Free Sources on Reinforcement Learning

With the phenomenon of DeepSeek-R1's top reasoning capabilities, we all saw the true power of RL. At its core, RL is a type of machine learning where a model/agent learns to make decisions by interacting with an environment to maximize a reward. RL learns through trial and error, receiving feedback in the form of rewards or penalties.
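To make the trial-and-error loop described above concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. The toy environment, the function name `run_bandit`, and its parameters are assumptions chosen for illustration; none of them come from the post or from any particular RL library.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (hypothetical example)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's reward
    counts = [0] * n_arms       # number of times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Environment feedback: a noisy reward centered on the arm's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental mean update from the observed reward.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 1.5])
    print("estimated rewards:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

The agent never sees `true_means` directly. It only observes rewards from the arms it pulls, so its estimates improve through interaction, and the small exploration rate keeps it from locking onto an early, unlucky guess.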

Here's a list of free sources that will help you dive into RL and how to use it:

1. "Reinforcement Learning: An Introduction" book by Richard S. Sutton and Andrew G. Barto -> https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

2. Hugging Face Deep Reinforcement Learning Course -> https://huggingface.co/learn/deep-rl-course/unit0/introduction
You'll learn how to train agents in unique environments using the best libraries, share your results, compete in challenges, and earn a certificate.

3. OpenAI Spinning Up in Deep RL -> https://spinningup.openai.com/en/latest/index.html
A comprehensive overview of RL with many useful resources.

4. "Reinforcement Learning and Optimal Control" books, video lectures and course material by Dimitri P. Bertsekas from ASU -> https://web.mit.edu/dimitrib/www/RLbook.html
Explores approximate Dynamic Programming (DP) and RL with key concepts and methods like rollout, tree search, and neural network training for RL and more.

5. RL Course by David Silver (Google DeepMind) -> https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTraZDM-OYHWgPeb
Many recommend these video lectures as a good foundation.

6. RL theory seminars -> https://sites.google.com/view/rltheoryseminars/home?authuser=0
Provides virtual seminars from different experts about RL advancements

7. "Reinforcement Learning Specialization" (a 4-course series on Coursera) -> https://www.coursera.org/learn/fundament

8. Concepts: RLHF, RLAIF, RLEF, RLCF -> https://www.turingpost.com/p/rl-f
Our flashcards explain these four RL approaches, each of which relies on a different kind of feedback.