
sriram vasudevan

sriramvasu

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago
deepseek-ai/DeepSeek-R1

Organizations

None yet

sriramvasu's activity

reacted to merve's post with 🚀 5 days ago
smolagents can see 🔥
we just shipped vision support to smolagents 🤗 agentic computers FTW

you can now:
💻 let the agent get images dynamically (e.g. agentic web browser)
📑 pass images at the init of the agent (e.g. chatting with documents, filling forms automatically etc)
with just a few lines of code changed! 🤯
you can use transformers models locally (like Qwen2VL) OR plug in your favorite multimodal inference provider (gpt-4o, Anthropic & co) 🤠

read our blog http://hf.co/blog/smolagents-can-see
reacted to chansung's post with 👍 5 days ago
A simple summary of Evolving Deeper LLM Thinking (Google DeepMind)

The process starts by posing a question.
1) The LLM generates initial responses.
2) These generated responses are evaluated according to specific criteria (program-based checker).
3) The LLM critiques the evaluated results.
4) The LLM refines the responses based on the evaluation, critique, and original responses.

The refined response is then fed back into step 2). If it meets the criteria, the process ends. Otherwise, the algorithm generates more responses based on the refined ones (with some being discarded, some remaining, and some responses potentially being merged).
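
A minimal sketch of the loop described above, to make the control flow concrete. It is not the paper's implementation: every name here (`evolve`, `generate`, `check`, `critique`, `refine`, and the `toy_*` helpers) is hypothetical, the LLM calls are replaced by toy functions over integers so it runs without any API, and the merging of responses is left out (only duplicates are discarded).

```python
from typing import Callable, List, Optional, Tuple


def evolve(
    question: str,
    generate: Callable[[str, int], List[int]],
    check: Callable[[int], Tuple[bool, str]],
    critique: Callable[[int, str], str],
    refine: Callable[[int, str, str], int],
    population: int = 4,
    max_rounds: int = 20,
) -> Tuple[Optional[int], int]:
    """Return (accepted candidate or None, number of evaluation rounds used)."""
    candidates = generate(question, population)  # step 1: initial responses
    for round_no in range(1, max_rounds + 1):
        # Step 2: the program-based checker evaluates every candidate.
        evaluated = [(cand, *check(cand)) for cand in candidates]
        for cand, passed, _ in evaluated:
            if passed:
                return cand, round_no  # criteria met, the process ends
        refined = []
        for cand, _, feedback in evaluated:
            note = critique(cand, feedback)  # step 3: critique the evaluation
            refined.append(refine(cand, feedback, note))  # step 4: refine
        # Discard duplicates; the survivors are fed back into step 2.
        candidates = list(dict.fromkeys(refined))
    return None, max_rounds


# Toy stand-ins (assumptions for illustration, not real LLM calls).
TARGET = 7  # the "correct answer", known only to the checker


def toy_generate(question: str, n: int) -> List[int]:
    return [0, 3, 12, 20][:n]


def toy_check(cand: int) -> Tuple[bool, str]:
    if cand == TARGET:
        return True, "ok"
    return False, ("too low" if cand < TARGET else "too high")


def toy_critique(cand: int, feedback: str) -> str:
    return "increase" if feedback == "too low" else "decrease"


def toy_refine(cand: int, feedback: str, note: str) -> int:
    return cand + 1 if note == "increase" else cand - 1


solution, rounds = evolve(
    "toy question", toy_generate, toy_check, toy_critique, toy_refine
)
print(solution, rounds)  # prints: 7 5
```

In a real setup each of `generate`, `critique`, and `refine` would be an LLM call per candidate per round, which is where the latency concern raised in the post comes from.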

With this process, the method performed very well on complex scheduling problems (travel planning, meeting scheduling, etc.). It's a viable way to find highly effective solutions in specific scenarios.

However, there are two major drawbacks:
🤔 An excessive number of API calls is required. (The cost might not be very high, but the calls add significant latency.)
🤔 The evaluator is program-based. (This limits its use as a general method. It could potentially be reimplemented with an LLM-as-a-judge, but that would add API costs for evaluation.)

https://arxiv.org/abs/2501.09891
upvoted an article about 1 month ago

Fine-Tuning LLMs: Supervised Fine-Tuning and Reward Modelling

By rishiraj • 4