Aritra Roy Gosthipaty PRO

ariG23498

AI & ML interests

Deep Representation Learning

Recent Activity

updated a model about 14 hours ago
ariG23498/layerskip-hf-smollm-135m-topv2
updated a Space about 15 hours ago
ariG23498/flux-edit
new activity about 15 hours ago
ariG23498/flux-edit:update seeding

Organizations

Hugging Face, Google, Notebooks-explorers, PyTorch Image Models, Keras, Hugging Test Lab, Hugging Face Fellows, Probing ViTs, TrystAI, PyImageSearch, Keras Dreambooth Event, Hugging Face OSS Metrics, Blog-explorers, ZeroGPU Explorers, kotol, gg-hf, MLX Community, IBM Granite, Open Generative Fill, Social Post Explorers, Hugging Face Discord Community, nltpt, nltpt-q, qrias, Hugging Face Science, open/ acc, wut?, LLM from Scratch

ariG23498's activity

reacted to burtenshaw's post with 🚀 9 days ago
Post
3482
🚧 Work in Progress! 🚧

👷‍♀️ We're working hard on getting the official agents course ready for the 50,000 students that have signed up.

If you want to contribute to the discussion, I started these community posts. Looking forward to hearing from you:

- smolagents unit in the agents course - agents-course/README#7
- LlamaIndex Unit in the agents course - agents-course/README#6
- LangChain and LangGraph unit in the agents course - agents-course/README#5
- Real world use cases in the agents course - agents-course/README#8


posted an update 11 days ago
Post
1893
Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
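
The reformulation ends in a loss that needs no explicit reward model. A minimal sketch of the standard DPO objective for a single preference pair, assuming the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model are already available. The function name, argument names, and the `beta` value are illustrative and not taken from the blog:

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair, given sequence log-probs."""
    # Implicit rewards: beta * log(pi_theta(y|x) / pi_ref(y|x)).
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as softplus(-margin) for numerical stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is 0 and the loss is log(2).
loss_at_init = dpo_loss(-5.0, -7.0, -5.0, -7.0)
```

The loss drops below log(2) as soon as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference.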
posted an update 14 days ago
reacted to burtenshaw's post with 🚀🔥 15 days ago
Post
39333
We're launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
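
The observe, think, act cycle from the first bullet can be sketched as a loop. This is a toy illustration only: `toy_llm`, `TOOLS`, and the string protocol are hypothetical stand-ins, not the API of smolagents, LangChain, or LlamaIndex.

```python
def toy_llm(observation):
    """Hypothetical stand-in for an LLM: returns (thought, action, argument)."""
    if observation.startswith("task:"):
        expression = observation[len("task:"):].strip()
        return ("I should compute this with the calculator.", "calculator", expression)
    return ("I have the result, so I can answer.", "final_answer", observation)


def calculator(expression):
    """Toy tool: adds integers written as 'a + b + ...'."""
    return str(sum(int(part) for part in expression.split("+")))


TOOLS = {"calculator": calculator}


def run_agent(task, max_steps=5):
    observation = f"task: {task}"
    for _ in range(max_steps):
        thought, action, argument = toy_llm(observation)  # reason about the observation
        if action == "final_answer":
            return argument
        observation = TOOLS[action](argument)  # act, then observe the tool result
    raise RuntimeError("agent did not finish within max_steps")
```

Calling `run_agent("2 + 3")` takes one tool step and then returns the tool's result as the final answer.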
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents
reacted to burtenshaw's post with 🔥 about 2 months ago
Post
2478
Quick update from week 1 of smol course. The community is taking the driving seat and using the material for their own projects. If you want to do the same, join in!

- we have ongoing translation projects in Korean, Vietnamese, Portuguese, and Spanish
- 3 chapters are ready for students, on topics like instruction tuning, preference alignment, and parameter-efficient fine-tuning
- 3 chapters are in progress on evaluation, vision language models, and synthetic data.
- around 780 people have forked the repo to use it for learning, teaching, sharing.

⏭️ Next step is to support people that want to use the course for teaching, content creation, internal knowledge sharing, or anything. If you're into this, drop an issue or PR

REPO: https://buff.ly/3ZCMKX2
discord channel: https://buff.ly/4f9F8jA
posted an update about 2 months ago
reacted to rwightman's post with 🚀 about 2 months ago
Post
1416
There's a new timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored; there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one: just replace torch.optim with timm.optim.

New optimizers include:
* AdafactorBigVision - adafactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification to all of the above, prefix with c as well as cadamw, cnadamw, csgdw, clamb, crmsproptf

I shared some caution comparisons in this model repo: rwightman/timm-optim-caution

For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim

reacted to davidberenstein1957's post with 🚀🧠👍 about 2 months ago
Post
3450
The Data Is Better Together community is set to release the first Apache 2 licensed image preference dataset!

Great work and let's give this a final push :)

@aashish1904 congrats on your month of HF pro. There is more to win during this sprint!

@aashish1904 @AnyaDesdein @davidberenstein1957 @Malalatiana @beta3 @fffiloni @munish0838 @Reza2kn @bbunzeck @Creazycreator @andrei-saceleanu @jafhaponiuk @rca-etl @kf120 @burtenshaw @mmhamdy @grib0ed0v @Doopus @AnyaDes @ttkap @Xceron @Lewox @davanstrien @Azazelle @adirik @Ashish08 @AntonVic @kenantang @sdiazlor @g-ronimo @dennis-rall @prithivMLmods @girtss3 @flozi00 @WaveCut @Taylor658 @Wildminder @Sara9999 @phaelishall @sararob @dvilasuero @pgabrys @plaguss @CDS899 @timajwilliams @rudzinskimaciej @pavel-ai @aggr8 @ignacioct @MouseAI @Leeps @MaksKul @NicolasDmln @Muinez @kusht55 @caiolang @Jakub-Brand24 @loamy @Demijan @eliab96 @Viewegger @JosephCatrambone @p1atdev @mrshu @o639 @Targezed @Aviv-anthonnyolime @thliang01 @Ahmed-Amine @glards @pranaykoppula @nataliaElv @MaPirlet @alvarobartt @gabrielmbmb @zlicastro @Jaydip @Chouettecheveche @lilcheaty @ruyrdiaz @robintema @fdaudens @ggcristian @a-r-r-o-w @pates @joheras @stopsatgreen @bezo97 @chachi902 @iamyann @liamcripwell @dmb23 @korbih @anonymous7743 @akbdx18 @OVAWARE @severo @akontra @lichorosario @lhoestq @SebastianBodza @Vishnou @ameerazam08 @appoose @Mukei @mearco @joaquincabezas @Fizzarolli @thomastraum @igortopolski @OxxoCodes @patrickfleith @asoria @bn22 @sitammeur @Krodolf @bergr7f @Sbxxn @wietsevenema @sugatoray @Iamladi @MikeTrizna @feveromo @mokady @Bolero @prath @Dowwie @kfahn @decodingchris @alili2050 @RahulRaman @yzimmermann @Ameeeee @ecyht2 @MattMC001 @hemanthkumarak @Thegorgibus @akos2 @LawRun @ramithuh @SuperMuel @sjans @peterizsak @mosama @Eyel @mtr3 @cfahlgren1 @legentil @clem @Citaman @Aurelien-Morgan @AntoineBourgois @TotoB12 @Stanmey @osanseviero @multimodalart @maxiw @ariG23498 @ngk89 @femboysLover @dvs @tacohiddink @blanchon @DavidJimenez
reacted to clem's post with 🚀 about 2 months ago
Post
4596
Six predictions for AI in 2025 (and a review of how my 2024 predictions turned out):

- There will be the first major public protest related to AI
- A big company will see its market cap divided by two or more because of AI
- At least 100,000 personal AI robots will be pre-ordered
- China will start to lead the AI race (as a consequence of leading the open-source AI race).
- There will be big breakthroughs in AI for biology and chemistry.
- We will begin to see the economic and employment growth potential of AI, with 15M AI builders on Hugging Face.

How my predictions for 2024 turned out:

- A hyped AI company will go bankrupt or get acquired for a ridiculously low price
✅ (Inflection, AdeptAI,...)

- Open-source LLMs will reach the level of the best closed-source LLMs
✅ with QwQ and dozens of others

- Big breakthroughs in AI for video, time-series, biology and chemistry
✅ for video 🔴 for time-series, biology and chemistry

- We will talk much more about the cost (monetary and environmental) of AI
✅ Monetary 🔴 Environmental (😢)

- A popular media will be mostly AI-generated
✅ with NotebookLM by Google

- 10 million AI builders on Hugging Face leading to no increase of unemployment
🔜 currently 7M AI builders on Hugging Face
reacted to merve's post with 🚀🔥😎 about 2 months ago
Post
2670
small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB VRAM 🫰🏻 also, with gradient accumulation, the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, gradient checkpointing with explanations on how they work https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
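
The numbers in the post imply 4 accumulation steps (16 / 4). A minimal pure-Python sketch of why accumulating micro-batch gradients simulates the larger batch, using a hypothetical one-parameter least-squares model instead of SmolVLM. The helper names and toy data are assumptions made for illustration:

```python
def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y = w * x over (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average gradients over equally sized micro-batches before one optimizer step."""
    assert len(data) % micro_batch_size == 0
    steps = len(data) // micro_batch_size
    total = 0.0
    for i in range(steps):
        micro = data[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient by 1/steps, as training loops
        # commonly do by dividing the loss before the backward pass.
        total += grad_mse(w, micro) / steps
    return total, steps


data = [(float(x), 3.0 * x + 1.0) for x in range(16)]  # 16 toy samples
full = grad_mse(0.5, data)                             # one batch of 16
accum, steps = accumulated_grad(0.5, data, micro_batch_size=4)
```

Here `steps` is 4 and `accum` matches `full` up to floating-point error, while only 4 samples need to be held in memory at a time.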
reacted to merve's post with 🔥 about 2 months ago
Post
2917
Last week we were blessed with open-source models! A recap
merve/nov-29-releases-674ccc255a57baf97b1e2d31

🖼️ Multimodal
> At Hugging Face we released SmolVLM, a performant and efficient smol vision language model 💗
> Show Lab released ShowUI-2B: new vision-language-action model to build GUI/web automation agents 🤖
> Rhymes AI has released the base model of Aria: Aria-Base-64K and Aria-Base-8K with their respective context lengths
> ViDoRe team released ColSmolVLM: A new ColPali-like retrieval model based on SmolVLM
> Dataset: Llava-CoT-o1-Instruct: new dataset labelled using Llava-CoT multimodal reasoning model 📖
> Dataset: LLaVA-CoT-100k dataset used to train Llava-CoT released by creators of Llava-CoT 📕

💬 LLMs
> Qwen team released QwQ-32B-Preview, state-of-the-art open-source reasoning model, broke the internet 🔥
> Alibaba has released Marco-o1, a new open-source reasoning model 💥
> NVIDIA released Hymba 1.5B Base and Instruct, the new state-of-the-art SLMs with hybrid architecture (Mamba + transformer)

⏯️ Image/Video Generation
> Qwen2VL-Flux: new image generation model based on Qwen2VL image encoder, T5 and Flux for generation
> Lightricks released LTX-Video, a new DiT-based video generation model that can generate 24 FPS videos at 768x512 res ⏯️
> Dataset: Image Preferences is a new image generation preference dataset made with DIBT community effort of Argilla 🏷️

Audio
> OuteAI released OuteTTS-0.2-500M, a new multilingual text-to-speech model based on Qwen-2.5-0.5B trained on 5B audio prompt tokens
reacted to thomwolf's post with 🔥 2 months ago
posted an update 3 months ago
reacted to m-ric's post with 🚀 3 months ago
Post
2010
🌟🌎 Cohere releases Aya Expanse 8B & 32B: SOTA multilingual models for 23 languages!

How did they manage to beat top contenders while also adding 23 languages?

🔄 Train on synthetic data:
• Synthetic data has been said to cause model collapse after too much training
• Cohere has introduced "data arbitrage" to prevent this by strategically sampling from a pool of several teacher models instead of one single teacher
• First train a model pool for each group of languages, and employ an internal Reward Model named "Arbiter" to evaluate and select the optimal generation. Then only the best generation is kept as the final completion for each prompt
➡️ This process is particularly effective in the multilingual setting, where no single teacher model performs well in all languages: here "Multilingual Arbitrage" singlehandedly improves win rates of the 8B model vs Gemma-2-9B by 10 points!
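
At its core, the arbitrage step described above is best-of-N selection across teachers. A minimal sketch, assuming hypothetical teacher callables and a toy scoring function standing in for the "Arbiter" reward model. None of these names come from Cohere's code:

```python
def arbitrage(prompt, teachers, reward_fn):
    """Keep the highest-scoring completion among all teachers for one prompt."""
    candidates = [(name, generate(prompt)) for name, generate in teachers.items()]
    return max(candidates, key=lambda c: reward_fn(prompt, c[1]))


# Hypothetical teachers: each one is only strong in a single language.
teachers = {
    "teacher_fr": lambda p: "Bonjour !" if p.startswith("[fr]") else "???",
    "teacher_de": lambda p: "Guten Tag!" if p.startswith("[de]") else "???",
}


def toy_reward(prompt, completion):
    """Toy stand-in for a reward model: penalise the placeholder answer."""
    return 0.0 if completion == "???" else 1.0


prompts = ["[fr] salue", "[de] hallo"]
dataset = {p: arbitrage(p, teachers, toy_reward) for p in prompts}
```

Each prompt ends up paired with the completion from whichever teacher scored best on it, which is the point of not relying on a single teacher.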

🧩 Use model merging: Rather than struggling to find the right mix of data in training a single model for multilingual use, just train language-specific models then merge them!
• Maximize diversity between merged checkpoints by training each on different language families.
• They experimented with fancy techniques (SLERP, TIES, DARE-TIES) but found weighted averaging to be the most consistent!
➡️ Merging had 3x more gains at the high 35B scale vs the 8B scale - consistent with literature findings that merging is more effective at scale
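
Weighted averaging of checkpoints, the merge method the post reports as most consistent, is simple to state. A minimal sketch on toy checkpoints stored as dicts of float lists. The parameter names and weights are made up for illustration, and real merges operate on framework tensors:

```python
def merge_checkpoints(checkpoints, weights):
    """Weighted average of parameter dicts that share the same keys and shapes."""
    if len(checkpoints) != len(weights):
        raise ValueError("need one weight per checkpoint")
    total = sum(weights)
    merged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        merged[name] = [
            sum(w * v for w, v in zip(weights, values)) / total
            for values in columns
        ]
    return merged


# Hypothetical checkpoints fine-tuned on different language families.
romance = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
germanic = {"layer.weight": [3.0, 6.0], "layer.bias": [4.0]}
merged = merge_checkpoints([romance, germanic], weights=[3.0, 1.0])
```

With weights 3:1 the merged parameters sit three quarters of the way toward the first checkpoint.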

⚡️ Great performance: Automatic evaluations on Arena-Hard-Auto dataset:
➡️ Aya Expanse 8B beats models from its weight class such as Gemma 2 9B, Llama 3.1 8B, and the recent Ministral 8B, with win rates ranging from 60.4% to 70.6%
➡️ Aya Expanse 32B outperforms Gemma 2 27B, Mixtral 8x22B, and Llama 3.1 70B (2x its size)
• ⚠️ But this performance eval comes from only one benchmark! Let's wait for Open LLM leaderboard evals.

🔒 CC BY-NC license

Blog post here: https://huggingface.co/blog/aya-expanse
posted an update 3 months ago