Aritra Roy Gosthipaty PRO

ariG23498

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

updated a model about 14 hours ago

ariG23498/layerskip-hf-smollm-135m-topv2

updated a Space about 15 hours ago

ariG23498/flux-edit

new activity about 15 hours ago

ariG23498/flux-edit: update seeding

Articles

🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!

Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO)

Timm ❤️ Transformers: Use any timm model with transformers

Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo

Welcome PaliGemma 2 – New vision language models by Google

Faster Text Generation with Self-Speculative Decoding

Hugging Face Welcomes the Qwen2.5-Coder Series

PyTorchModelHubMixin: Bridging the Gap for Custom AI Models on Hugging Face

Hugging Face welcomes the Aya Expanse family of multilingual models

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Llama can now see and run on your device - welcome Llama 3.2

Understanding Vector Quantization in VQ-VAE

Building DoRA Support for Embedding Layers in PEFT

How to communicate in a Pull Request?

The Workflow of PEFT

Announcing New Hugging Face and KerasHub integration

Conditional Probability

What is Probability?

Counting 'n' objects

ariG23498's activity

upvoted an article about 20 hours ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 275

upvoted an article about 21 hours ago

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

1 day ago

• 19

upvoted a collection about 21 hours ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated about 16 hours ago • 112

upvoted an article 2 days ago

Article

Welcome to Inference Providers on the Hub 🔥

3 days ago

• 171

upvoted a collection 3 days ago

Qwen2.5

The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7

upvoted an article 3 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

3 days ago

• 460

upvoted a collection 3 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 4 days ago • 287

upvoted an article 7 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

8 days ago

• 95

upvoted a paper 8 days ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 73

upvoted an article 8 days ago

Article

Mastering Long Contexts in LLMs with KVPress

8 days ago

• 58

upvoted a collection 8 days ago

InternVL2.5-MPO

Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 1 day ago • 26

upvoted an article 9 days ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16, 2024

• 40

upvoted an article 10 days ago

Article

Yay! Organizations can now publish blog Articles

10 days ago

• 30

upvoted a paper 10 days ago

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 44

upvoted a collection 10 days ago

DeepSeek-V3

3 items • Updated 25 days ago • 159

upvoted an article 14 days ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

15 days ago

• 37

upvoted 2 articles 15 days ago

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

15 days ago

• 61

Article

MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era

16 days ago

• 40

upvoted a collection 24 days ago

Cosmos

The collection of Cosmos models • 31 items • Updated 14 days ago • 253

upvoted an article 24 days ago

Article

Announcing NVIDIA Cosmos World Foundation Models

24 days ago

• 23