Collections

Collections including paper arxiv:2404.14394

Papers - University - MIT

One-step Diffusion with Distribution Matching Distillation

Paper • 2311.18828 • Published Nov 30, 2023 • 3
The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 79
Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1, 2024 • 13
Locating and Editing Factual Associations in GPT

Paper • 2202.05262 • Published Feb 10, 2022 • 1

Papers - SAM - Segment Anything Model

Prompt me a Dataset: An investigation of text-image prompting for historical image dataset creation using foundation models

Paper • 2309.01674 • Published Sep 4, 2023 • 2
Segment Anything

Paper • 2304.02643 • Published Apr 5, 2023 • 4
EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Paper • 2403.18118 • Published Mar 26, 2024 • 12
A Multimodal Automated Interpretability Agent

Paper • 2404.14394 • Published Apr 22, 2024 • 21

Papers - Image - Clip

Demystifying CLIP Data

Paper • 2309.16671 • Published Sep 28, 2023 • 20
Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28, 2024 • 12
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1, 2024 • 22
On the Scalability of Diffusion-based Text-to-Image Generation

Paper • 2404.02883 • Published Apr 3, 2024 • 19

Papers - Image - Dino

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology

Paper • 2203.00585 • Published Mar 1, 2022 • 2
Emerging Properties in Self-Supervised Vision Transformers

Paper • 2104.14294 • Published Apr 29, 2021 • 3
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

Paper • 2404.06903 • Published Apr 10, 2024 • 19
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11, 2024 • 32

Papers - ResNet

Wide Residual Networks

Paper • 1605.07146 • Published May 23, 2016 • 2
Characterizing signal propagation to close the performance gap in unnormalized ResNets

Paper • 2101.08692 • Published Jan 21, 2021 • 2
Pareto-Optimal Quantized ResNet Is Mostly 4-bit

Paper • 2105.03536 • Published May 7, 2021 • 2
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

Paper • 2106.01548 • Published Jun 3, 2021 • 2

Multi-agent LLMs

LLM Augmented LLMs: Expanding Capabilities through Composition

Paper • 2401.02412 • Published Jan 4, 2024 • 37
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18, 2024 • 18
A Multimodal Automated Interpretability Agent

Paper • 2404.14394 • Published Apr 22, 2024 • 21

DocGraphLM: Documental Graph Language Model for Information Extraction

Paper • 2401.02823 • Published Jan 5, 2024 • 36
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 64
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 180
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

Paper • 2309.01131 • Published Sep 3, 2023 • 1
