Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2411.09009

Paper - Multimodal

Paper related to Multimodal Model - Research for a : Modular, Multimodal, Multi-Stream, Mixture of Expert, Universal Transformer, Matryoshka embedding

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Paper • 2412.15213 • Published Dec 19, 2024 • 26
No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 41
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 135
Autoregressive Video Generation without Vector Quantization

Paper • 2412.14169 • Published Dec 18, 2024 • 14

Interesting Papers

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5, 2024 • 68
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details

Paper • 2411.03047 • Published Nov 5, 2024 • 9
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Paper • 2411.02336 • Published Nov 4, 2024 • 23
GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published Nov 4, 2024 • 20

Papers - Text - Fine-tuning - Loss - CCE - Triton

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Fine-tuning - Memory Reduction Techniques - Text

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Gemma 2 - Fine-tuning

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Mistral - NeMo - Fine-tuning

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Text - Training - Vocabulary Sorting

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Text - Training - Gradient Filtering

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Text - Train - Vocab - Dense Blocks Common Tokens

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Papers - Text - Training - Loss - Cuda - Triton - SRAM

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 47

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs