-
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Paper • 2402.01739 • Published • 27 -
Rethinking Interpretability in the Era of Large Language Models
Paper • 2402.01761 • Published • 23 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 115 -
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Paper • 2402.07827 • Published • 47
Collections
Discover the best community collections!
Collections including paper arxiv:2402.16840
-
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 14 -
distilbert/distilbert-base-uncased-finetuned-sst-2-english
Text Classification • Updated • 6.68M • • 707 -
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Paper • 2401.14112 • Published • 20 -
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
Paper • 2401.04092 • Published • 21
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 147 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 56 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 118 -
stabilityai/stable-video-diffusion-img2vid-xt
Image-to-Video • Updated • 663k • 2.94k -
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
Paper • 2311.13384 • Published • 52 -
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Paper • 2311.12454 • Published • 31
-
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming
Paper • 2311.07689 • Published • 8 -
DiLoCo: Distributed Low-Communication Training of Language Models
Paper • 2311.08105 • Published • 15 -
SparQ Attention: Bandwidth-Efficient LLM Inference
Paper • 2312.04985 • Published • 39 -
Aligning Large Language Models with Counterfactual DPO
Paper • 2401.09566 • Published • 2