Collections

Discover the best community collections!

Collections including paper arxiv:1412.6544

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in this space to help you get started with it!

Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4
Convergent Learning: Do different neural networks learn the same representations?

Paper • 1511.07543 • Published Nov 24, 2015 • 2
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

Paper • 1909.11299 • Published Sep 25, 2019 • 2
Model Fusion via Optimal Transport

Paper • 1910.05653 • Published Oct 12, 2019 • 1

Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4

Model Merging Papers

Collection of relevant papers about model merging

Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4
Averaging Weights Leads to Wider Optima and Better Generalization

Paper • 1803.05407 • Published Mar 14, 2018 • 2
Merging Models with Fisher-Weighted Averaging

Paper • 2111.09832 • Published Nov 18, 2021 • 1
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 6

google-bert/bert-base-uncased

Fill-Mask • Updated Feb 19, 2024 • 86.7M • 2.17k
sentence-transformers/embedding-training-data

Updated Sep 11, 2024 • 1.04k • 122
PygmalionAI/pygmalion-6b

Text Generation • Updated Jan 13, 2023 • 4.16k • 745
Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4

Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4
TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • Updated Mar 17, 2024 • 1.13M • 1.18k
