arXiv:2003.04276

How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS

Published on Mar 9, 2020

Abstract

Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.a. the super-net. Since heuristics and hyperparameters substantially vary across different methods, a fair comparison between them can only be achieved by systematically analyzing the influence of these factors. In this paper, we therefore provide a systematic evaluation of the heuristics and hyperparameters that are frequently employed by weight-sharing NAS algorithms. Our analysis uncovers that some commonly-used heuristics for super-net training negatively impact the correlation between super-net and stand-alone performance, and evidences the strong influence of certain hyperparameters and architectural choices. Our code and experiments set a strong and reproducible baseline that future works can build on.
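To make the two central concepts of the abstract concrete, here is a minimal sketch, not the paper's released code: a toy PyTorch super-net in which each layer holds candidate operations whose weights are shared across all sampled sub-networks, trained with uniform path sampling (one common heuristic), followed by the kind of ranking-correlation check (Kendall's tau) used to measure how well super-net performance predicts stand-alone performance. All identifiers (ToySuperNet, MixedLayer, sample_arch) and the accuracy numbers at the end are illustrative assumptions.

```python
# Hypothetical sketch of the weight-sharing NAS setup discussed above.
# Not the authors' implementation; a minimal single-path super-net.
import random
import torch
import torch.nn as nn
from scipy.stats import kendalltau

class MixedLayer(nn.Module):
    """One search-space layer: every candidate op owns weights that are
    shared by all sub-networks selecting that op."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate op 0
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate op 1
            nn.Identity(),                                # candidate op 2 (skip)
        ])

    def forward(self, x, choice):
        return self.ops[choice](x)

class ToySuperNet(nn.Module):
    def __init__(self, channels=16, depth=4, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.layers = nn.ModuleList(MixedLayer(channels) for _ in range(depth))
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x, arch):
        # `arch` is a list of op indices, one per layer: a sampled sub-network.
        x = self.stem(x)
        for layer, choice in zip(self.layers, arch):
            x = torch.relu(layer(x, choice))
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)

def sample_arch(depth=4, num_ops=3):
    # Uniform path sampling: one of the training heuristics the paper analyzes.
    return [random.randrange(num_ops) for _ in range(depth)]

# One training step: the batch updates only the weights on the sampled path.
net = ToySuperNet()
opt = torch.optim.SGD(net.parameters(), lr=0.05, momentum=0.9)
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(net(images, sample_arch()), labels)
opt.zero_grad()
loss.backward()
opt.step()

# Ranking quality: correlation between super-net proxy accuracy and
# stand-alone accuracy for the same architectures (illustrative numbers).
supernet_acc   = [0.62, 0.55, 0.71, 0.48, 0.66]
standalone_acc = [0.91, 0.88, 0.94, 0.85, 0.92]
tau, _ = kendalltau(supernet_acc, standalone_acc)
print(f"Kendall tau between super-net and stand-alone ranking: {tau:.2f}")
```

A high tau means architectures ranked highly by the super-net also train well from scratch, which is exactly the correlation the paper reports some common training heuristics degrade.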
