Any publication?

#5
by sappho192 - opened

Hi, thank you for releasing this model to the public.

I'd like to study what changes were made in this 2.0 version compared to the previous model, but I couldn't find any papers related to this.
Is there any way I can find out in detail what has changed?

Thanks in advance.

NVIDIA org

The biggest difference is in the training dataset, plus slightly different augmentations. The training data for version 2.0 includes non-speech audio samples (such as coughing, laughter, and breathing) to help the model distinguish between speech and non-speech sounds.

You can refer to the MarbleNet paper: https://arxiv.org/pdf/2010.13886
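
To make the dataset change concrete, here is a minimal sketch of mixing non-speech samples into a speech/non-speech training list. This is an illustration only, not the actual 2.0 data pipeline: the file names, the `build_manifest` helper, the `speech` / `non_speech` label strings, and the record fields are all assumptions.

```python
import json

# Hypothetical clip lists; file names and categories are made up for illustration.
speech_clips = ["speech_001.wav", "speech_002.wav"]
non_speech_clips = {
    "coughing": ["cough_001.wav"],
    "laughter": ["laugh_001.wav"],
    "breathing": ["breath_001.wav"],
}


def build_manifest(speech, non_speech):
    """Return one record per clip with a binary speech / non-speech label."""
    records = [{"audio_filepath": path, "label": "speech"} for path in speech]
    for category, paths in non_speech.items():
        for path in paths:
            # Human-made sounds that are not speech go into the negative class,
            # so the classifier sees them as examples of "not speech".
            records.append(
                {"audio_filepath": path, "label": "non_speech", "category": category}
            )
    return records


manifest = build_manifest(speech_clips, non_speech_clips)
for record in manifest:
    print(json.dumps(record))
```

The point of the sketch is only that coughs, laughter, and breathing are labeled as the negative class alongside ordinary background noise, rather than being absent from training.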

@naymaraq
Thank you! That's a nice improvement :)

sappho192 changed discussion status to closed
