Commit 3e10a9d by ash56 · verified · 1 parent: eeeba5c

Update Readme.md

Files changed (1)
  1. Readme.md +18 -15
Readme.md CHANGED
@@ -2,21 +2,24 @@
  license: apache-2.0
  ---
 
- This repository contains the model checkpoints related to the paper: [Less is More for Synthetic Speech Detection in the Wild](https://arxiv.org/abs/2502.05674)
 
  Dataset can be downloaded from [here](https://huggingface.co/datasets/ash56/ShiftySpeech/tree/main)
 
- ## 🔥 Key Features
- - 3000+ hours of synthetic speech
- - **Diverse Distribution Shifts**: The dataset spans **7 key distribution shifts**, including:
- - 📖 **Reading Style**
- - 🎙️ **Podcast**
- - 🎥 **YouTube**
- - 🗣️ **Languages (Three different languages)**
- - 🌎 **Demographics (including variations in age, accent, and gender)**
- - **Multiple Speech Generation Systems**: Includes data synthesized from various **TTS models** and **vocoders**.
-
- ## 💡 Why We Built This Dataset
- > Driven by advances in self-supervised learning for speech, state-of-the-art synthetic speech detectors have achieved low error rates on popular benchmarks such as ASVspoof. However, prior benchmarks do not address the wide range of real-world variability in speech. Are reported error rates realistic in real-world conditions? To assess detector failure modes and robustness under controlled distribution shifts, we introduce **ShiftySpeech**, a benchmark with more than 3000 hours of synthetic speech from 7 domains, 6 TTS systems, 12 vocoders, and 3 languages.
- >
- 🚀 **Stay tuned! More model checkpoints will be available soon.**

  license: apache-2.0
  ---
 
+ This repository contains the model checkpoints related to the paper: *[Less is More for Synthetic Speech Detection in the Wild](https://arxiv.org/abs/2502.05674)*
 
  Dataset can be downloaded from [here](https://huggingface.co/datasets/ash56/ShiftySpeech/tree/main)
 
+ Checkpoints can be used in conjunction with the official [GitHub](https://github.com/Ashigarg123/ShiftySpeech/tree/main) repository.
+
+ 🚀 **Stay tuned! Hugging Face support for loading the model will be added soon.**
+
+ If you find the dataset or this resource helpful for your research, please cite our work:
+
+ ```bibtex
+ @misc{garg2025syntheticspeechdetectionwild,
+ title={Less is More for Synthetic Speech Detection in the Wild},
+ author={Ashi Garg and Zexin Cai and Henry Li Xinyuan and Leibny Paola García-Perera and Kevin Duh and Sanjeev Khudanpur and Matthew Wiesner and Nicholas Andrews},
+ year={2025},
+ eprint={2502.05674},
+ archivePrefix={arXiv},
+ primaryClass={eess.AS},
+ url={https://arxiv.org/abs/2502.05674},
+ }
+ ```