ash56/ShiftySpeech


ash56 committed 27 days ago

Commit 3e10a9d · verified · 1 parent: eeeba5c

Update Readme.md

Files changed (1)

Readme.md +18 -15


@@ -2,21 +2,24 @@
 license: apache-2.0
 ---
-This repository contains the model checkpoints related to the paper: [Less is More for Synthetic Speech Detection in the Wild](https://arxiv.org/abs/2502.05674)
 Dataset can be downloaded from [here](https://huggingface.co/datasets/ash56/ShiftySpeech/tree/main)
-## 🔥 Key Features
-- 3000+ hours of synthetic speech
-- **Diverse Distribution Shifts**: The dataset spans **7 key distribution shifts**, including:
-  - 📖 **Reading Style**
-  - 🎙️ **Podcast**
-  - 🎥 **YouTube**
-  - 🗣️ **Languages (Three different languages)**
-  - 🌎 **Demographics (including variations in age, accent, and gender)**
-- **Multiple Speech Generation Systems**: Includes data synthesized from various **TTS models** and **vocoders**.
-## 💡 Why We Built This Dataset
-> Driven by advances in self-supervised learning for speech, state-of-the-art synthetic speech detectors have achieved low error rates on popular benchmarks such as ASVspoof. However, prior benchmarks do not address the wide range of real-world variability in speech. Are reported error rates realistic in real-world conditions? To assess detector failure modes and robustness under controlled distribution shifts, we introduce **ShiftySpeech**, a benchmark with more than 3000 hours of synthetic speech from 7 domains, 6 TTS systems, 12 vocoders, and 3 languages.
->
-🚀 **Stay tuned! More model checkpoints will be available soon.**

 license: apache-2.0
 ---
+This repository contains the model checkpoints related to the paper: *[Less is More for Synthetic Speech Detection in the Wild](https://arxiv.org/abs/2502.05674)*
 Dataset can be downloaded from [here](https://huggingface.co/datasets/ash56/ShiftySpeech/tree/main)
+The checkpoints are intended to be used alongside the official [GitHub](https://github.com/Ashigarg123/ShiftySpeech/tree/main) repository.
+🚀 **Stay tuned! Hugging Face support for loading the model will be added soon.**
+If you find the dataset or this resource helpful for your research, please cite our work:
+```bibtex
+@misc{garg2025syntheticspeechdetectionwild,
+      title={Less is More for Synthetic Speech Detection in the Wild},
+      author={Ashi Garg and Zexin Cai and Henry Li Xinyuan and Leibny Paola García-Perera and Kevin Duh and Sanjeev Khudanpur and Matthew Wiesner and Nicholas Andrews},
+      year={2025},
+      eprint={2502.05674},
+      archivePrefix={arXiv},
+      primaryClass={eess.AS},
+      url={https://arxiv.org/abs/2502.05674},
+}
+```