ash56 committed on
Commit e16e98b · verified · 1 Parent(s): 068a683

Create Readme.md

Files changed (1)
  1. Readme.md +20 -0
Readme.md ADDED
@@ -0,0 +1,20 @@
+ ---
+ license: apache-2.0
+ ---
+
+ This repository contains the model checkpoints for the paper [Less is More for Synthetic Speech Detection in the Wild](https://arxiv.org/abs/2502.05674).
+
+ ## 🔥 Key Features
+ - 3000+ hours of synthetic speech
+ - **Diverse Distribution Shifts**: The dataset spans **7 key distribution shifts**, including:
+   - 📖 **Reading Style**
+   - 🎙️ **Podcast**
+   - 🎥 **YouTube**
+   - 🗣️ **Languages (three different languages)**
+   - 🌎 **Demographics (including variations in age, accent, and gender)**
+ - **Multiple Speech Generation Systems**: Includes data synthesized from various **TTS models** and **vocoders**.
+
+ ## 💡 Why We Built This Dataset
+ > Driven by advances in self-supervised learning for speech, state-of-the-art synthetic speech detectors have achieved low error rates on popular benchmarks such as ASVspoof. However, prior benchmarks do not address the wide range of real-world variability in speech. Are reported error rates realistic in real-world conditions? To assess detector failure modes and robustness under controlled distribution shifts, we introduce **ShiftySpeech**, a benchmark with more than 3000 hours of synthetic speech from 7 domains, 6 TTS systems, 12 vocoders, and 3 languages.
+ >
+ > 🚀 **Stay tuned! More model checkpoints will be available soon.**