---
title: README
emoji: 🦀
colorFrom: pink
colorTo: gray
sdk: static
pinned: false
---

# [TTSDS Benchmark](https://ttsdsbenchmark.com)

As many recent Text-to-Speech (TTS) models have shown, synthetic audio can be close to real human speech.
However, traditional evaluation methods for TTS systems need an update to keep pace with these new developments.
Our TTSDS benchmark assesses the quality of synthetic speech by considering factors like prosody, speaker identity, and intelligibility.
By comparing the distribution of each factor in synthetic speech with its distribution in both real speech and noise datasets, we can quantify how closely synthetic speech resembles real speech.
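
The idea of scoring a factor against both real speech and noise can be sketched as follows. This is a minimal illustration, not the `ttsds` package API: the function names `wasserstein_1d` and `distribution_score` are hypothetical, each factor is assumed to be reduced to a one-dimensional feature, and the normalization to a 0-100 range is an assumption made for this example. See the paper for the actual definition.

```python
import random


def wasserstein_1d(xs, ys):
    """Empirical 1-D Wasserstein-1 distance between two equal-sized samples.

    For equal-sized samples this is the mean absolute difference between
    the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def distribution_score(synthetic, real, noise):
    """Hypothetical score in [0, 100] (assumed normalization, for illustration).

    100 means the synthetic feature distribution matches the real one;
    0 means it matches the noise one.
    """
    d_real = wasserstein_1d(synthetic, real)
    d_noise = wasserstein_1d(synthetic, noise)
    if d_real + d_noise == 0:
        raise ValueError("real and noise samples are indistinguishable")
    return 100.0 * (d_noise / (d_real + d_noise))


# Toy data standing in for one feature (e.g. a prosody statistic).
rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(1000)]
noise = [rng.uniform(-10.0, 10.0) for _ in range(1000)]
good_tts = [rng.gauss(0.1, 1.0) for _ in range(1000)]  # close to real speech
poor_tts = [rng.gauss(5.0, 3.0) for _ in range(1000)]  # far from real speech

print(f"good system: {distribution_score(good_tts, real, noise):.1f}")
print(f"poor system: {distribution_score(poor_tts, real, noise):.1f}")
```

Under these assumptions, a system whose feature distribution sits closer to real speech than to noise scores higher, which makes scores comparable across factors with different units.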

## More information
More details can be found in our paper [*TTSDS -- Text-to-Speech Distribution Score*](https://arxiv.org/abs/2407.12707).

## Reproducibility
To reproduce our results, see the [ttsds repository](https://github.com/ttsds/ttsds).

## Citation

```
@misc{minixhofer2024ttsds,
      title={TTSDS -- Text-to-Speech Distribution Score}, 
      author={Christoph Minixhofer and Ondřej Klejch and Peter Bell},
      year={2024},
      eprint={2407.12707},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2407.12707}, 
}
```