created file structure, added dataset
- .gitignore +3 -0
- app.py +0 -0
- audio_utils.py +0 -0
- dataset/README.md +69 -0
- dataset/analysis.zip +3 -0
- dataset/licenses.txt +0 -0
- dataset/one_shot_percussive_sounds.zip +3 -0
- inference.py +0 -0
.gitignore
ADDED
@@ -0,0 +1,3 @@
+/.env
+._*
+/dataset/unzipped
app.py
ADDED
File without changes
audio_utils.py
ADDED
File without changes
dataset/README.md
ADDED
@@ -0,0 +1,69 @@
+# Freesound One-Shot Percussive Sounds Dataset
+
+
+This dataset contains 10254 one-shot (single event) percussive sounds from Freesound.org and the corresponding timbral analysis. These were used to train the generative model for "Neural Percussive Synthesis Parameterised by High-Level Timbral Features".
+
+## Dataset Construction
+
+To collect this dataset, the following steps were performed:
+
+* Freesound was queried with words associated with percussive instruments, such as "percussion", "kick", "wood" or "clave". Only sounds with less than one second of [effective duration](https://essentia.upf.edu/reference/std_EffectiveDuration.html) were selected.
+
+* This stage retrieved some audio clips that contained multiple sound events or that were of low quality.
+Therefore, we listened to all the retrieved sounds and manually discarded the sounds presenting one of these characteristics. For this, the [percussive-annotator](https://github.com/xavierfav/percussive-annotator) was used.
+
+* The sounds were then cut or padded to have 1-second length, normalized and downsampled to 16kHz.
+
+* Finally, the sounds were analyzed with the [AudioCommons Extractor](https://github.com/AudioCommons/ac-audio-extractor), to obtain the AudioCommons timbral descriptors. This information is contained in the 'analysis' folder.
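The cut-or-pad and normalisation step described above can be sketched as follows. This is only an illustration, not the authors' preprocessing code: the function name `fix_length_and_normalise` is hypothetical, and zero-padding at the end and peak normalisation are assumptions, since the README only says the sounds were "cut or padded" and "normalized". Resampling to 16 kHz is left out because it needs a proper anti-aliasing resampler.

```python
import numpy as np


def fix_length_and_normalise(samples, sample_rate, duration_s=1.0, peak=1.0):
    """Cut or zero-pad a mono signal to `duration_s` seconds, then peak-normalise.

    Assumptions (not stated in the README): padding is appended at the end,
    and "normalized" means scaling so the largest absolute sample equals `peak`.
    """
    samples = np.asarray(samples, dtype=np.float64)
    target_len = int(round(duration_s * sample_rate))

    if len(samples) >= target_len:
        fixed = samples[:target_len]  # cut to the target length
    else:
        fixed = np.pad(samples, (0, target_len - len(samples)))  # zero-pad the tail

    max_abs = np.max(np.abs(fixed)) if target_len > 0 else 0.0
    if max_abs > 0:  # leave pure silence untouched to avoid dividing by zero
        fixed = fixed * (peak / max_abs)
    return fixed


# Usage: a short decaying burst at 16 kHz becomes exactly one second long.
rate = 16000
burst = np.exp(-np.linspace(0.0, 8.0, 4000)) * 0.3
one_second = fix_length_and_normalise(burst, rate)
```

The function takes the sample rate as a parameter, so it works whether the length is fixed before or after downsampling.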
+
+
+## Dataset Organisation
+
+The dataset contains two folders and two files in the root directory:
+
+* 'one_shot_percussive_sounds' encloses the pre-processed audio files. These are named '<freesound_sound_id>.wav'
+
+* 'analysis' holds the AudioCommons analysis files for each of the sounds in the dataset. This analysis is stored as a .json file, named '<freesound_sound_id>_analysis.json', with a key for each of the features extracted.
+
+* Two more files are present in the root directory of the dataset: this 'README' and the 'licenses.json'. The latter one is a '.json' file containing the name, the username of the uploader and the license for each of the sounds in the dataset.
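Given the naming scheme above, each audio file can be paired with its analysis file through the shared Freesound id. The sketch below assumes both archives have been unzipped into separate folders. The function name `load_dataset_index` and the returned dictionary layout are hypothetical, and no particular feature keys are assumed inside the JSON files.

```python
import json
from pathlib import Path


def load_dataset_index(audio_dir, analysis_dir):
    """Map each Freesound id to its .wav path and its parsed analysis dict."""
    audio_dir, analysis_dir = Path(audio_dir), Path(analysis_dir)
    index = {}
    for wav_path in sorted(audio_dir.glob("*.wav")):
        sound_id = wav_path.stem  # '<freesound_sound_id>.wav' -> id
        analysis_path = analysis_dir / f"{sound_id}_analysis.json"
        if not analysis_path.exists():
            continue  # skip sounds that have no matching analysis file
        with analysis_path.open(encoding="utf-8") as fh:
            features = json.load(fh)  # one key per extracted feature
        index[sound_id] = {"audio": wav_path, "features": features}
    return index
```

Sounds without a matching analysis file are skipped rather than raising, which keeps the index usable if only part of the dataset has been extracted.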
+
+
+## Authors and Contact
+
+This dataset was developed by António Ramires, Pritish Chandna, Xavier Favory, Emilia Gómez and Xavier Serra.
+
+For any questions related to this dataset, please contact:
+
+António Ramires
+
+
+
+
+
+
+## References
+
+Please cite this paper if you use this dataset:
+
+```
+
+@inproceedings{ramires2020,
+  author = "Antonio Ramires and Pritish Chandna and Xavier Favory and Emilia Gómez and Xavier Serra",
+  title = "Neural Percussive Synthesis Parameterised by High-Level Timbral Features",
+  booktitle = "Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)",
+  year = "2020"
+
+}
+
+```
+
+
+## Acknowledgements
+
+This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 765068 (MIP-Frontiers).
+
+This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 770376 (TROMPA).
+
+<img src="https://upload.wikimedia.org/wikipedia/commons/b/b7/Flag_of_Europe.svg" height="64" hspace="20">
+
+
dataset/analysis.zip
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e27faf24d3650e9541fb2f76c0ec7bd2be79672583c45aa17bc2cb830cb50fd8
+size 5610013
dataset/licenses.txt
ADDED
The diff for this file is too large to render.
dataset/one_shot_percussive_sounds.zip
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c45401b3cbdd56606f0d9e5e494a18efbae1ca830f835504dccc316c1934720c
+size 112614838
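The two `.zip` entries in this commit are Git LFS pointer files, not the archives themselves: each records a spec version, a SHA-256 object id and a size in bytes. As a rough sketch, a downloaded archive could be checked against its pointer as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository or of the Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")  # e.g. 'sha256:<hex digest>'
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):  # stream in chunks
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from dataset/one_shot_percussive_sounds.zip in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:c45401b3cbdd56606f0d9e5e494a18efbae1ca830f835504dccc316c1934720c
size 112614838
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
```

Calling `matches_pointer("one_shot_percussive_sounds.zip", pointer)` on a fully downloaded archive would then confirm that the file is the real content rather than an unresolved pointer.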
inference.py
ADDED
File without changes