arhanv committed on
Commit
f6a3d7e
1 Parent(s): cb93854

created file structure, added dataset

.gitignore ADDED
@@ -0,0 +1,3 @@
+ /.env
+ ._*
+ /dataset/unzipped
app.py ADDED
File without changes
audio_utils.py ADDED
File without changes
dataset/README.md ADDED
@@ -0,0 +1,69 @@
+ # Freesound One-Shot Percussive Sounds Dataset
+
+
+ This dataset contains 10254 one-shot (single event) percussive sounds from Freesound.org and the corresponding timbral analysis. These were used to train the generative model for "Neural Percussive Synthesis Parameterised by High-Level Timbral Features".
+
+ ## Dataset Construction
+
+ To collect this dataset, the following steps were performed:
+
+ * Freesound was queried with words associated with percussive instruments, such as "percussion", "kick", "wood" or "clave". Only sounds with an [effective duration](https://essentia.upf.edu/reference/std_EffectiveDuration.html) of less than one second were selected.
+
+ * This stage retrieved some audio clips that contained multiple sound events or that were of low quality.
+ Therefore, we listened to all the retrieved sounds and manually discarded those presenting either of these characteristics. For this, the [percussive-annotator](https://github.com/xavierfav/percussive-annotator) was used.
+
+ * The sounds were then cut or padded to a length of 1 second, normalized and downsampled to 16 kHz.
+
+ * Finally, the sounds were analyzed with the [AudioCommons Extractor](https://github.com/AudioCommons/ac-audio-extractor) to obtain the AudioCommons timbral descriptors. This information is contained in the 'analysis' folder.
+
+
+ ## Dataset Organisation
+
+ The dataset contains two folders and two files in the root directory:
+
+ * 'one_shot_percussive_sounds' contains the pre-processed audio files. These are named '<freesound_sound_id>.wav'.
+
+ * 'analysis' holds the AudioCommons analysis files for each of the sounds in the dataset. Each analysis is stored as a .json file, named '<freesound_sound_id>_analysis.json', with a key for each of the features extracted.
+
+ * Two more files are present in the root directory of the dataset: this 'README' and the 'licenses.json'. The latter is a '.json' file containing the name, the username of the uploader and the license for each of the sounds in the dataset.
+
+
+ ## Authors and Contact
+
+ This dataset was developed by António Ramires, Pritish Chandna, Xavier Favory, Emilia Gómez and Xavier Serra.
+
+ For any questions related to this dataset, please contact:
+
+ António Ramires
+
+
+
+
+ ## References
+
+ Please cite this paper if you use this dataset:
+
+ ```
+
+ @inproceedings{ramires2020,
+ author = "Antonio Ramires and Pritish Chandna and Xavier Favory and Emilia Gómez and Xavier Serra",
+ title = "Neural Percussive Synthesis Parameterised by High-Level Timbral Features",
+ booktitle = "Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)",
+ year = "2020"
+
+ }
+
+ ```
+
+
+ ## Acknowledgements
+
+ This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 765068 (MIP-Frontiers).
+
+ This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 770376 (TROMPA).
+
+ <img src="https://upload.wikimedia.org/wikipedia/commons/b/b7/Flag_of_Europe.svg" height="64" hspace="20">
+
+
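The cut-or-pad and normalisation steps described under "Dataset Construction" in the README can be sketched roughly as below. This is an illustration only, not the authors' pipeline: the names `fit_length`, `peak_normalize` and `preprocess` are hypothetical, peak normalisation is an assumption (the README only says "normalized"), and resampling to 16 kHz is left out, so the input is assumed to be a list of float samples already at the target rate.

```python
from typing import List

TARGET_SR = 16_000          # sample rate stated in the README
TARGET_LEN = TARGET_SR * 1  # one second of audio, in samples


def fit_length(samples: List[float], length: int = TARGET_LEN) -> List[float]:
    """Truncate, or zero-pad at the end, so the result has exactly `length` samples."""
    if len(samples) >= length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))


def peak_normalize(samples: List[float], peak: float = 1.0) -> List[float]:
    """Scale so the largest absolute sample equals `peak` (assumed normalisation)."""
    max_abs = max((abs(s) for s in samples), default=0.0)
    if max_abs == 0.0:
        return list(samples)  # silent clip: nothing to scale
    gain = peak / max_abs
    return [s * gain for s in samples]


def preprocess(samples: List[float]) -> List[float]:
    """Apply length fitting first, then normalisation (order is an assumption)."""
    return peak_normalize(fit_length(samples))
```

Padding at the end rather than the start keeps the onset of a one-shot sound at the beginning of the clip, which seems the natural choice for percussive material, but the README does not state which side was padded.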
dataset/analysis.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e27faf24d3650e9541fb2f76c0ec7bd2be79672583c45aa17bc2cb830cb50fd8
+ size 5610013
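The three added lines are a Git LFS pointer: the repository stores only the spec version, the SHA-256 object id and the byte size, while the archive itself lives in LFS storage. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper name, and the parser assumes the simple `key value` layout seen in this diff.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, object]:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e27faf24d3650e9541fb2f76c0ec7bd2be79672583c45aa17bc2cb830cb50fd8
size 5610013
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size"])  # prints: sha256 5610013
```

Comparing `size` and the digest against a downloaded file is one way to check that the real archive, and not just the pointer, is present locally.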
dataset/licenses.txt ADDED
The diff for this file is too large to render. See raw diff
 
dataset/one_shot_percussive_sounds.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c45401b3cbdd56606f0d9e5e494a18efbae1ca830f835504dccc316c1934720c
+ size 112614838
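Since `.gitignore` excludes `/dataset/unzipped`, the two archives are presumably meant to be extracted there before use. The sketch below shows one way to do that and to pair each audio file with its analysis file using the naming scheme from the README. All function names are hypothetical, and it is an assumption that the archives unpack into folders named `one_shot_percussive_sounds` and `analysis`.

```python
import json
import zipfile
from pathlib import Path


def extract_archives(dataset_dir: Path, out_dir: Path) -> None:
    """Extract every .zip found in `dataset_dir` into `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for archive in sorted(dataset_dir.glob("*.zip")):
        # Raises zipfile.BadZipFile if only the LFS pointer was checked out.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)


def analysis_path_for(wav_path: Path, analysis_dir: Path) -> Path:
    """Map '<id>.wav' to '<id>_analysis.json', per the README naming scheme."""
    return analysis_dir / f"{wav_path.stem}_analysis.json"


def load_pairs(root: Path):
    """Yield (sound_id, wav_path, analysis_dict) for sounds that have an analysis file."""
    audio_dir = root / "one_shot_percussive_sounds"  # assumed folder name
    analysis_dir = root / "analysis"                 # assumed folder name
    for wav in sorted(audio_dir.glob("*.wav")):
        meta = analysis_path_for(wav, analysis_dir)
        if meta.exists():
            with meta.open(encoding="utf-8") as fh:
                yield wav.stem, wav, json.load(fh)
```

A typical call would be `extract_archives(Path("dataset"), Path("dataset/unzipped"))` followed by iterating over `load_pairs(Path("dataset/unzipped"))`.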
inference.py ADDED
File without changes