Voice Activity Detection
PyTorch
pyannote
pyannote.audio
pyannote-audio-model
audio
voice
speech
speaker
speaker-diarization
speaker-change-detection
speaker-segmentation
overlapped-speech-detection
resegmentation
speaker-recognition
speaker-verification
speaker-identification
speaker-embedding
wespeaker
Upload folder using huggingface_hub
Files changed:
- README.md +1 -1
- config.yaml +18 -7
- pytorch_model.bin +2 -2
README.md
CHANGED
@@ -20,6 +20,6 @@ tags:
 - speaker-embedding
 - PyTorch
 - wespeaker
-licence:
+licence: mit
 ---
 This is the model card of a pyannote model that has been pushed on the Hub. This model card has been automatically generated.
config.yaml
CHANGED
@@ -1,10 +1,21 @@
 model:
-  _target_: pyannote.audio.models.
-
-
-
+  _target_: pyannote.audio.models.segmentation.PyanNet.PyanNet
+  linear:
+    hidden_size: 128
+    num_layers: 2
+  lstm:
+    batch_first: true
+    bidirectional: true
+    dropout: 0.0
+    hidden_size: 128
+    monolithic: true
+    num_layers: 4
   num_channels: 1
-  num_mel_bins: 80
   sample_rate: 16000
-
-
+  sincnet:
+    sample_rate: 16000
+    stride: 10
+task:
+  duration: 10.0
+  max_speakers_per_chunk: 3
+  max_speakers_per_frame: 2
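The values added to config.yaml imply a few derived quantities that are easy to check by hand. The sketch below is illustrative only and not part of the commit. It computes the number of audio samples in one training chunk, the per-frame output width of a bidirectional LSTM, and the number of output classes under the assumption that the task encodes overlapping speakers as a powerset of speaker subsets. The diff does not state which encoding the task uses, and the helper name `powerset_classes` is hypothetical.

```python
from itertools import combinations

# Values copied from the new config.yaml shown in the diff.
sample_rate = 16000
duration = 10.0
lstm_hidden_size = 128
lstm_bidirectional = True
max_speakers_per_chunk = 3
max_speakers_per_frame = 2

# One chunk of `duration` seconds at `sample_rate` Hz.
samples_per_chunk = int(duration * sample_rate)

# A bidirectional LSTM concatenates forward and backward states,
# so each frame has twice the hidden size.
lstm_output_size = lstm_hidden_size * (2 if lstm_bidirectional else 1)


def powerset_classes(num_speakers, max_set_size):
    """Enumerate every speaker subset with at most `max_set_size` members.

    Assumption: the task uses a powerset encoding; this is not confirmed
    by the diff.
    """
    classes = []
    for size in range(max_set_size + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


classes = powerset_classes(max_speakers_per_chunk, max_speakers_per_frame)

print(samples_per_chunk)  # 160000
print(lstm_output_size)   # 256
print(len(classes))       # 7: silence, 3 single speakers, 3 speaker pairs
```

Under that assumption, each frame would be classified into one of seven classes rather than receiving three independent speaker activations.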
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:cf667e302cb3ad72316803868e2cf007d35d506e4ac6daafdd527dfd69f3fa72
+size 5912144
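The pytorch_model.bin entry in the repository is a Git LFS pointer, not the weights themselves: it records the spec version, a SHA-256 object ID, and the file size in bytes. As a minimal sketch, assuming only the Python standard library, the pointer can be parsed and a downloaded file checked against it as follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of any library.

```python
import hashlib

# New pointer contents, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cf667e302cb3ad72316803868e2cf007d35d506e4ac6daafdd527dfd69f3fa72
size 5912144
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if `data` has the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])  # sha256 5912144
```

To verify a local download, read the file in binary mode and pass its bytes to `matches_pointer`; at roughly 5.9 MB the file fits comfortably in memory.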