falche/WhisperWithJPDiarization


NekoMikoReimu committed on Sep 16, 2024

Commit 7fdb765 (verified) · 1 Parent(s): 3988f9a

Update README.md

Files changed (1)

README.md +1 -1


@@ -10,7 +10,7 @@ Can be given a video file or mp3/wav file.
 Performance is considerably better than default JP whisper for most tasks involving Japanese content, with the exception of singing/karaoke (Where performance is below the original due to the training dataset.)
-Requires ffmpeg, openai-whisper, pyannote and facebookresearch's demux model. Torch is also strongly encouraged.
+Requires ffmpeg, openai-whisper, pyannote and facebookresearch's demux model. Cuda is also strongly encouraged.
 Pyannote requies a Huggingface API key, which it will currently look for under the environment variable "HF_TOKEN_NOT_LOGIN" (At the time of this writing, naming your HF token "HF_TOKEN" causes bugs.)
 Originally intended as a solo project, but I'm upping it here in the hopes it will be useful to practicioners. If you're doing work in this space please feel free to reach out.
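
The prerequisites named in the README lines above could be checked before a run with something like the following. This is a minimal sketch, not part of the repository: `check_prerequisites` and `cuda_available` are hypothetical helper names, the import names `whisper`, `pyannote` and `torch` are assumed to correspond to the packages the README lists, and only the variable name `HF_TOKEN_NOT_LOGIN` comes from the README itself.

```python
import importlib.util
import os
import shutil

# The variable name is taken from the README; everything else here is illustrative.
TOKEN_VAR = "HF_TOKEN_NOT_LOGIN"


def check_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # Pyannote needs a Hugging Face token, read from this variable per the README.
    if not env.get(TOKEN_VAR):
        problems.append(f"{TOKEN_VAR} is not set")

    # ffmpeg is an external binary, so look for it on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Assumed import names for openai-whisper, pyannote and PyTorch.
    for module in ("whisper", "pyannote", "torch"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python module '{module}' is not installed")

    return problems


def cuda_available():
    """Report whether PyTorch can see a CUDA device; False if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
    if not cuda_available():
        print("note: CUDA not detected; the README strongly encourages it")
```

Returning a list of problems rather than raising on the first one lets a user fix everything in a single pass. CUDA is reported separately because the README describes it as encouraged rather than required.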