Update README.md
README.md
CHANGED
@@ -27,27 +27,20 @@ Notable differences from other available models include:
 1. Performance: CED with 10M parameters outperforms the majority of previous approaches (~80M).
 
 ### Model Sources
-- **
-- **Repository:** https://github.com/jimbozhang/hf_transformers_custom_model_ced
+- **Repository:** https://github.com/RicherMans/CED
 - **Paper:** [CED: Consistent ensemble distillation for audio tagging](https://arxiv.org/abs/2308.11957)
 - **Demo:** https://huggingface.co/spaces/mispeech/ced-base
 
-## Install
-```bash
-pip install git+https://github.com/jimbozhang/hf_transformers_custom_model_ced.git
-```
-
 ## Inference
 ```python
->>> from
->>> from ced_model.modeling_ced import CedForAudioClassification
+>>> from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
 
 >>> model_name = "mispeech/ced-base"
->>> feature_extractor =
->>> model =
+>>> feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, trust_remote_code=True)
+>>> model = AutoModelForAudioClassification.from_pretrained(model_name, trust_remote_code=True)
 
 >>> import torchaudio
->>> audio, sampling_rate = torchaudio.load("
+>>> audio, sampling_rate = torchaudio.load("/path-to/JeD5V5aaaoI_931_932.wav")
 >>> assert sampling_rate == 16000
 >>> inputs = feature_extractor(audio, sampling_rate=sampling_rate, return_tensors="pt")
 
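
The updated snippet still relies on a bare `assert sampling_rate == 16000`, which fails without much context when a file was recorded at another rate. As a rough sketch only, the check below inspects a WAV header before the file is loaded. It uses the standard-library `wave` module rather than torchaudio, so it is assumed to apply only to uncompressed PCM WAV files. The helper names `wav_info` and `check_sample_rate` are hypothetical and are not part of the CED repository or of `transformers`.

```python
import wave

EXPECTED_RATE = 16000  # the rate asserted in the README snippet


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def check_sample_rate(path, expected=EXPECTED_RATE):
    """Hypothetical pre-flight check: raise ValueError if the rate differs."""
    rate, _, _ = wav_info(path)
    if rate != expected:
        raise ValueError(
            f"{path}: expected {expected} Hz, got {rate} Hz; resample first"
        )
    return rate
```

Calling `check_sample_rate("/path-to/JeD5V5aaaoI_931_932.wav")` before `torchaudio.load` would surface a mismatch as a readable error instead of an `AssertionError`. Resampling itself is left out of this sketch.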