Fedir Zadniprovskyi committed
Commit 40dcc49 · 1 Parent(s): 8fc9285

chore: format README.md

Files changed (1): README.md (+24 −7)

README.md CHANGED
@@ -1,6 +1,8 @@
  # Faster Whisper Server
  `faster-whisper-server` is an OpenAI API-compatible transcription server which uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) as its backend.
  Features:
  - GPU and CPU support.
  - Easily deployable using Docker.
  - **Configurable through environment variables (see [config.py](./src/faster_whisper_server/config.py))**.
@@ -12,20 +14,24 @@ Features:
  Please create an issue if you find a bug, have a question, or a feature suggestion.

  ## OpenAI API Compatibility ++
  See [OpenAI API reference](https://platform.openai.com/docs/api-reference/audio) for more information.
  - Audio file transcription via `POST /v1/audio/transcriptions` endpoint.
-   - Unlike OpenAI's API, `faster-whisper-server` also supports streaming transcriptions (and translations). This is useful when you want to process large audio files and would rather receive the transcription in chunks as they are processed than wait for the whole file to be transcribed. It works similarly to how chat messages arrive when chatting with LLMs.
  - Audio file translation via `POST /v1/audio/translations` endpoint.
-   - Live audio transcription via `WS /v1/audio/transcriptions` endpoint.
-   - LocalAgreement2 ([paper](https://aclanthology.org/2023.ijcnlp-demo.3.pdf) | [original implementation](https://github.com/ufal/whisper_streaming)) algorithm is used for live transcription.
-   - Only transcription of single-channel, 16000 Hz sample rate, raw, 16-bit little-endian audio is supported.

  ## Quick Start
  [Hugging Face Space](https://huggingface.co/spaces/Iatalking/fast-whisper-server)

  ![image](https://github.com/fedirz/faster-whisper-server/assets/76551385/6d215c52-ded5-41d2-89a5-03a6fd113aa0)

- Using Docker Compose (Recommended)
  NOTE: I'm using newer Docker Compose features. If you are using an older version of Docker Compose, you may need to update.

  ```bash
@@ -39,7 +45,8 @@ curl --silent --remote-name https://raw.githubusercontent.com/fedirz/faster-whis
  docker compose --file compose.cpu.yaml up --detach
  ```

- Using Docker
  ```bash
  # for GPU support
  docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --detach fedirz/faster-whisper-server:latest-cuda
@@ -47,22 +54,29 @@ docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.c
  docker run --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --env WHISPER__MODEL=Systran/faster-whisper-small --detach fedirz/faster-whisper-server:latest-cpu
  ```

- Using Kubernetes: [tutorial](https://substratus.ai/blog/deploying-faster-whisper-on-k8s)

  ## Usage
  If you are looking for a step-by-step walkthrough, check out [this](https://www.youtube.com/watch?app=desktop&v=vSN-oAl6LVs) YouTube video.

  ### OpenAI API CLI
  ```bash
  export OPENAI_API_KEY="cant-be-empty"
  export OPENAI_BASE_URL=http://localhost:8000/v1/
  ```
  ```bash
  openai api audio.transcriptions.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format text

  openai api audio.translations.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format verbose_json
  ```
  ### OpenAI API Python SDK
  ```python
  from openai import OpenAI

@@ -76,6 +90,7 @@ print(transcript.text)
  ```

  ### cURL
  ```bash
  # If `model` isn't specified, the default model is used
  curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"
@@ -89,12 +104,14 @@ curl http://localhost:8000/v1/audio/translations -F "file=@audio.wav"
  ```

  ### Live Transcription (using WebSocket)
  From the [live-audio](./examples/live-audio) example:

  https://github.com/fedirz/faster-whisper-server/assets/76551385/e334c124-af61-41d4-839c-874be150598f

  [websocat](https://github.com/vi/websocat?tab=readme-ov-file#installation) installation is required.
  Live transcription of audio data from a microphone:
  ```bash
  ffmpeg -loglevel quiet -f alsa -i default -ac 1 -ar 16000 -f s16le - | websocat --binary ws://localhost:8000/v1/audio/transcriptions
  ```
 
  # Faster Whisper Server
+
  `faster-whisper-server` is an OpenAI API-compatible transcription server which uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) as its backend.
  Features:
+
  - GPU and CPU support.
  - Easily deployable using Docker.
  - **Configurable through environment variables (see [config.py](./src/faster_whisper_server/config.py))**.
 
  Please create an issue if you find a bug, have a question, or a feature suggestion.

  ## OpenAI API Compatibility ++
+
  See [OpenAI API reference](https://platform.openai.com/docs/api-reference/audio) for more information.
+
  - Audio file transcription via `POST /v1/audio/transcriptions` endpoint.
+   - Unlike OpenAI's API, `faster-whisper-server` also supports streaming transcriptions (and translations). This is useful when you want to process large audio files and would rather receive the transcription in chunks as they are processed than wait for the whole file to be transcribed. It works similarly to how chat messages arrive when chatting with LLMs.
  - Audio file translation via `POST /v1/audio/translations` endpoint.
+   - Live audio transcription via `WS /v1/audio/transcriptions` endpoint.
+   - LocalAgreement2 ([paper](https://aclanthology.org/2023.ijcnlp-demo.3.pdf) | [original implementation](https://github.com/ufal/whisper_streaming)) algorithm is used for live transcription.
+   - Only transcription of single-channel, 16000 Hz sample rate, raw, 16-bit little-endian audio is supported.
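The raw-audio constraint in the last bullet is easy to get wrong, so here is a minimal sketch (standard library only; the 440 Hz tone and one-second duration are arbitrary illustration values, not part of the project) that produces bytes in exactly the accepted layout: single-channel, 16000 Hz, raw 16-bit little-endian PCM.

```python
import math
import struct

SAMPLE_RATE = 16000  # the only sample rate the WS endpoint accepts

def sine_pcm_s16le(freq_hz: float, seconds: float) -> bytes:
    """Mono 16-bit little-endian PCM: 2 bytes per sample, no container header."""
    n_samples = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for i in range(n_samples):
        amplitude = int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        frames += struct.pack("<h", amplitude)  # "<h" = little-endian signed 16-bit
    return bytes(frames)

audio = sine_pcm_s16le(440.0, 1.0)
print(len(audio))  # 16000 samples * 2 bytes = 32000
```

Bytes in this layout are what the `ffmpeg ... -ac 1 -ar 16000 -f s16le -` pipeline later in the README emits.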
 
  ## Quick Start
+
  [Hugging Face Space](https://huggingface.co/spaces/Iatalking/fast-whisper-server)

  ![image](https://github.com/fedirz/faster-whisper-server/assets/76551385/6d215c52-ded5-41d2-89a5-03a6fd113aa0)

+ ### Using Docker Compose (Recommended)
+
  NOTE: I'm using newer Docker Compose features. If you are using an older version of Docker Compose, you may need to update.

  ```bash

  docker compose --file compose.cpu.yaml up --detach
  ```

+ ### Using Docker
+
  ```bash
  # for GPU support
  docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --detach fedirz/faster-whisper-server:latest-cuda

  docker run --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --env WHISPER__MODEL=Systran/faster-whisper-small --detach fedirz/faster-whisper-server:latest-cpu
  ```
 
+ ### Using Kubernetes
+
+ Follow [this tutorial](https://substratus.ai/blog/deploying-faster-whisper-on-k8s).

  ## Usage
+
  If you are looking for a step-by-step walkthrough, check out [this](https://www.youtube.com/watch?app=desktop&v=vSN-oAl6LVs) YouTube video.

  ### OpenAI API CLI
+
  ```bash
  export OPENAI_API_KEY="cant-be-empty"
  export OPENAI_BASE_URL=http://localhost:8000/v1/
  ```
+
  ```bash
  openai api audio.transcriptions.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format text

  openai api audio.translations.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format verbose_json
  ```
+
  ### OpenAI API Python SDK
+
  ```python
  from openai import OpenAI

  ```

  ### cURL
+
  ```bash
  # If `model` isn't specified, the default model is used
  curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"

  ```
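The cURL commands upload the audio as multipart/form-data. As a rough illustration of what `curl -F` assembles (the boundary handling and the `audio.wav`/`model` values below are illustrative assumptions; only the `file` field and the endpoint path come from the README), the same request can be built with Python's standard library:

```python
import io
import urllib.request
import uuid

def build_multipart(filename: str, payload: bytes, fields: dict[str, str]) -> tuple[bytes, str]:
    """Assemble a multipart/form-data body the way `curl -F "file=@..."` does."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():  # plain text fields, e.g. "model"
        buf.write(f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode())
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    buf.write(file_header.encode())
    buf.write(payload)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

body, content_type = build_multipart("audio.wav", b"<wav bytes>", {"model": "Systran/faster-whisper-small"})
req = urllib.request.Request(
    "http://localhost:8000/v1/audio/transcriptions",
    data=body,
    headers={"Content-Type": content_type},
)
# urllib.request.urlopen(req).read()  # uncomment with the server running
print(b'name="model"' in body)  # True
```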

  ### Live Transcription (using WebSocket)
+
  From the [live-audio](./examples/live-audio) example:

  https://github.com/fedirz/faster-whisper-server/assets/76551385/e334c124-af61-41d4-839c-874be150598f

  [websocat](https://github.com/vi/websocat?tab=readme-ov-file#installation) installation is required.
  Live transcription of audio data from a microphone:
+
  ```bash
  ffmpeg -loglevel quiet -f alsa -i default -ac 1 -ar 16000 -f s16le - | websocat --binary ws://localhost:8000/v1/audio/transcriptions
  ```
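For live transcription, audio is fed to the WebSocket incrementally rather than as one blob (that is what piping ffmpeg's stdout into websocat achieves). A sketch of that cadence, reading a binary stream in fixed-size pieces; the 4000-byte chunk size (125 ms of 16 kHz mono s16le) is an arbitrary choice for illustration, not something the server mandates:

```python
import io

CHUNK_BYTES = 4000  # 125 ms of 16 kHz mono s16le audio; an arbitrary choice

def iter_chunks(stream, size: int = CHUNK_BYTES):
    """Yield fixed-size chunks from a binary stream until it is exhausted."""
    while chunk := stream.read(size):
        yield chunk

# Stand-in for one second of silence in the endpoint's required raw format.
one_second = io.BytesIO(b"\x00" * 32000)
print(sum(1 for _ in iter_chunks(one_second)))  # 8
```

Each yielded chunk would be sent as one binary WebSocket message, letting the LocalAgreement2 loop emit partial transcripts as audio arrives.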