## Multilang ASR Captioner
A multilingual automatic speech recognition (ASR) and video captioning CLI tool that runs faster-whisper on CPU.
## Requirements and Installation
To run this tool, you will need the following software installed on your computer:
+ [ImageMagick](https://imagemagick.org/script/download.php)
+ [Python (3.11)](https://www.python.org/downloads/release/python-3116/)
Once you are in your desired working directory, run the following commands in your terminal:
```{bash}
git clone [email protected]:marquesafonso/multilang-asr-captioner.git
pip install pipenv
pipenv install
```
Note that this assumes a working Git installation and a configured SSH key.
## Quick start
### Command Line Interface
Run the following command to caption your own video using the CLI. The `--video_url` argument is optional; this example uses a YouTube video URL:
```
pipenv run python .\cli.py --invideo_filename '<your_file_name>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8
```
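As a rough illustration of what a word limit such as `--max_words_per_line 8` implies, the sketch below breaks a transcript string into caption lines of at most a given number of words. The function name `split_caption` and the splitting rule are assumptions made for illustration; the tool's actual line-breaking logic may differ.

```python
def split_caption(text: str, max_words_per_line: int = 8) -> list[str]:
    """Break text into lines holding at most max_words_per_line words.

    Hypothetical helper: not taken from the tool's source code.
    """
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    words = text.split()
    # Step through the word list in chunks of max_words_per_line.
    return [
        " ".join(words[i:i + max_words_per_line])
        for i in range(0, len(words), max_words_per_line)
    ]


if __name__ == "__main__":
    for line in split_caption("one two three four five six seven eight nine ten"):
        print(line)
```

With the default limit of 8, the ten-word sample is split into one line of eight words and one line of two.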
The font size, font, background color and text color can also be set through arguments:
```
pipenv run python .\cli.py --invideo_filename '<your_file>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8 --fontsize 28 --font "Arial-Bold" --bg_color None --text_color 'white'
```
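For readers who want to see how flags like these fit together, here is a minimal `argparse` sketch that accepts the same argument names used in the commands above. The types, defaults and help texts are assumptions based only on the example values; this is not the project's actual `cli.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown above (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Caption a video (illustrative sketch).")
    parser.add_argument("--invideo_filename", required=True, help="Name of the input video file.")
    parser.add_argument("--video_url", default=None, help="Optional video URL.")
    parser.add_argument("--max_words_per_line", type=int, default=8)
    parser.add_argument("--fontsize", type=int, default=28)
    parser.add_argument("--font", default="Arial-Bold")
    # The example passes the literal string 'None' for no background color.
    parser.add_argument("--bg_color", default="None")
    parser.add_argument("--text_color", default="white")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

In this sketch only `--invideo_filename` is required, which matches the note that the video URL is optional.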
### API
A FastAPI API is also available.
To start the API, run:
```
pipenv run uvicorn main:app --reload
```
Then check the [submit_video](http://127.0.0.1:8000/submit_video/) endpoint.