
Multilang ASR Captioner

A multilingual automatic speech recognition (ASR) and video captioning CLI tool that runs faster-whisper on CPU.

Requirements and Installation

To run this tool you will need the following software installed on your computer:

Once you are in your desired working directory, run the following commands in your terminal:

git clone git@github.com:marquesafonso/multilang-asr-captioner.git

pip install pipenv

pipenv install

Note that this assumes a working Git installation and a configured SSH key.

Quick start

Command Line Interface

Run the following command to try an example using the CLI. The example uses a YouTube video URL, which is optional:

pipenv run python .\cli.py --invideo_filename '<your_file_name>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8
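The command above uses Windows-style path and quoting conventions. If you would rather launch the CLI from a script, the following is a minimal sketch of one way to assemble the same call as an argument list, which avoids shell-specific quoting. The helper name `build_cli_command` is hypothetical and not part of this repository; the flags are the ones shown above.

```python
from pathlib import Path
from typing import List, Optional


def build_cli_command(
    invideo_filename: str,
    video_url: Optional[str] = None,
    max_words_per_line: int = 8,
    script: Path = Path("cli.py"),
) -> List[str]:
    """Assemble the captioner CLI call as an argument list (hypothetical helper)."""
    # A list of arguments needs no shell quoting and no ".\" prefix.
    command = [
        "pipenv", "run", "python", str(script),
        "--invideo_filename", invideo_filename,
        "--max_words_per_line", str(max_words_per_line),
    ]
    # --video_url is described as optional, so add it only when provided.
    if video_url is not None:
        command += ["--video_url", video_url]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root.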

Font size, font, background color, and text color arguments are also available:

pipenv run python .\cli.py --invideo_filename '<your_file>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8 --fontsize 28 --font "Arial-Bold" --bg_color None --text_color 'white'
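To illustrate how these flags might be parsed, here is a minimal argparse sketch. The flag names come from the commands above; the types, the defaults, and the conversion of `None` are assumptions, and the actual cli.py may define them differently. One thing worth noting is that `--bg_color None` reaches the program as the string "None", not as Python's `None`.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser only; types and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Video captioning CLI (sketch)")
    parser.add_argument("--invideo_filename", required=True)
    parser.add_argument("--video_url", default=None)  # optional per the README
    parser.add_argument("--max_words_per_line", type=int, default=8)
    parser.add_argument("--fontsize", type=int, default=28)
    parser.add_argument("--font", default="Arial-Bold")
    parser.add_argument("--bg_color", default="None")
    parser.add_argument("--text_color", default="white")
    return parser


def parse_caption_args(argv: List[str]) -> Dict[str, object]:
    """Parse argv and map the literal string "None" to Python's None."""
    args = vars(build_parser().parse_args(argv))
    # Command-line values are always strings, so "None" must be converted
    # explicitly if it is meant to signal "no background color" (assumed).
    if args["bg_color"] == "None":
        args["bg_color"] = None
    return args
```

For example, `parse_caption_args(["--invideo_filename", "talk.mp4", "--bg_color", "None"])` yields a dictionary in which `bg_color` is `None` and the remaining style options fall back to the assumed defaults.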

API

A FastAPI-based API is also available.

To start the API run:

pipenv run uvicorn main:app --reload

Then check the submit_video endpoint.
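The README does not state the HTTP method or parameters of the submit_video endpoint, so the sketch below does not guess them. Instead it reads the OpenAPI schema that FastAPI serves at /openapi.json by default and lists the operations whose path contains that name. It assumes the server runs on uvicorn's default address (http://127.0.0.1:8000) and that the app has not disabled its schema route; the function names are hypothetical.

```python
import json
from typing import Dict, List, Tuple
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # uvicorn's default host and port
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def fetch_openapi(base_url: str = BASE_URL) -> Dict:
    """Download the OpenAPI schema from a running FastAPI app."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


def find_operations(schema: Dict, name: str = "submit_video") -> List[Tuple[str, str]]:
    """Return sorted (METHOD, path) pairs for paths containing `name`."""
    matches = []
    for path, path_item in schema.get("paths", {}).items():
        if name not in path:
            continue
        for key in path_item:
            # Path items may also hold non-method keys such as "parameters".
            if key in HTTP_METHODS:
                matches.append((key.upper(), path))
    return sorted(matches)
```

With the server running, `find_operations(fetch_openapi())` reports how the endpoint is exposed. The same information is available interactively at http://127.0.0.1:8000/docs, FastAPI's default documentation page.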