## Multilang ASR Captioner

A multilingual automatic speech recognition (ASR) and video captioning CLI tool that runs faster-whisper on CPU.

## Requirements and Installation

To run this tool, you need the following software installed on your computer:
+ [ImageMagick](https://imagemagick.org/script/download.php)
+ [Python (3.11)](https://www.python.org/downloads/release/python-3116/)

Once you are in your desired working directory, run the following commands in your terminal:

```{bash}
git clone git@github.com:marquesafonso/multilang-asr-captioner.git

cd multilang-asr-captioner

pip install pipenv

pipenv install
```

Note that this assumes a working Git installation and a configured SSH key.
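
If an install step fails, it can help to confirm that the prerequisites are actually on your `PATH`. The following is a minimal sketch, not part of this tool: the executable names (`magick` or `convert` for ImageMagick, `git`, `pipenv`) are assumptions and may differ on your platform, and the helper names are hypothetical.

```python
import shutil
import sys

# Assumed executable names; adjust these for your platform.
# ImageMagick 7 ships `magick`, while older releases ship `convert`.
REQUIRED_TOOLS = {
    "git": ["git"],
    "pipenv": ["pipenv"],
    "ImageMagick": ["magick", "convert"],
}


def check_prerequisites(tools=REQUIRED_TOOLS):
    """Map each tool name to True if any candidate executable is found on PATH."""
    return {
        name: any(shutil.which(exe) is not None for exe in candidates)
        for name, candidates in tools.items()
    }


def python_version_ok(expected=(3, 11)):
    """Return True if the running interpreter matches the expected major.minor."""
    return tuple(sys.version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'missing'}")
    print(f"Python 3.11: {'ok' if python_version_ok() else 'different version'}")
```

Run it with the same interpreter you intend to use for `pipenv install`, since the version check only inspects the interpreter executing the script.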

## Quick start

### Command Line Interface

Run the following command to caption a video using the CLI. This example uses a YouTube video URL; the `--video_url` argument is optional:

```
pipenv run python .\cli.py --invideo_filename '<your_file_name>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8
```
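
To give a feel for what `--max_words_per_line` controls, here is a minimal sketch of splitting transcribed text into caption lines of at most N words. This is an illustration only, not the tool's actual implementation: `split_caption` is a hypothetical helper, and the real tool may also take word timestamps into account.

```python
def split_caption(text: str, max_words_per_line: int = 8) -> list[str]:
    """Split text into lines containing at most `max_words_per_line` words."""
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    words = text.split()
    # Step through the word list in chunks of `max_words_per_line`.
    return [
        " ".join(words[i:i + max_words_per_line])
        for i in range(0, len(words), max_words_per_line)
    ]


if __name__ == "__main__":
    for line in split_caption("the quick brown fox jumps over the lazy dog again", 4):
        print(line)
```

A smaller value yields shorter lines that are easier to read on narrow videos, at the cost of more frequent caption changes.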

The `--fontsize`, `--font`, `--bg_color` and `--text_color` arguments are also available:

```
pipenv run python .\cli.py --invideo_filename '<your_file>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8 --fontsize 28 --font "Arial-Bold" --bg_color None --text_color 'white'
```

### API

An API built with FastAPI is also available.

To start the API run:

```
pipenv run uvicorn main:app --reload
```

Then check the [submit_video](http://127.0.0.1:8000/submit_video/) endpoint.
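
As a quick sanity check, you can also probe the endpoint from Python. This is a sketch under assumptions: it takes the host and port from the link above (the uvicorn defaults) and assumes the endpoint answers a plain GET request. The request and response format of `submit_video` is not documented here, so the sketch only reports the HTTP status. `check_endpoint` is a hypothetical helper.

```python
import urllib.error
import urllib.request

# URL taken from the link above; assumes uvicorn's default host and port.
SUBMIT_VIDEO_URL = "http://127.0.0.1:8000/submit_video/"


def check_endpoint(url: str = SUBMIT_VIDEO_URL, timeout: float = 5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 404 or 405).
        return err.code
    except OSError:
        # Covers connection refused, timeouts and other network errors.
        return None


if __name__ == "__main__":
    status = check_endpoint()
    if status is None:
        print("API not reachable; is uvicorn running?")
    else:
        print(f"submit_video responded with HTTP {status}")
```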