metadata

title: Multilang Asr Captioner
sdk: docker
emoji: 📚
colorFrom: red
colorTo: blue
app_file: main.py
app_port: 8000
pinned: true
license: cc-by-nc-4.0
short_description: Multilingual ASR and video captioning tool

Multilang ASR Captioner

A multilingual automatic speech recognition and video captioning tool built on faster-whisper.

Supports real-time translation to English. Runs on a consumer-grade CPU.
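As a rough illustration of the transcribe-then-caption flow described above, the sketch below turns timed speech segments into SRT caption text. It is not taken from this repository: the helper names (`srt_timestamp`, `to_srt`, `transcribe_to_srt`), the "small" model size, and the int8 compute type are all assumptions chosen for the example. Only `WhisperModel` and its `transcribe` method come from the faster-whisper package.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render (start, end, text) segments as the body of an .srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, translate: bool = False) -> str:
    """Transcribe a file with faster-whisper and return SRT text.

    Hypothetical helper: the model size and compute type are illustrative,
    not the settings used by this project.
    """
    from faster_whisper import WhisperModel  # third-party, imported lazily

    model = WhisperModel("small", device="cpu", compute_type="int8")
    # task="translate" asks Whisper to output English instead of the source language.
    task = "translate" if translate else "transcribe"
    segments, _info = model.transcribe(audio_path, task=task)
    return to_srt((s.start, s.end, s.text) for s in segments)


# The formatting helpers work without a model:
print(to_srt([(0.0, 1.5, " Hello "), (1.5, 3.0, "World")]))
```

Keeping the formatting separate from the model call means the caption layout can be checked without downloading any weights.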

Requirements and Installation

Docker (preferred)

You'll need to install Docker.

Then, follow the steps below.

Clone the repo

git clone [email protected]:marquesafonso/multilang-asr-captioner.git

Build and run the container using Docker Compose

docker compose up

Check the landing page.

From there you will see the submit_video endpoint and the documentation.

Tip: on Linux or Mac, 0.0.0.0 will resolve directly to localhost, but on Windows you will need to change it to 127.0.0.1 or localhost.
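The host substitution in the tip can be done programmatically when a script needs an address that works on every platform. The sketch below is only an illustration: `browser_url` is a hypothetical helper, and port 8000 is taken from the `app_port` value in the metadata, so it may differ in your setup.

```python
from urllib.parse import urlsplit, urlunsplit


def browser_url(url: str) -> str:
    """Replace the wildcard host 0.0.0.0 with the loopback address 127.0.0.1."""
    parts = urlsplit(url)
    if parts.hostname != "0.0.0.0":
        return url  # nothing to rewrite
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Port 8000 is an assumption based on app_port in the metadata.
print(browser_url("http://0.0.0.0:8000/"))  # prints http://127.0.0.1:8000/
```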

Local

To run this tool locally on your computer you will need the following software installed:

Once you are at your desired working directory, run the following commands on your terminal:

git clone [email protected]:marquesafonso/multilang-asr-captioner.git

pip install pipenv

pipenv install

Note that this assumes a working Git installation and SSH key configuration.

Quick start (local)

API

A FastAPI API is available. To start the API locally, run:

pipenv run python main.py

Then check the landing page.

From there you will see the submit_video endpoint and the documentation.

Tip: on Linux or Mac, 0.0.0.0 will resolve directly to localhost, but on Windows you will need to change it to 127.0.0.1 or localhost.
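When the API is started from a script, it can be handy to wait until the landing page answers before opening it or sending requests. The following is a minimal sketch using only the standard library. `wait_until_up` is a hypothetical helper, not part of this project, and the URL in the usage note assumes port 8000 from the `app_port` metadata value.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll url until the server sends any HTTP response or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the server is not ready yet.
            time.sleep(interval)
    return False
```

For example, `wait_until_up("http://127.0.0.1:8000/")` returns `True` once the server responds and `False` if nothing answers within 30 seconds.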