	---
	datasets:
	- SKNahin/bengali-transliteration-data
	language:
	- bn
	- en
	base_model:
	- facebook/mbart-large-50
	tags:
	- banglish
	- bangla
	- translator
	- avro
	pipeline_tag: text2text-generation
	---

	# Hugging Face: Banglish to Bangla Translation

	This repository shows how to convert Banglish (Romanized Bangla) text into Bangla script using the MBart50 tokenizer and model. The model, `Mdkaif2782/banglish-to-bangla`, is fine-tuned from `facebook/mbart-large-50` for this task; the model card metadata lists `SKNahin/bengali-transliteration-data` as its dataset.

	## Setup in Google Colab
	Follow these steps to use the model in Google Colab:

	### 1. Install Dependencies
	Make sure the `transformers` and `torch` libraries are installed. Run the following command in your Colab notebook:

	```python
	!pip install transformers torch
	```
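To confirm that the install step worked before loading the model, a small check like the one below can help. It is a minimal sketch that uses only the Python standard library; the helper name `installed_versions` is hypothetical and not part of this repository.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            versions[name] = None
    return versions


# Both entries should show a version string after the pip command has run
print(installed_versions(["transformers", "torch"]))
```

A value of `None` for either package means the `pip install` cell needs to be run again.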

	### 2. Load and Use the Model
	Copy the code below into a cell in your Colab notebook to start translating Banglish to Bangla:

	```python
	from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
	import torch

	# Load the fine-tuned model and its tokenizer directly from Hugging Face
	model_name = "Mdkaif2782/banglish-to-bangla"
	tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
	model = MBartForConditionalGeneration.from_pretrained(model_name)

	# Move the model to the GPU once, if one is available
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model = model.to(device)

	def translate_banglish_to_bangla(model, tokenizer, banglish_input):
	    # Tokenize the input, truncating anything beyond 128 tokens
	    inputs = tokenizer(banglish_input, return_tensors="pt", padding=True, truncation=True, max_length=128)
	    # Keep the input tensors on the same device as the model
	    inputs = {key: value.to(model.device) for key, value in inputs.items()}

	    # Start decoding from the Bengali language code
	    translated_tokens = model.generate(**inputs, decoder_start_token_id=tokenizer.lang_code_to_id["bn_IN"])
	    translated_text = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

	    return translated_text

	# Take custom input
	print("Enter your Banglish text (type 'exit' to quit):")
	while True:
	    banglish_text = input("Banglish: ")
	    if banglish_text.lower() == "exit":
	        break

	    # Translate Banglish to Bangla
	    translated_text = translate_banglish_to_bangla(model, tokenizer, banglish_text)
	    print(f"Translated Bangla: {translated_text}\n")
	```

	### 3. Run the Notebook
	1. Paste the above code into a cell.
	2. Run the cell.
	3. Enter your Banglish text in the input prompt to get the translated Bangla text. Type `exit` to quit.

	## Example Usage

	Input:
	```
	Banglish: amar valo lagche onek
	```

	Output:
	```
	Translated Bangla: আমার ভালো লাগছে অনেক
	```
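The interactive loop handles one sentence at a time. For a list of sentences, the same tokenizer and `generate` call can process several inputs per batch. The sketch below is one possible way to do that, not the author's method: the helper names `chunked`, `translate_many`, and `make_mbart_batch_translator` are hypothetical, and the batch size of 8 is an arbitrary assumption. The generation arguments mirror those used in step 2.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_many(
    lines: Iterable[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,  # assumed value; tune to the available memory
) -> List[str]:
    """Translate non-blank lines in batches, preserving their order."""
    cleaned = [line.strip() for line in lines if line.strip()]
    results: List[str] = []
    for batch in chunked(cleaned, batch_size):
        results.extend(translate_batch(batch))
    return results


def make_mbart_batch_translator(model, tokenizer, max_length: int = 128):
    """Wrap an already loaded model and tokenizer as a batch translation function."""
    import torch  # imported here so the helpers above work without torch

    def translate_batch(batch: List[str]) -> List[str]:
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        # Keep the input tensors on the same device as the model
        inputs = {key: value.to(model.device) for key, value in inputs.items()}
        with torch.no_grad():
            tokens = model.generate(
                **inputs,
                decoder_start_token_id=tokenizer.lang_code_to_id["bn_IN"],
            )
        return tokenizer.batch_decode(tokens, skip_special_tokens=True)

    return translate_batch


# Usage, assuming `model` and `tokenizer` were loaded as in step 2:
# translate_batch = make_mbart_batch_translator(model, tokenizer)
# print(translate_many(["amar valo lagche onek"], translate_batch))
```

Passing the translation function in as an argument keeps the batching logic independent of the model, so it can be checked with a stand-in function before the weights are downloaded.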

	## Notes
	- A GPU is optional, since the code above falls back to the CPU, but it makes generation much faster. To enable one in Google Colab, go to `Runtime > Change runtime type` and select `GPU`.
	- The model `Mdkaif2782/banglish-to-bangla` can be fine-tuned further if required.
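To check whether the Colab runtime actually exposes a GPU after changing the runtime type, a short probe like this can be run in its own cell. It is a minimal sketch; the helper name `pick_device` is hypothetical, and it reports `"cpu"` when `torch` is not installed rather than raising an error.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed, so nothing can run on a GPU
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Running on: {pick_device()}")
```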

	## License
	This project uses the Hugging Face `transformers` library. Refer to the [Hugging Face documentation](https://huggingface.co/docs/transformers/) for more details.