IndicTrans2 Sanskrit ↔ Nepali (LoRA Fine-tuned)


This repository provides a LoRA-based fine-tuning of the IndicTrans2 model for translation between Sanskrit and Nepali (both in Devanagari script).


Model Details

Note: The training script may have used the reverse direction (npi_Deva → san_Deva). The example usage code below translates from Sanskrit to Nepali; adjust src_lang and tgt_lang as needed for your use case.


Intended Use

  • Primary Use: Machine translation from Sanskrit to Nepali (and vice versa if you reverse the language codes).
  • Domains: General domain, though performance may vary depending on your dataset and domain specifics.
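
Reversing the direction amounts to swapping the two language codes. The following is a minimal sketch, assuming a hypothetical helper `language_codes` (not part of IndicTransToolkit or this repository) that returns the codes in the form the `src_lang` and `tgt_lang` arguments expect. How well the reverse direction performs depends on how the checkpoint was trained.

```python
# Hypothetical helper for illustration; not part of IndicTransToolkit.
SAN, NPI = "san_Deva", "npi_Deva"  # language codes used in this model card


def language_codes(direction="san-npi"):
    """Return the source/target language codes for a translation direction."""
    pairs = {"san-npi": (SAN, NPI), "npi-san": (NPI, SAN)}
    if direction not in pairs:
        raise ValueError(f"unknown direction: {direction!r}")
    src, tgt = pairs[direction]
    return {"src_lang": src, "tgt_lang": tgt}


codes = language_codes("npi-san")
print(codes)
```

The returned dictionary can be unpacked into `ip.preprocess_batch(texts, **codes)`, and `codes["tgt_lang"]` can be passed as `lang` to `ip.postprocess_batch`.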

How to Use

Below is a minimal code snippet demonstrating how to use this LoRA fine-tuned model for translation. The snippet uses Hugging Face Transformers, PyTorch, and the IndicProcessor class from IndicTransToolkit.

Installation

Install Transformers, IndicTransToolkit, and PyTorch (if not already installed):

 pip install transformers
 git clone https://github.com/VarunGumma/IndicTransToolkit
 cd IndicTransToolkit
 pip install --editable ./
 pip install torch
 pip install ipython jupyter

Then import the required modules:

 import torch
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
 from IndicTransToolkit import IndicProcessor
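
Before loading the model, it can help to confirm that the three packages are importable. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the package names are the top-level import names used in this card.

```python
from importlib.util import find_spec


def missing_packages(names=("torch", "transformers", "IndicTransToolkit")):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```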

Inference Example

  ip = IndicProcessor(inference=True)
  model_name = "karki-dennish/indictrans2-sanNpi"  # or your Hugging Face repo ID
  
  tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
  model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)
  
  def translate_san_to_npi(texts):
      # 1. Preprocess
      batch = ip.preprocess_batch(texts, src_lang="san_Deva", tgt_lang="npi_Deva")
      encoded = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")

      # 2. Generate translations
      with torch.inference_mode():
          outputs = model.generate(**encoded, num_beams=5, max_length=256)
  
      # 3. Decode and Postprocess
      translations = tokenizer.batch_decode(outputs, skip_special_tokens=True)
      translations = ip.postprocess_batch(translations, lang="npi_Deva")
  
      return translations

  # Example usage
  sanskrit_sentences = [
      "अहं गच्छामि।",  # "I am going."
      "किं समाचारः?"    # "What news?"
  ]
  
  translated = translate_san_to_npi(sanskrit_sentences)
  for s, t in zip(sanskrit_sentences, translated):
      print(f"Sanskrit: {s} --> Nepali: {t}")
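
For longer inputs, passing every sentence to the model in a single call may use more memory than is available. The following is a minimal sketch of chunked translation, assuming a hypothetical wrapper `translate_in_batches` and an arbitrary default `batch_size` of 8. The stand-in `fake_translate` is there only so the sketch runs without downloading the model.

```python
def translate_in_batches(texts, translate_fn, batch_size=8):
    """Translate `texts` in fixed-size chunks, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = []
    for start in range(0, len(texts), batch_size):
        # Each chunk is translated independently and appended in order.
        results.extend(translate_fn(texts[start:start + batch_size]))
    return results


# Stand-in for translate_san_to_npi (hypothetical; wraps each input in <>).
def fake_translate(batch):
    return [f"<{text}>" for text in batch]


print(translate_in_batches(["a", "b", "c"], fake_translate, batch_size=2))
```

With the model loaded, `translate_san_to_npi` would be passed in place of `fake_translate`.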

Citation

@misc{indictrans2-lora-san-npi,
  title         = {IndicTrans2 Sanskrit-Nepali LoRA Fine-tuned Model},
  author        = {Dennish Karki},
  howpublished  = {Hugging Face repository},
  year          = {2025},
  url           = {https://huggingface.co/karki-dennish/indictrans2-sanNpi}
}