# Chitrarth: Bridging Vision and Language for a Billion People

## 1. Introduction

Chitrarth (Chitra: Image; Artha: Meaning) is a multilingual vision-language model (VLM) that integrates a state-of-the-art multilingual Large Language Model (LLM) with a vision module. The model is trained primarily on multilingual image-text data and is designed to work across 10 prominent Indian languages (Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, and Assamese) as well as English.

## 2. Model Summary

### Key Features
- **Model:** Krutrim-1 as the base LLM, with SigLIP as the visual encoder and a 2-layer MLP projector (see the illustrative sketch below)
- **Languages Supported:** 10 Indic languages (Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, and Assamese) as well as English
- **Usage:** General-purpose VLM
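
The sketch referenced above is only a rough illustration of how a 2-layer MLP commonly connects a SigLIP encoder to an LLM (the LLaVA-style design); the class name, hidden sizes, and activation below are assumptions, not the released Chitrarth configuration.

```python
# Illustrative sketch only: dimensions, activation, and class name are assumptions.
import torch
import torch.nn as nn


class VisionProjector(nn.Module):
    """2-layer MLP that maps SigLIP patch features into the LLM embedding space."""

    def __init__(self, vision_dim: int = 1152, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the vision encoder
        return self.proj(image_features)  # (batch, num_patches, llm_dim)
```

The projected patch embeddings would then be combined with the text token embeddings and processed by the base LLM as a single sequence.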
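
Assuming the weights are published on Hugging Face together with custom modeling code, the model can presumably be loaded through `transformers` with `trust_remote_code=True`. The snippet below is a hedged sketch: the repository ID, processor interface, and generation arguments are placeholder assumptions, so consult the official Krutrim pages for the exact usage.

```python
# Hedged sketch: repo ID, processor behaviour, and generation arguments are
# assumptions for illustration, not verified against the released code.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "krutrim-ai-labs/Chitrarth"  # hypothetical repository ID

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # custom modeling code is required to run the model
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg")
prompt = "इस तस्वीर में क्या दिख रहा है?"  # Hindi: "What can be seen in this picture?"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```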

## 3. API Platform

Visit [Chitrarth Online](https://cloud.olakrutrim.com/console/inference-service?section=models&modelName=Krutrim&artifactName=chitrarth&artifactType=model) to access the model via the web interface.

## 4. License

## 5. Citation

```bibtex
@inproceedings{khan2024chitrarth,
  title={Chitrarth: Bridging Vision and Language for a Billion People},
  author={Shaharukh Khan and Ayush Tarun and Abhinav Ravi and Ali Faraz and Praveen Kumar Pokala and Anagha Bhangare and Raja Kolla and Chandra Khatri and Shubham Agarwal},
  booktitle={NeurIPS Multimodal Algorithmic Reasoning},
  year={2024},
}
```

## 6. Contact

Contributions are welcome! If you have any improvements or suggestions, feel free to submit a pull request on GitHub.