# Chitrarth: Bridging Vision and Language for a Billion People

## 1. Introduction

Chitrarth (Chitra: Image; Artha: Meaning) is a multilingual vision-language model (VLM) that integrates a state-of-the-art multilingual Large Language Model (LLM) with a vision module. The model is trained primarily on multilingual image-text data and is designed to work across 10 prominent Indian languages (Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, and Assamese) as well as English.

## 2. Model Summary

### Key Features
- **Model:** Krutrim-1 as the base LLM, with SigLIP as the visual encoder and a 2-layer MLP projector (see the illustrative sketch below)
- **Languages Supported:** 10 Indic languages (Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, and Assamese) as well as English
- **Usage:** General-purpose VLM
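
The sketch referenced above is only a rough illustration of how a 2-layer MLP commonly connects a SigLIP encoder to an LLM (the LLaVA-style design); the class name, hidden sizes, and activation below are assumptions, not the released Chitrarth configuration.

```python
# Illustrative sketch only: dimensions, activation, and class name are assumptions.
import torch
import torch.nn as nn


class VisionProjector(nn.Module):
    """2-layer MLP that maps SigLIP patch features into the LLM embedding space."""

    def __init__(self, vision_dim: int = 1152, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the vision encoder
        return self.proj(image_features)  # (batch, num_patches, llm_dim)
```

The projected patch embeddings would then be combined with the text token embeddings and processed by the base LLM as a single sequence.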
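
Assuming the weights are published on Hugging Face together with custom modeling code, the model can presumably be loaded through `transformers` with `trust_remote_code=True`. The snippet below is a hedged sketch: the repository ID, processor interface, and generation arguments are placeholder assumptions, so consult the official Krutrim pages for the exact usage.

```python
# Hedged sketch: repo ID, processor behaviour, and generation arguments are
# assumptions for illustration, not verified against the released code.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "krutrim-ai-labs/Chitrarth"  # hypothetical repository ID

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # custom modeling code is required to run the model
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg")
prompt = "इस तस्वीर में क्या दिख रहा है?"  # Hindi: "What can be seen in this picture?"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```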

## 3. API Platform

Visit [Chitrarth Online](https://cloud.olakrutrim.com/console/inference-service?section=models&modelName=Krutrim&artifactName=chitrarth&artifactType=model) to access the model via the web interface.

## 4. License

## 5. Citation

```bibtex
@inproceedings{khan2024chitrarth,
  title={Chitrarth: Bridging Vision and Language for a Billion People},
  author={Shaharukh Khan and Ayush Tarun and Abhinav Ravi and Ali Faraz and Praveen Kumar Pokala and Anagha Bhangare and Raja Kolla and Chandra Khatri and Shubham Agarwal},
  booktitle={NeurIPS Multimodal Algorithmic Reasoning},
  year={2024},
}
```

## 6. Contact

Contributions are welcome! If you have any improvements or suggestions, feel free to submit a pull request on GitHub.