# **unsloth/Llama-3.2-11B-Vision-Instruct (Fine-Tuned)**

## **Model Overview**

This model, fine-tuned from the `unsloth/Llama-3.2-11B-Vision-Instruct` base, is optimized for vision-language tasks with enhanced instruction-following capabilities. Fine-tuning was completed 2x faster using the [Unsloth](https://github.com/unslothai/unsloth) framework combined with Hugging Face's TRL library, ensuring efficient training while maintaining high performance.

## **Key Information**

- **Developed by:** Daemontatox
- **Base Model:** `unsloth/Llama-3.2-11B-Vision-Instruct`
- **License:** Apache-2.0
- **Language:** English (`en`)
- **Frameworks Used:** Hugging Face Transformers, Unsloth, and TRL

## **Performance and Use Cases**

This model is ideal for applications involving:

- Vision-based text generation and description tasks
- Instruction-following in multimodal contexts
- General-purpose text generation with enhanced reasoning

### **Features**

- **2x Faster Training:** Leveraging the Unsloth framework for accelerated fine-tuning.
- **Multimodal Capabilities:** Enhanced to handle vision-language interactions.
- **Instruction Optimization:** Tailored for improved comprehension and execution of instructions.
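
The exact training script and hyperparameters for this model are not published. For readers who want to reproduce a similar setup, a typical Unsloth + TRL vision fine-tuning run looks roughly like the sketch below; the dataset, LoRA settings, and `SFTConfig` values are illustrative assumptions, not the values used for this model. It requires a CUDA GPU with `unsloth` and `trl` installed.

```python
# Illustrative configuration sketch only — not this model's actual training
# script. Dataset and hyperparameters below are placeholder assumptions.
from unsloth import FastVisionModel, is_bf16_supported
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-11B-Vision-Instruct",
    load_in_4bit=True,                # QLoRA-style 4-bit base weights
)
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,      # also adapt the vision tower
    r=16, lora_alpha=16,              # illustrative LoRA settings
)

dataset = ...  # your (image, conversation) pairs in chat-message format

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(model, tokenizer),
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=60,                 # placeholder; tune for your data
        bf16=is_bf16_supported(),
        remove_unused_columns=False,  # keep image columns for the collator
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
        output_dir="outputs",
    ),
)
trainer.train()
```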

## **How to Use**

### **Inference Example (Hugging Face Transformers)**

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Llama 3.2 Vision checkpoints load with the Mllama conditional-generation
# class and a processor (tokenizer + image processor), not AutoModelForCausalLM.
model_id = "Daemontatox/finetuned-llama-3.2-vision-instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Pair an image with a text instruction via the chat template.
image = Image.open("sunset_over_mountains.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe the image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

inputs = processor(image, prompt, add_special_tokens=False,
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(outputs[0], skip_special_tokens=True))
```