Vimax97's picture
Update README.md
f18e941 verified
metadata
base_model: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - mllama
  - trl
license: apache-2.0
language:
  - en

Product Captioining Model

Given a product-image, this model can create accurate description about the image, describing the following criterions:

  • Surface where object is located
  • Surrounding objects
  • Background
  • Lighting
  • Overall mood

The model was trained with a custom dataset tailored for this usecase.

Examples

test Generated prompt: Professional photo of an object on a stone podium which is on a marble table, a wall in the background, a palm leaf in the corner, a harsh shadow from the left side, a concrete wall in the background, minimalist mood

test Generated prompt: Professional photo of an object on a wooden table, bokeh background, soft daylight

test Generated prompt: Professional photo of an object on a marble podium which is on a jungle clearing, surrounded by palm trees and lush greenery, a misty mountain range in the background, a cloudy sky

Uploaded model

  • Developed by: Vimax97
  • License: apache-2.0
  • Finetuned from model : unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit

This mllama model was trained 2x faster with Unsloth and Huggingface's TRL library.