LLaMA 3.2 Vision Fine-Tuned on OCR Handwriting Dataset

This repository contains a version of unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit fine-tuned for Optical Character Recognition (OCR) of handwritten text. Fine-tuning was performed on the DataStudio/OCR_handwritting_HAT2023 dataset, using the Unsloth library for efficient training and inference.

Model Overview

  • Base Model: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
  • Task: Optical Character Recognition (OCR) on handwritten text
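As a usage illustration, the sketch below builds an OCR request in the multimodal chat-message format that Llama 3.2 Vision instruct models expect through the Hugging Face `transformers` processor. The instruction wording is hypothetical; the image itself would be passed to the processor alongside the templated prompt.

```python
# Hypothetical OCR request in the Hugging Face multimodal chat format
# used by Llama 3.2 Vision instruct models. The instruction text is an
# example, not the prompt this model was trained with.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},  # the handwriting image is supplied separately
            {"type": "text", "text": "Transcribe the handwritten text in this image."},
        ],
    }
]

# With a processor loaded via AutoProcessor.from_pretrained(<model id>),
# the templated prompt would be produced by:
#   prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
#   inputs = processor(image, prompt, return_tensors="pt")
print(messages[0]["content"][1]["text"])
```

The single-turn structure above mirrors the standard `apply_chat_template` input; only the text instruction would change for other transcription styles.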

Key Features

  • Fine-Tuning Dataset: DataStudio/OCR_handwritting_HAT2023
  • Fine-Tuning Method: LoRA (Low-Rank Adaptation)
  • Optimization: Utilizes the Unsloth library to enhance training and inference efficiency
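To make the LoRA method named above concrete, here is a minimal NumPy sketch of its core idea: the frozen base weight W is augmented by a trainable low-rank product BA, so only 2·r·d parameters are updated instead of d². The dimensions and initialization scale are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 8  # hypothetical layer dims; rank r << d

W = rng.standard_normal((d_out, d_in))   # frozen base weight (not trained)
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init

x = rng.standard_normal(d_in)
base = W @ x
adapted = W @ x + B @ (A @ x)  # LoRA forward pass: W x + B A x

# B is zero-initialized, so the adapter is a no-op before training.
assert np.allclose(base, adapted)

# Parameter savings relative to full fine-tuning of W:
trainable, full = A.size + B.size, W.size
print(f"trainable: {trainable}, full: {full}")
```

Zero-initializing B is what lets training start exactly from the base model's behavior; Unsloth applies this adaptation to selected weight matrices of the vision-language model rather than to a toy layer as shown here.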

Model Card Metadata

  • Base Model: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
  • Tags: text-generation-inference, transformers, unsloth, mllama
  • License: Apache-2.0
  • Language: English

Additional Resources

For further details regarding the fine-tuning process, the Unsloth library, and the LoRA method, please consult the following resource:

License

This project is distributed under the Apache-2.0 License.
