Fine-Tuned Model: Meta-Llama-3.1-8B-Instruct-bnb-4bit
This is a fine-tuned version of the Meta-Llama-3.1-8B-Instruct-bnb-4bit model, adapted for French multi-speaker diarization tasks. Below, you'll find details about the fine-tuning process, dataset, and how to use this model.
Model Details
- Base Model: Meta-Llama-3.1-8B-Instruct-bnb-4bit
- Quantization: 4-bit quantization for reduced memory usage
- Purpose: Fine-tuned for multi-speaker diarization in French.
- Techniques:
  - LoRA (Low-Rank Adaptation) for efficient fine-tuning, targeting the `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, and `down_proj` modules (see the setup sketch after this list).
  - Rank: 16
  - LoRA alpha: 16
  - Gradient checkpointing: Enabled.
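A minimal sketch of this adapter setup using Unsloth's API, with the values listed above. The `unsloth/` Hub path for the base model is an assumption based on the model name in this card:

```python
# Sketch of the LoRA setup described above; hyperparameters taken from this card.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # assumed Hub path for the base model
    max_seq_length=120_000,  # max sequence length from the hyperparameter list
    load_in_4bit=True,       # 4-bit quantization for reduced memory usage
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # LoRA rank
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing=True,  # gradient checkpointing enabled
)
```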
Dataset
The model was fine-tuned on the `French_MultiSpeaker_Diarization` dataset, hosted on the Hugging Face Hub:
- Dataset Name: `French_MultiSpeaker_Diarization`
- Split Used: `train`
- Dataset Content:
  - Multi-speaker conversational data in French.
  - Includes speaker-labeled turns that serve as diarization supervision (loading sketch below).
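For reference, a hedged loading snippet; the card does not state the dataset's owning namespace on the Hub, so the repo id below is a placeholder:

```python
from datasets import load_dataset

# Placeholder repo id: prepend the actual "<user>/" namespace from the Hub.
dataset = load_dataset("French_MultiSpeaker_Diarization", split="train")
print(dataset[0])  # inspect one labeled multi-speaker conversation
```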
Training Configuration
Hyperparameters
- Max Sequence Length: 120,000
- LoRA Dropout: 0
- Bias: none
- Use Gradient Checkpointing: Enabled for efficiency.
- Custom Prompting: Chat templates applied for formatting prompts (e.g., the `llama-3.1` template); a formatting sketch follows this list.
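A sketch of the prompt-formatting step using `unsloth.chat_templates`. The column name `conversations` and its `{"role", "content"}` message layout are assumptions about the dataset schema, not facts from this card:

```python
# Sketch: applying the llama-3.1 chat template via unsloth.chat_templates.
from unsloth.chat_templates import get_chat_template

tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

def format_example(example):
    # "conversations" is an assumed column of {"role", "content"} messages;
    # adapt to the dataset's actual schema.
    text = tokenizer.apply_chat_template(
        example["conversations"], tokenize=False, add_generation_prompt=False
    )
    return {"text": text}

dataset = dataset.map(format_example)
```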
Training Workflow
Model Loading:
- Loaded the base model using `FastLanguageModel.from_pretrained()`.
- Applied 4-bit quantization for memory efficiency.
Dataset Preparation:
- The dataset was tokenized using a custom chat template from the `unsloth.chat_templates` library.
- Prompts were formatted with `apply_chat_template()` to suit the diarization task.
Fine-Tuning:
- LoRA applied to the target layers listed above (a training-run sketch follows).
- Gradient checkpointing enabled to reduce memory overhead during training.
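The card does not name the trainer used; TRL's `SFTTrainer` is a common companion to Unsloth, so the sketch below shows one plausible run. The batch size, learning rate, and step count are illustrative assumptions, not values from this card:

```python
from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,            # LoRA-wrapped model from the loading step
    tokenizer=tokenizer,
    train_dataset=dataset,  # chat-template-formatted dataset
    dataset_text_field="text",
    max_seq_length=120_000,       # from the hyperparameter list
    args=TrainingArguments(
        per_device_train_batch_size=1,  # assumption
        gradient_accumulation_steps=8,  # assumption
        learning_rate=2e-4,             # assumption
        max_steps=100,                  # assumption
        output_dir="outputs",
    ),
)
trainer.train()
```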
Usage
Load the Model
You can load this model directly from Hugging Face:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "olafdil/FrDiarization-Llama-3.1-8B-4bit"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
Inference Example
template = """
I have an audio transcription where multiple speakers are involved in a conversation.
Your task is to distinguish the different speakers and diarize the text accordingly.
Each speaker's dialogue should be clearly labeled, such as 'Speaker 1:', 'Speaker 2:', etc.
Ensure that the labels remain consistent throughout the transcription and that the text is formatted neatly.
Here's the transcription:
"""
transciption = "Your input transcription here"
prompt = template + transcription
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
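Since training applied the `llama-3.1` chat template, wrapping the prompt in the same chat format at inference may match the training distribution more closely; a hedged variant:

```python
# Optional: send the prompt through the same llama-3.1 chat format used in training.
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```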
Dependencies
The following libraries were used:
- `transformers`
- `datasets`
- `unsloth`
- `torch`

To install the dependencies, you can use:

```bash
pip install transformers datasets torch unsloth
```
Limitations
- The model has been fine-tuned specifically for French multi-speaker diarization tasks and may not generalize well to other tasks or languages.
- 4-bit quantization reduces memory usage but may slightly affect precision.
Citation
If you use this model, please consider citing the base model and the dataset:
- Base Model: Meta-Llama-3.1-8B-Instruct-bnb-4bit
- Dataset: French MultiSpeaker Diarization