base_model:
  - google/gemma-3n-E2B-it
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gemma3n
  - medical
  - vision-language
  - gemma
  - ecg
  - cardiology
  - healthcare
license: cc-by-4.0
datasets:
  - yasserrmd/pulse-ecg-instruct-subset
language:
  - en

GemmaECG-Vision

GemmaECG-Vision is a vision-language model fine-tuned from google/gemma-3n-E2B-it for ECG image interpretation. It takes an ECG image together with a clinical instruction prompt and generates a structured analysis suitable for triage or documentation use cases.

The model was fine-tuned with Unsloth and accepts image and text inputs with task-specific medical prompt formatting. It is designed to run offline or on edge devices, enabling healthcare triage in resource-constrained settings.

Model Objective

The model aims to assist healthcare professionals and emergency responders by generating ECG analyses directly from ECG images, without requiring internet access or cloud resources.

Usage

This model expects:

  • An ECG image (PIL.Image)
  • A textual instruction such as:

You are a clinical assistant specialized in ECG interpretation. Given an ECG image, generate a concise, structured, and medically accurate report.

Use this exact format:

Rhythm:
PR Interval:
QRS Duration:
Axis:
Bundle Branch Blocks:
Atrial Abnormalities:
Ventricular Hypertrophy:
Q Wave or QS Complexes:
T Wave Abnormalities:
ST Segment Changes:
Final Impression:
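
The template above can be assembled and its output parsed with a few lines of Python. The sketch below is a minimal illustration, not part of the model's tooling: `REPORT_FIELDS`, `build_instruction`, and `parse_report` are hypothetical names, and the parser assumes the model returns one `Field: value` line per field, which real outputs may not always do.

```python
# Field names copied from the report template above.
REPORT_FIELDS = [
    "Rhythm",
    "PR Interval",
    "QRS Duration",
    "Axis",
    "Bundle Branch Blocks",
    "Atrial Abnormalities",
    "Ventricular Hypertrophy",
    "Q Wave or QS Complexes",
    "T Wave Abnormalities",
    "ST Segment Changes",
    "Final Impression",
]


def build_instruction() -> str:
    """Assemble the instruction text followed by the empty report template."""
    header = (
        "You are a clinical assistant specialized in ECG interpretation. "
        "Given an ECG image, generate a concise, structured, and medically "
        "accurate report.\n\nUse this exact format:\n\n"
    )
    return header + "\n".join(f"{name}:" for name in REPORT_FIELDS)


def parse_report(text: str) -> dict:
    """Map each known field to its value; missing or empty fields stay None."""
    report = {name: None for name in REPORT_FIELDS}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in report:
            report[key] = value.strip() or None
    return report
```

Fields that remain `None` after parsing are a natural signal that the output deviated from the template and needs closer review.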

Inference Example (Python)

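from transformers import AutoProcessor, Gemma3nForConditionalGeneration
from PIL import Image
import torch

model_id = "yasserrmd/GemmaECG-Vision"

# Load the fine-tuned model in bfloat16 and move it to the GPU.
model = Gemma3nForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).eval().to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

# Load the ECG image as an RGB PIL image.
image = Image.open("example_ecg.png").convert("RGB")

# One user turn: an image placeholder followed by the instruction text.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Interpret this ECG and provide a structured triage report."}
        ]
    }
]

# Render the chat template to a prompt string (not yet tokenized).
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

# Tokenize the prompt and preprocess the image together; floating-point
# tensors (the pixel values) are cast to bfloat16 to match the model.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    "cuda", dtype=torch.bfloat16
)

# do_sample=True makes temperature, top_p, and top_k take effect explicitly.
with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        top_k=64,
        use_cache=True
    )

# Decode only the newly generated tokens, skipping the echoed prompt.
prompt_len = inputs["input_ids"].shape[-1]
result = processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(result)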

Training Details

  • Framework: Unsloth + TRL SFTTrainer
  • Hardware: Google Colab Pro (L4)
  • Batch Size: 2
  • Epochs: 1
  • Learning Rate: 2e-4
  • Scheduler: Cosine
  • Loss: CrossEntropy
  • Precision: bfloat16
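
For reference, a cosine schedule decays the learning rate smoothly from its peak toward zero over training. The sketch below computes it for the listed peak of 2e-4. It assumes no warmup and a floor of zero, neither of which is stated above, and `cosine_lr` is a hypothetical helper rather than the trainer's own code.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-4) -> float:
    """Cosine decay from peak_lr at step 0 to zero at total_steps (no warmup)."""
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

For example, `cosine_lr(0, 409)` returns the peak rate, and the value falls to half the peak at the midpoint of training.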

Dataset

The training dataset is a curated subset of the PULSE-ECG/ECGInstruct dataset, reformatted for VLM instruction tuning.

  • 3,272 samples of ECG image + structured instruction + clinical output
  • Focused on realistic and medically relevant triage cases

Dataset link: yasserrmd/pulse-ecg-instruct-subset
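
Reformatting for VLM instruction tuning typically means wrapping each sample in a chat-style message list. The sketch below shows one common layout, similar to the conversation format used in Unsloth's vision fine-tuning examples. The field names `image`, `instruction`, and `output` are assumptions for illustration and may not match the dataset's actual columns, and `to_conversation` is a hypothetical helper.

```python
def to_conversation(sample: dict) -> dict:
    """Wrap one ECG sample as a user turn (text + image) and an assistant turn."""
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": sample["instruction"]},
                    {"type": "image", "image": sample["image"]},
                ],
            },
            {
                "role": "assistant",
                "content": [
                    {"type": "text", "text": sample["output"]},
                ],
            },
        ]
    }
```

Each converted record pairs the instruction and image in a single user turn, with the clinical output as the assistant's target response.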

Training Loss Summary

The model was fine-tuned for 409 steps on the pulse-ecg-instruct-subset dataset. The training loss started above 9.5 and declined steadily to below 0.5 over the single epoch, indicating consistent convergence. The loss curve shows a stable optimization process without pronounced spikes. The chart below visualizes this progression, highlighting how quickly the model adapts to the ECG image-to-text task.
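
The step count can be sanity-checked against the dataset size. With 3,272 samples and a batch size of 2, one epoch without accumulation would take 1,636 steps, so 409 steps implies an effective batch size of 8. The sketch below assumes this comes from 4 gradient-accumulation steps on a single GPU; that value is inferred from the numbers, not stated in this card, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_samples: int, per_device_batch: int,
                    grad_accum: int, epochs: int = 1) -> int:
    """Number of optimizer updates for the given effective batch size."""
    effective_batch = per_device_batch * grad_accum
    return math.ceil(num_samples / effective_batch) * epochs


print(optimizer_steps(3272, 2, 4))  # 409, matching the reported step count
print(optimizer_steps(3272, 2, 1))  # 1636, if no accumulation were used
```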

Intended Use

  • Emergency triage in offline settings
  • On-device ECG assessment
  • Integration with medical edge devices (e.g., Jetson, Raspberry Pi, Android)
  • Rapid analysis during disaster response

Limitations

  • Not intended to replace licensed medical professionals
  • Accuracy may vary depending on image quality
  • Model outputs should be reviewed by a clinician before action

License

This model is licensed under CC BY 4.0. You are free to use, modify, and distribute it with attribution.

Author

Mohamed Yasser (Hugging Face profile: yasserrmd)