Model Card for Model ID
Model Details
Model Description
This is the model card of a π€ transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Direct Use
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
π How to Use This Model for Inference
This model is fine-tuned using LoRA (PEFT) on Phi-4 (4-bit Unsloth). To use it, you need to:
- Load the base model
- Load the LoRA adapter
- Run inference
π Install Required Libraries
Before running the code, make sure you have the necessary dependencies installed:
pip install unsloth peft transformers torch
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
# Load the base model
base_model_name = "unsloth/Phi-4-unsloth-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(
model_name=base_model_name,
max_seq_length=4096, # Must match fine-tuning
load_in_4bit=True,
)
# Load the fine-tuned LoRA adapter
lora_model_name = "Machlovi/Phi4_MedQA_USMLE_4_Options"
model = PeftModel.from_pretrained(model, lora_model_name)
# Run inference
input_text = "What are the symptoms of a heart attack?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
with torch.no_grad():
outputs = model.generate(**inputs, max_new_tokens=100)
# Decode and print response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
π‘ Notes
- This model is quantized in 4-bit for efficiency.
- Ensure
max_seq_length
matches the training configuration. - This model requires a GPU (CUDA) for inference.
[More Information Needed]
Training Details
Training Data
[More Information Needed]
Training Procedure
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
medmcqa = """< |im_start| > system
You are a medical doctor answering real-world medical entrance exam questions.
Based on your understanding of basic and clinical science, medical knowledge, and mechanisms underlying health, disease,
patient care, and modes of therapy, answer the following multiplechoice question.
Select one correct answer from A to D. Base your answer on the current and standard practices referenced in medical guidelines. < |im_end| >
< |im_start| >
question: {}
options:{}
< |im_end| >
< |im_start| >
answer:{}
π Example Inference
< |im_start| > system
You are a medical doctor answering real-world medical entrance exam questions.
Based on your understanding of basic and clinical science, medical knowledge, and mechanisms underlying health, disease,
patient care, and modes of therapy, answer the following multiple-choice question.
Select one correct answer from A to D. Base your answer on the current and standard practices referenced in medical guidelines.
< |im_end| >
< |im_start| >
question: A junior orthopaedic surgery resident is completing a carpal tunnel repair with the department chairman as the attending physician.
During the case, the resident inadvertently cuts a flexor tendon. The tendon is repaired without complication.
The attending tells the resident that the patient will do fine, and there is no need to report this minor complication that will not harm the patient,
as he does not want to make the patient worry unnecessarily. He tells the resident to leave this complication out of the operative report.
Which of the following is the correct next action for the resident to take?
options:
A. Disclose the error to the patient and put it in the operative report
B. Tell the attending that he cannot fail to disclose this mistake
C. Report the physician to the ethics committee
D. Refuse to dictate the operative report
< |im_end| >
< |im_start| >
answer: B. Tell the attending that he cannot fail to disclose this mistake
< |im_end| >
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
[More Information Needed]