distilbert/distilbert-base-uncased
Fine-Tuned on SQuAD
distilbert_squad
A DistilBERT model fine-tuned for context-based question answering on the SQuAD dataset. Given a question and a passage, it extracts the answer span from the passage.
Model Description
distilbert_squad is a lightweight transformer-based model fine-tuned for context-based question answering. It adapts the pretrained DistilBERT model to extract answer spans from passages. The model was fine-tuned on the Stanford Question Answering Dataset (SQuAD); DistilBERT's smaller size makes it a fit for resource-constrained environments.
Fine-tuned by: SADAT PARVEJ
Shared by: SADAT PARVEJ
Language(s) (NLP): ENGLISH
Finetuned from model: distilbert/distilbert-base-uncased
Training Objective
The model predicts the span of text in a given passage that best answers a specific question. Fine-tuning is supervised, using the questions, passages, and annotated answers in SQuAD.
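The span-prediction objective can be made concrete with a small sketch. This is an illustration under stated assumptions, not the model's actual decoding code: `best_span` and `max_answer_len` are hypothetical names, and the logits are made-up numbers. The function picks the (start, end) pair with the highest summed score, subject to start <= end and a length cap, which avoids the invalid spans that two independent argmax calls can produce.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return the inclusive (start, end) token indices with the highest combined score."""
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must be non-empty and equal in length")
    best_score = float("-inf")
    best = (0, 0)
    for start, s_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = s_score + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Made-up logits for a six-token input (not real model output).
start = [0.1, 2.5, 0.3, 0.2, 3.0, 0.1]
end = [0.2, 0.1, 2.8, 0.4, 0.1, 0.3]
print(best_span(start, end))  # (1, 2)
```

In this example the independent argmax values would be start 4 and end 2, which is not a valid span; the joint search returns tokens 1 through 2 instead.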
Performance Benchmarks
Training Loss: 0.464500
Validation Loss: 0.504376
Exact Match (EM): 85.59%
F1 Score: 85.59%
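For readers unfamiliar with the two metrics, here is a simplified sketch in the style of the SQuAD v1.1 evaluation: Exact Match compares normalized strings, while F1 gives partial credit for overlapping tokens. Whether these exact functions produced the figures above is not stated in the source, and the example strings are invented for illustration.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    common = Counter(pred_tokens) & Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


# Invented prediction/reference pair: a partial match scores 0 on EM but 0.8 on F1.
print(exact_match("the inventor Thomas Edison", "Thomas Edison"))
print(round(f1_score("the inventor Thomas Edison", "Thomas Edison"), 2))
```

Dataset-level scores are the averages of these per-example values, usually taking the maximum over all reference answers for each question.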
Intended Uses & Limitations
This model is designed for tasks such as:
- Extractive Question Answering
- Reading comprehension applications
Known Limitations:
- As DistilBERT is distilled from BERT, its smaller architecture may limit its performance compared to the original BERT model on certain tasks.
- The model's predictions may be biased or overly reliant on the training dataset, as SQuAD comprises structured and fact-based question-answer pairs.
How to Get Started with the Model
Use the code below to get started with the model:
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
# Load the model and tokenizer
model_name = "YourHuggingFaceModelPath/distilbert_squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
context = """
Thomas Edison is credited with inventing the light bulb. He was born in 1847 and was a prolific inventor.
"""
question = "Who invented the light bulb?"
inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=512)
input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)
# Perform inference
with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)
start_scores = outputs.start_logits
end_scores = outputs.end_logits
# Get start and end indices
start_idx = torch.argmax(start_scores)
end_idx = torch.argmax(end_scores) + 1
# Decode the answer
if start_idx >= end_idx:
    print("Model did not predict a valid answer. Please check context and question.")
else:
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(input_ids[0][start_idx:end_idx])
    )
    print(f"Question: {question}")
    print(f"Answer: {answer}")
Training Details
Training Data
The model was fine-tuned using the SQuAD dataset, a benchmark for context-based question-answering tasks. It contains Wikipedia passages and human-annotated questions with answers.
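To show what one training example looks like, the sketch below builds a record laid out like the Hugging Face `squad` dataset (fields `id`, `title`, `context`, `question`, `answers`). The record itself is hypothetical and its values are invented for illustration; `answer_char_spans` is a helper name introduced here, not part of any library.

```python
context = "Thomas Edison is credited with inventing the light bulb. He was born in 1847."
answer_text = "1847"

# Hypothetical record in the layout of the Hugging Face `squad` dataset.
example = {
    "id": "example-0001",
    "title": "Thomas_Edison",
    "context": context,
    "question": "When was Thomas Edison born?",
    "answers": {
        "text": [answer_text],
        # Character offset of the answer inside the context.
        "answer_start": [context.index(answer_text)],
    },
}


def answer_char_spans(record):
    """Return (start, end) character offsets per answer, checking they match the context."""
    spans = []
    answers = record["answers"]
    for text, start in zip(answers["text"], answers["answer_start"]):
        end = start + len(text)
        if record["context"][start:end] != text:
            raise ValueError(f"answer {text!r} does not match context at offset {start}")
        spans.append((start, end))
    return spans


for start, end in answer_char_spans(example):
    print(start, end, example["context"][start:end])
```

During preprocessing these character offsets are typically mapped to token positions, which become the start and end labels for the span-prediction objective.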
Training Procedure
Training Objective
The model was trained to extract answers to specific questions based on a provided passage, leveraging DistilBERT's reduced architecture for faster inference and training.
Optimization
- Optimizer: AdamW
- Learning Rate Scheduler: Linear scheduler with warm-up steps
- Steps: 2000 training steps with early stopping
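The shape of a linear schedule with warm-up can be sketched in plain Python. The multiplier below follows the usual form (a linear ramp from 0 to 1, then a linear decay to 0), similar to what `get_linear_schedule_with_warmup` in Transformers applies to the optimizer's base learning rate. The warm-up length and base learning rate are assumptions, since the source does not state them; only the 2000-step total comes from the list above.

```python
def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warm-up: ramp linearly from 0 to 1.
        return step / max(1, warmup_steps)
    # Decay: fall linearly from 1 to 0 at the final step.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 2000   # from the training setup above
WARMUP_STEPS = 200   # assumption: the warm-up length is not stated
BASE_LR = 5e-5       # assumption: the base learning rate is not stated

for step in (0, 100, 200, 1100, 2000):
    lr = BASE_LR * linear_warmup_decay(step, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

With these assumed values the learning rate peaks at step 200 and reaches zero at step 2000; if early stopping ends training sooner, the decay is simply cut short.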
Hardware and Resources
- GPU: Tesla P100 or NVIDIA T4
- Framework: Hugging Face Transformers
Training Metrics
The model achieved the following metrics during training and evaluation:
| Step | Training Loss | Validation Loss | Exact Match | SQuAD F1 | Start Accuracy | End Accuracy |
|---|---|---|---|---|---|---|
| 100 | 0.719900 | 0.941330 | 84.66% | 84.66% | 84.66% | 89.92% |
| 500 | 0.640500 | 0.555793 | 84.87% | 84.87% | 84.87% | 89.92% |
| 1000 | 0.413100 | 0.551416 | 84.93% | 84.93% | 84.93% | 89.92% |
| 1500 | 0.522600 | 0.518057 | 85.17% | 85.17% | 85.17% | 89.92% |
| 2000 | 0.464500 | 0.504376 | 85.59% | 85.59% | 85.59% | 89.92% |
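As a sketch of how early stopping on validation loss interacts with these checkpoints, the helper below replays the logged validation losses. The `patience` and `min_delta` parameters are assumptions for illustration; the source says only that early stopping was used, not how it was configured.

```python
# (step, validation loss) pairs copied from the table above.
history = [
    (100, 0.941330),
    (500, 0.555793),
    (1000, 0.551416),
    (1500, 0.518057),
    (2000, 0.504376),
]


def early_stopping(history, patience=2, min_delta=0.0):
    """Return (best_step, best_loss, stopped_early) for a sequence of evaluations."""
    best_step, best_loss = None, float("inf")
    bad_evals = 0
    for step, loss in history:
        if loss < best_loss - min_delta:
            # Improvement: remember this checkpoint and reset the counter.
            best_step, best_loss = step, loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return best_step, best_loss, True
    return best_step, best_loss, False


print(early_stopping(history))  # (2000, 0.504376, False)
```

Because the validation loss improves at every logged evaluation, this rule would not trigger before step 2000, and the final checkpoint is also the best one by validation loss.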
Results Summary
The fine-tuned DistilBERT model achieved:
- Exact Match (EM): 85.59%
- F1 Score: 85.59%
- Validation Loss: 0.504376
These results indicate that DistilBERT, despite its reduced architecture, reaches solid accuracy on context-based question answering when fine-tuned on SQuAD.
BibTeX
@misc{distilbert_squad_finetune,
title = {DistilBERT Fine-tuned for SQuAD},
author = {Sadat Parvej},
year = {2024},
url = {https://huggingface.co/your-model-repository}
}