

distilbert/distilbert-base-uncased Fine-Tuned on SQuAD

distilbert_squad

A question-answering model fine-tuned on the SQuAD dataset. It adapts the pretrained DistilBERT model (distilbert/distilbert-base-uncased) to extract answers from passages.


Model Description

distilbert_squad is a lightweight transformer model fine-tuned for context-based question answering. It adapts the pretrained DistilBERT architecture to extract answer spans from passages. The model was fine-tuned on the Stanford Question Answering Dataset (SQuAD), and DistilBERT's small size makes it suitable for resource-constrained environments.

Fine-tuned by: SADAT PARVEJ
Shared by: SADAT PARVEJ

Language(s) (NLP): ENGLISH
Finetuned from model: distilbert/distilbert-base-uncased


Training Objective

Given a question and a passage, the model predicts the start and end positions of the text span in the passage that best answers the question. Fine-tuning uses supervised question-answer pairs from SQuAD.
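As an illustration of span prediction, the sketch below picks the best answer span from per-token start and end scores. It is a minimal example with made-up logits, not this model's decoding code; the function name `best_span` and the `max_answer_len` cap are assumptions introduced here.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) maximizing start_logit + end_logit with start <= end."""
    best = (0, 0, float("-inf"))
    for s, s_logit in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length cap.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Toy logits for a 6-token input (illustrative numbers, not model output).
start_logits = [0.1, 2.5, 0.3, 0.2, 1.0, 0.0]
end_logits = [0.0, 0.4, 3.1, 0.2, 0.1, 0.5]
start, end, score = best_span(start_logits, end_logits)
print(start, end)  # 1 2
```

Searching over pairs with start <= end avoids the invalid spans that can occur when the start and end positions are chosen independently.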


Performance Benchmarks

Training Loss: 0.464500
Validation Loss: 0.504376
Exact Match (EM): 85.59%
F1 Score: 85.59%
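
For readers unfamiliar with these metrics, the sketch below shows how Exact Match and token-level F1 are typically computed for SQuAD-style answers. It follows the usual normalization (lowercasing, stripping punctuation and the articles a/an/the) but is a simplified illustration, not the evaluation code used for this model; it scores a single reference answer rather than taking the maximum over several.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    num_same = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The light bulb", "light bulb."))  # 1.0
print(exact_match("Thomas Edison", "Edison"))        # 0.0
```

F1 gives partial credit for overlapping tokens, so per example it is never lower than Exact Match.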


Intended Uses & Limitations

This model is designed for tasks such as:

  • Extractive Question Answering
  • Reading comprehension applications

Known Limitations:

  • As DistilBERT is distilled from BERT, its smaller architecture may limit its performance compared to the original BERT model on certain tasks.
  • Predictions may reflect biases in the training data, and the model may generalize poorly to inputs that differ from SQuAD's structured, fact-based question-answer pairs.

How to Get Started with the Model

Use the code below to get started with the model:

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Load the model and tokenizer
# Placeholder repo id: replace it with this model's actual Hugging Face Hub path
model_name = "YourHuggingFaceModelPath/distilbert_squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

context = """
Thomas Edison is credited with inventing the light bulb. He was born in 1847 and was a prolific inventor.
"""
question = "Who invented the light bulb?"

inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=512)
input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)

# Perform inference
with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    start_scores = outputs.start_logits
    end_scores = outputs.end_logits

# Get the most likely start and end token indices (end is made exclusive for slicing)
start_idx = torch.argmax(start_scores)
end_idx = torch.argmax(end_scores) + 1

# Decode the answer
if start_idx >= end_idx:
    print("Model did not predict a valid answer. Please check context and question.")
else:
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(input_ids[0][start_idx:end_idx])
    )
    print(f"Question: {question}")
    print(f"Answer: {answer}")
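
The tokenizer call above truncates the input at 512 tokens, so an answer located past that limit in a long passage cannot be found. One common workaround is to split the passage into overlapping windows and run the model on each. The sketch below shows only the windowing logic on a plain list of token ids; `sliding_windows`, `max_len`, and `stride` are hypothetical names chosen for this example. Hugging Face tokenizers expose comparable behaviour through the `stride` and `return_overflowing_tokens` arguments.

```python
from typing import List


def sliding_windows(token_ids: List[int], max_len: int, stride: int) -> List[List[int]]:
    """Split token_ids into windows of at most max_len tokens that overlap by `stride` tokens."""
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    step = max_len - stride
    windows = []
    for start in range(0, len(token_ids), step):
        windows.append(token_ids[start:start + max_len])
        # Stop once a window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
    return windows


windows = sliding_windows(list(range(10)), max_len=4, stride=2)
print(windows)  # [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

The overlap makes it less likely that an answer is cut in half at a window boundary.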

Training Details

Training Data

The model was fine-tuned using the SQuAD dataset, a benchmark for context-based question-answering tasks. It contains Wikipedia passages and human-annotated questions with answers.

Training Procedure

Training Objective
The model was trained to extract answers to specific questions based on a provided passage, leveraging DistilBERT's reduced architecture for faster inference and training.
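
As a sketch of this objective, extractive QA heads are usually trained with the average of two cross-entropy losses, one over start positions and one over end positions. The pure-Python example below illustrates that computation for a single example; it is an assumption about the standard setup, not code taken from this model's training run, and the function names are hypothetical.

```python
import math
from typing import List


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of `target` under softmax(logits), computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def span_loss(start_logits: List[float], end_logits: List[float],
              start_pos: int, end_pos: int) -> float:
    """Average of the start-position and end-position cross-entropy losses."""
    return 0.5 * (cross_entropy(start_logits, start_pos)
                  + cross_entropy(end_logits, end_pos))


# With uniform logits over 4 tokens, each loss equals log(4).
loss = span_loss([0.0] * 4, [0.0] * 4, start_pos=1, end_pos=2)
print(round(loss, 4))  # 1.3863
```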

Optimization

  • Optimizer: AdamW
  • Learning Rate Scheduler: Linear scheduler with warm-up steps
  • Steps: 2000 training steps with early stopping
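
To make the scheduler concrete, the sketch below computes the learning-rate multiplier of a linear schedule with warm-up over the 2000 training steps. The warm-up length (200 steps) and the base learning rate (5e-5) are hypothetical values chosen for illustration; the card does not state them.

```python
def linear_warmup_decay(step: int, total_steps: int = 2000, warmup_steps: int = 200) -> float:
    """LR multiplier: rises linearly from 0 to 1 during warm-up, then decays linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


base_lr = 5e-5  # hypothetical base learning rate, not taken from the card
for step in (0, 100, 200, 1100, 2000):
    print(step, base_lr * linear_warmup_decay(step))
```

The multiplier peaks at 1.0 when warm-up ends and reaches 0.0 at the final step.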

Hardware and Resources

  • GPU: NVIDIA Tesla P100 or NVIDIA T4
  • Framework: Hugging Face Transformers

Training Metrics

The model achieved the following metrics during training and evaluation:

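| Step | Training Loss | Validation Loss | Exact Match | SQuAD F1 | Start Accuracy | End Accuracy |
|------|---------------|-----------------|-------------|----------|----------------|--------------|
| 100  | 0.719900      | 0.941330        | 84.66%      | 84.66%   | 84.66%         | 89.92%       |
| 500  | 0.640500      | 0.555793        | 84.87%      | 84.87%   | 84.87%         | 89.92%       |
| 1000 | 0.413100      | 0.551416        | 84.93%      | 84.93%   | 84.93%         | 89.92%       |
| 1500 | 0.522600      | 0.518057        | 85.17%      | 85.17%   | 85.17%         | 89.92%       |
| 2000 | 0.464500      | 0.504376        | 85.59%      | 85.59%   | 85.59%         | 89.92%       |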
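
The training setup mentions early stopping, and the sketch below shows one simple patience-based rule applied to the validation losses reported above. The rule itself, the `patience` value, and the function name are assumptions for illustration; the card does not say which criterion was used.

```python
from typing import List, Optional


def early_stop_step(steps: List[int], val_losses: List[float],
                    patience: int = 2, min_delta: float = 0.0) -> Optional[int]:
    """Return the step at which training would stop, or None if it never triggers."""
    best = float("inf")
    bad_evals = 0
    for step, loss in zip(steps, val_losses):
        if loss < best - min_delta:
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return step
    return None


steps = [100, 500, 1000, 1500, 2000]
val_losses = [0.941330, 0.555793, 0.551416, 0.518057, 0.504376]
print(early_stop_step(steps, val_losses))  # None
```

Because the validation loss improves at every reported evaluation, this rule would not have stopped training before step 2000.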

Results Summary

The fine-tuned DistilBERT model achieved:

  • Exact Match (EM): 85.59%
  • F1 Score: 85.59%
  • Validation Loss: 0.504376

These results show that DistilBERT, fine-tuned on SQuAD, combines efficiency with good accuracy on context-based question answering.


BibTeX

@misc{distilbert_squad_finetune,
  title = {DistilBERT Fine-tuned for SQuAD},
  author = {Sadat Parvej},
  year = {2024},
  url = {https://huggingface.co/your-model-repository}
}

