Duplicate Sentence Detection with ALBERT-base-v2

📌 Overview

This repository hosts a quantized version of the ALBERT-base-v2 model fine-tuned for Duplicate Sentence Detection. Given two sentences, the model predicts whether they convey the same meaning, returning either "duplicate" or "not duplicate" together with a confidence score. The model has been optimized for efficient deployment while maintaining reasonable accuracy, making it suitable for real-time applications.

πŸ— Model Details

  • Model Architecture: ALBERT-base-v2
  • Task: Duplicate Sentence Detection
  • Dataset: quora-question-pairs (Quora Question Pairs, via Hugging Face)
  • Model Size: ~11.7M parameters
  • Quantization: Float16 (FP16) for optimized inference
  • Fine-tuning Framework: Hugging Face Transformers (a training sketch follows below)
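
A minimal training sketch under stated assumptions is shown below. The exact recipe for this checkpoint is not published, so this uses GLUE's QQP config as a stand-in for the Quora question pairs data, and the hyperparameters (learning rate, batch size, epochs) are illustrative guesses rather than the actual configuration.

from datasets import load_dataset
from transformers import (AlbertForSequenceClassification, AlbertTokenizer,
                          Trainer, TrainingArguments)

# Assumption: GLUE's QQP config stands in for the Quora question pairs data.
dataset = load_dataset("glue", "qqp")
tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")

def tokenize(batch):
    return tokenizer(batch["question1"], batch["question2"],
                     truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AlbertForSequenceClassification.from_pretrained("albert-base-v2", num_labels=2)

args = TrainingArguments(
    output_dir="albert-duplicate-detection",
    learning_rate=2e-5,               # illustrative hyperparameters,
    per_device_train_batch_size=32,   # not the checkpoint's actual recipe
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()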

🚀 Usage

Installation

pip install transformers torch

Loading the Model

from transformers import AlbertTokenizer, AlbertForSequenceClassification
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/albert-duplicate-sentence-detection"
model = AlbertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = AlbertTokenizer.from_pretrained(model_name)

Paraphrase Detection Inference

def predict_duplicate(question1, question2, model, tokenizer):
    inputs = tokenizer(question1, question2, truncation=True, padding="max_length", max_length=128, return_tensors="pt")

    # ✅ Move inputs to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}

    with torch.no_grad():  # Disable gradient calculation
        outputs = model(**inputs)
        logits = outputs.logits

    # ✅ Get prediction
    probs = torch.softmax(logits, dim=1)
    prediction = torch.argmax(probs, dim=1).item()

    # ✅ Output the results
    label_map = {0: "Not Duplicate", 1: "Duplicate"}
    print(f"Q1: {question1}")
    print(f"Q2: {question2}")
    print(f"Prediction: {label_map[prediction]} (Confidence: {probs.max().item():.4f})\n")

# 🔍 Test Example
test_samples = [
    ("How can I learn Python quickly?", "What is the fastest way to learn Python?"),  # Duplicate
    ("What is the capital of India?", "Where is New Delhi located?"),  # Duplicate
    ("How to lose weight fast?", "What is the best programming language to learn?"),  # Not Duplicate
    ("Who is the CEO of Tesla?", "What is the net worth of Elon Musk?"),  # Not Duplicate
    ("What is machine learning?", "How does AI work?"),  # Duplicate
]
for q1, q2 in test_samples:
    predict_duplicate(q1, q2, model, tokenizer)
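
For scoring many pairs at once, batching the tokenizer call avoids a Python-level loop. The helper below (predict_duplicates_batch is a hypothetical name, not part of this repository) returns (label, confidence) tuples instead of printing:

def predict_duplicates_batch(pairs, model, tokenizer):
    """Score a list of (question1, question2) pairs in a single forward pass."""
    q1s, q2s = zip(*pairs)
    inputs = tokenizer(list(q1s), list(q2s), truncation=True,
                       padding=True, max_length=128, return_tensors="pt")
    inputs = {key: value.to(device) for key, value in inputs.items()}
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=1)
    label_map = {0: "Not Duplicate", 1: "Duplicate"}
    return [(label_map[int(p.argmax())], float(p.max())) for p in probs]

# e.g. predict_duplicates_batch(test_samples, model, tokenizer)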

📊 Quantized Model Evaluation Results

🔥 Evaluation Metrics 🔥

  • ✅ Accuracy: 0.7215
  • ✅ Precision: 0.6497
  • ✅ Recall: 0.5440
  • ✅ F1-score: 0.5922
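
The evaluation split and protocol behind these numbers are not documented in the card; metrics of this kind are typically computed along the lines of the sketch below, where y_true and y_pred stand for gold labels and model predictions collected over a held-out set (toy values shown here).

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true: gold 0/1 labels; y_pred: argmax predictions over a held-out split
y_true = [1, 0, 1, 1, 0]   # toy placeholder values
y_pred = [1, 0, 0, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(f"Accuracy: {accuracy:.4f}  Precision: {precision:.4f}  "
      f"Recall: {recall:.4f}  F1: {f1:.4f}")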

⚡ Quantization Details

Post-training quantization was applied by casting the model weights to half precision (Float16/FP16) with PyTorch, reducing model size and improving inference efficiency while largely preserving accuracy.
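
A minimal sketch of such a conversion is shown below; the paths are placeholders, and the authors' exact export script is not published.

import torch
from transformers import AlbertForSequenceClassification

# Load the fine-tuned full-precision checkpoint (placeholder path) ...
model = AlbertForSequenceClassification.from_pretrained("path/to/fp32-checkpoint")

# ... cast all floating-point weights to torch.float16 ...
model = model.half()

# ... and save the FP16 model for deployment.
model.save_pretrained("path/to/fp16-checkpoint")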

📂 Repository Structure

.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation

⚠️ Limitations

  • The model may struggle with highly nuanced paraphrases.
  • Quantization may lead to slight degradation in accuracy compared to full-precision models.
  • Performance may vary across different domains and sentence structures.

🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
