---
license: apache-2.0
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- sequence-classification
- glue
- mrpc
- bert
- transformers
---

# BERT Paraphrase Detection (GLUE MRPC)

This model is fine-tuned for **paraphrase detection** on the GLUE MRPC dataset. Given two sentences, it predicts whether they are paraphrases of each other (i.e., whether they convey the same meaning). This is a binary classification task with the following labels:

- **1**: Paraphrase
- **0**: Not a paraphrase

## Model Overview

- **Developer**: Parit Kasnal
- **Model Type**: Sequence Classification (Binary)
- **Language(s)**: English
- **Pre-trained Model**: BERT (bert-base-uncased)

## Intended Use

This model is designed to assess whether two sentences convey the same meaning. It can be applied in various scenarios, including:

- **Duplicate Question Detection**: Identifying similar questions in QA systems.
- **Plagiarism Detection**: Detecting whether content has been copied and rephrased.
- **Summarization Alignment**: Matching sentences from summaries to the original content.

## Example Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("Parit1/dummy")
tokenizer = AutoTokenizer.from_pretrained("Parit1/dummy")

# Move the model to the GPU once (if available) and disable dropout for inference
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def make_prediction(text1, text2):
    # Encode the sentence pair as a single input (BERT inserts [SEP] between them)
    inputs = tokenizer(text1, text2, truncation=True, padding=True, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model(**inputs)
    # Pick the higher-scoring class: 1 = paraphrase, 0 = not a paraphrase
    return torch.argmax(outputs.logits, dim=-1).item()

# Example usage
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "A fast brown fox leaps over a lazy dog."
prediction = make_prediction(text1, text2)
print(f"Prediction: {prediction}")
```
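
The function returns a raw class id. If the checkpoint was saved without custom `id2label` names, `transformers` falls back to generic `LABEL_0`/`LABEL_1`, so an explicit mapping keeps the output readable. The label strings below are hypothetical names mirroring the label list above:

```python
# Hypothetical human-readable names for the two class ids defined above
id2label = {0: "not a paraphrase", 1: "paraphrase"}
print(f"The sentences are: {id2label[prediction]}")
```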

## Training Details

### Training Data
The model was fine-tuned on the **GLUE MRPC** dataset, which contains pairs of sentences labeled as either paraphrases or not.
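
To reproduce the data pipeline, MRPC can be loaded through the `datasets` library. A minimal sketch (column names follow the standard GLUE/MRPC schema):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# GLUE MRPC: ~3.7k training pairs with columns sentence1, sentence2, label
dataset = load_dataset("glue", "mrpc")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

def tokenize(batch):
    # Tokenize each sentence pair as a single BERT input
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)
```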

### Training Procedure
- **Number of Epochs**: 2
- **Metrics Used**:
  - Accuracy
  - Precision
  - Recall
  - F1 Score
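
The card does not record the optimizer, learning rate, or batch size. A minimal `Trainer`-based sketch continuing from the data-loading snippet above, with assumed hyperparameters (batch size 16 and learning rate 2e-5 are hypothetical, not stated in the card):

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Fresh BERT checkpoint with a randomly initialized 2-way classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "google-bert/bert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir="bert-mrpc",
    num_train_epochs=2,              # matches the card
    per_device_train_batch_size=16,  # assumed; not stated in the card
    learning_rate=2e-5,              # assumed; a common BERT fine-tuning value
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],       # from the data sketch above
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,                    # enables dynamic padding
)
trainer.train()
print(trainer.evaluate())
```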

#### Training Logs (Summary)
| **Epoch** | **Avg Loss** | **Accuracy** | **Precision** | **Recall** | **F1 Score** |
|-----------|--------------|--------------|---------------|------------|--------------|
| **1**     | 0.5443       | 73.45%       | 72.28%        | 73.45%     | 70.83%       |
| **2**     | 0.2756       | 89.34%       | 89.25%        | 89.34%     | 89.27%       |

## Evaluation

### Performance Metrics
The model's performance was evaluated using the following metrics:

- **Accuracy**: Percentage of correct predictions.
- **Precision**: Proportion of positive identifications that were actually correct.
- **Recall**: Proportion of actual positives that were correctly identified.
- **F1 Score**: The harmonic mean of Precision and Recall.
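
The tables do not state the averaging scheme for precision, recall, and F1. Recall equaling accuracy in every row is consistent with weighted averaging, where weighted recall reduces to accuracy. A sketch of that computation with scikit-learn (the averaging choice is an assumption, not confirmed by the card):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(y_true, y_pred):
    # average="weighted" is inferred from recall == accuracy in the tables;
    # weighted recall is mathematically equal to accuracy.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted"
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```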

### Test Set Results
| **Epoch** | **Avg Loss** | **Accuracy** | **Precision** | **Recall** | **F1 Score** |
|-----------|--------------|--------------|---------------|------------|--------------|
| **1**     | 0.3976       | 82.60%       | 82.26%        | 82.60%     | 81.93%       |
| **2**     | 0.3596       | 84.80%       | 84.94%        | 84.80%     | 84.87%       |