Model Card for appropriateness-classifier-multilabel
This model classifies an argument as appropriate or inappropriate. If the argument is inappropriate, the model also predicts the reasons for the inappropriateness, following the hierarchical taxonomy of our corpus. For further details on (in)appropriateness, see the paper cited below and the corpora used for training.
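As a rough illustration of what "hierarchical" means here, the sketch below groups predicted label names under their top-level reason. The parent/child layout in `TAXONOMY` is an assumption read off the label order used in the getting-started code and the accompanying paper, and `group_reasons` is a hypothetical helper, not part of the released model.

```python
# Assumed parent -> children layout of the taxonomy (not shipped with the model).
ROOT = "Inappropriateness"
TAXONOMY = {
    "Toxic Emotions": ["Excessive Intensity", "Emotional Deception"],
    "Missing Commitment": ["Missing Seriousness", "Missing Openness"],
    "Missing Intelligibility": ["Unclear Meaning", "Missing Relevance", "Confusing Reasoning"],
    "Other Reasons": ["Detrimental Orthography", "Reason Unclassified"],
}


def group_reasons(predicted):
    """Group predicted label names by top-level reason.

    Returns an empty dict when the root label is absent, i.e. when the
    argument is considered appropriate.
    """
    predicted = set(predicted)
    if ROOT not in predicted:
        return {}
    grouped = {}
    for parent, children in TAXONOMY.items():
        active = [child for child in children if child in predicted]
        # Keep a parent if it was predicted itself or if any child was predicted.
        if parent in predicted or active:
            grouped[parent] = active
    return grouped


print(group_reasons(["Inappropriateness", "Toxic Emotions", "Excessive Intensity"]))
```

Keeping a parent whenever one of its children fires is a design choice of this sketch; the classifier predicts each label independently, so parent and child predictions are not guaranteed to agree.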
Model Details
Model Sources
- Repository: https://github.com/timonziegenbein/appropriateness-corpus
- Paper: Modeling Appropriate Language in Argumentation
How to Get Started with the Model
Use the code below to get started with the model.
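from typing import Optional, Tuple, Union

import torch
from torch import nn
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss
from transformers import AutoTokenizer
from transformers.modeling_outputs import SequenceClassifierOutput
# Explicit imports instead of a wildcard import, so every name used below is visible.
from transformers.models.deberta_v2.modeling_deberta_v2 import (
    ContextPooler,
    DebertaV2Model,
    DebertaV2PreTrainedModel,
)

try:
    from transformers.models.deberta_v2.modeling_deberta_v2 import StableDropout
except ImportError:
    # Fallback in case the installed transformers release does not provide
    # StableDropout; dropout is inactive at inference time either way.
    from torch.nn import Dropout as StableDropout
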
# Label names, in the order of the model's 14 output logits.
DIMS = [
    'Inappropriateness',
    'Toxic Emotions',
    'Excessive Intensity',
    'Emotional Deception',
    'Missing Commitment',
    'Missing Seriousness',
    'Missing Openness',
    'Missing Intelligibility',
    'Unclear Meaning',
    'Missing Relevance',
    'Confusing Reasoning',
    'Other Reasons',
    'Detrimental Orthography',
    'Reason Unclassified'
]


class AppropriatenessMultilabelModel(DebertaV2PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)

        num_labels = getattr(config, "num_labels", 2)
        self.num_labels = num_labels

        self.deberta = DebertaV2Model(config)
        self.pooler = ContextPooler(config)
        output_dim = self.pooler.output_dim

        self.classifier = nn.Linear(output_dim, num_labels)
        drop_out = getattr(config, "cls_dropout", None)
        drop_out = self.config.hidden_dropout_prob if drop_out is None else drop_out
        self.dropout = StableDropout(drop_out)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[torch.Tensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
        token_type_ids: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.Tensor] = None,
        inputs_embeds: Optional[torch.Tensor] = None,
        labels: Optional[torch.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple, SequenceClassifierOutput]:
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.deberta(
            input_ids,
            token_type_ids=token_type_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        # Pool the encoder output and project it to one logit per label.
        encoder_layer = outputs[0]
        pooled_output = self.pooler(encoder_layer)
        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)

        # The loss is only computed when labels are passed (training or evaluation).
        loss = None
        if labels is not None:
            if self.config.problem_type is None:
                if self.num_labels == 1:
                    self.config.problem_type = "regression"
                elif self.num_labels > 1 and (labels.dtype == torch.long or labels.dtype == torch.int):
                    self.config.problem_type = "single_label_classification"
                else:
                    self.config.problem_type = "multi_label_classification"

            if self.config.problem_type == "regression":
                loss_fct = MSELoss()
                if self.num_labels == 1:
                    loss = loss_fct(logits.squeeze(), labels.squeeze())
                else:
                    loss = loss_fct(logits, labels)
            elif self.config.problem_type == "single_label_classification":
                loss_fct = CrossEntropyLoss()
                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
            elif self.config.problem_type == "multi_label_classification":
                # pos_weights is never set in this snippet, so fall back to an
                # unweighted loss unless the attribute was assigned after loading.
                loss_fct = BCEWithLogitsLoss(pos_weight=getattr(self, "pos_weights", None))
                loss = loss_fct(logits, labels)

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return SequenceClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )


# Use the GPU when one is available, otherwise run on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("timonziegenbein/appropriateness-classifier-multilabel")
model = AppropriatenessMultilabelModel.from_pretrained(
    "timonziegenbein/appropriateness-classifier-multilabel",
    num_labels=14,
    problem_type="multi_label_classification",
).to(device)
model.eval()

argument = '''Towed three times and impounded for 30 days each time? Man, you're just not getting the message, are you? If you are in California, you bet the police can forfeit your vehicle and it doesn't take three times to make it a charm. Technically, your vehicle could be subject to forfeiture proceedings after your first suspended license beef. Someone like you is exactly the reason the legislature designed that law, because your privilege to drive has been taken away from you and yet you obviously continue to drive. People like you are involved in an exponentially higher than average number of traffic accidents so the legislature figured maybe people like you should have your vehicles forfeited to the state if you just didn't go along with the game plan. Voila - I give you California Vehicle Code section 14607.6...and a link to it below. It would also be worth your time to review 14607.4, whether or not you live in California. You really need to stop driving. Really.'''

inputs = tokenizer(argument, return_tensors="pt").to(device)
with torch.no_grad():
    logits = model(**inputs).logits

# The sigmoid yields one independent probability per label;
# rounding applies a 0.5 decision threshold.
probabilities = torch.sigmoid(logits).cpu()
predictions = torch.round(probabilities).numpy()[0]

for dim, prediction in zip(DIMS, predictions):
    print(dim + ': ' + str(bool(prediction)))

# Inappropriateness: True
# Toxic Emotions: True
# Excessive Intensity: True
# Emotional Deception: True
# Missing Commitment: True
# Missing Seriousness: True
# Missing Openness: True
# Missing Intelligibility: True
# Unclear Meaning: True
# Missing Relevance: True
# Confusing Reasoning: False
# Other Reasons: False
# Detrimental Orthography: False
# Reason Unclassified: False
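
The decoding step above (sigmoid, then a 0.5 cut-off) can also be written without any tensor library, which makes it easier to experiment with a different threshold. The following is a minimal sketch: `decode_labels` and the example logits are made up for illustration and are not model output. The strict `>` comparison mirrors `torch.round`, which rounds a probability of exactly 0.5 down to 0.

```python
import math

DIMS = [
    "Inappropriateness", "Toxic Emotions", "Excessive Intensity",
    "Emotional Deception", "Missing Commitment", "Missing Seriousness",
    "Missing Openness", "Missing Intelligibility", "Unclear Meaning",
    "Missing Relevance", "Confusing Reasoning", "Other Reasons",
    "Detrimental Orthography", "Reason Unclassified",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_labels(logits, threshold=0.5):
    """Return the names of all labels whose probability exceeds the threshold."""
    if len(logits) != len(DIMS):
        raise ValueError(f"expected {len(DIMS)} logits, got {len(logits)}")
    return [dim for dim, logit in zip(DIMS, logits) if sigmoid(logit) > threshold]


# Made-up logits for illustration only.
example_logits = [2.0, 1.5, 0.3, -1.0, -2.0, -3.0, -0.5] + [-4.0] * 7

print(decode_labels(example_logits))
# ['Inappropriateness', 'Toxic Emotions', 'Excessive Intensity']
print(decode_labels(example_logits, threshold=0.6))
# ['Inappropriateness', 'Toxic Emotions']
```

With the real model, a single row of logits could be passed as `decode_labels(logits[0].tolist())`. Raising the threshold trades recall for precision; the 0.5 default is simply what the rounding in the original snippet implies.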
Citation
If you are interested in using the corpus, please cite the following paper: Modeling Appropriate Language in Argumentation (Ziegenbein et al., ACL 2023)