ELECTRA small discriminator QUT unacceptable student feedback

Model Details

Model Description: This is a checkpoint of electra-small-discriminator fine-tuned on Queensland University of Technology (QUT) Student Voice Survey (SVS) comments.

The model is used to automatically detect unacceptable comments left by students in QUT's Student Evaluation of Teaching (SET) surveys, the SVS.

Examples of unacceptable comments are those containing abuse, insults, discrimination, indications of a risk of harm, or otherwise deemed unacceptable according to the QUT Evaluation of Courses, Units, Teaching and Student Experience Policy.

Comments are labelled as Acceptable (0) or Unacceptable (1), with the probability of the classification used to further discern which comments are screened manually by University staff.
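
As an illustration of how a classification probability could gate manual screening, here is a minimal sketch; the review_required helper and the threshold value are hypothetical and do not reflect QUT's actual screening rule.

# Hypothetical sketch only: the 0.5 threshold is illustrative, not QUT's rule.
def review_required(unacceptable_probability: float, threshold: float = 0.5) -> bool:
    """Flag a comment for manual screening by University staff when the model's
    probability for the Unacceptable (1) class exceeds the threshold."""
    return unacceptable_probability >= threshold

print(review_required(0.91))  # True  -> route to manual screening
print(review_required(0.07))  # False -> treated as Acceptable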

This model serves as part of QUT's methodology for screening SVS comments to prevent harm to the wellbeing and career prospects of academics, further detailed here.

  • Developed by:
    • Rick Somers - Evaluation Support Officer, QUT
    • Sam Cunningham - Senior Lecturer, QUT
  • Model Type: Text Classification
  • Language(s): English
  • License: Apache-2.0
  • Parent Model: For more details about ELECTRA, we encourage users to check out this model card
  • Resources for more information:

How to Get Started With the Model

The model can be used directly through our interactive Google Colab Notebook for ease of use.

We also have a web app, which can be made available upon special request.

Alternatively, it can be used through the transformers library. Here is an example:

import torch
from transformers import AutoTokenizer, ElectraForSequenceClassification

# Load the fine-tuned tokenizer and classification model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("rickSomers/electra-small-QUT-unacceptable-student-feedback")
model = ElectraForSequenceClassification.from_pretrained("rickSomers/electra-small-QUT-unacceptable-student-feedback")

# Tokenize a single comment and run inference without tracking gradients
inputs = tokenizer("Thank you for your engaging tutorials", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit back to its label name (Acceptable/Unacceptable)
predicted_class_id = logits.argmax().item()
print(model.config.id2label[predicted_class_id])
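
To obtain the class probability used for screening decisions (rather than only the argmax label), the logits can be passed through a softmax. This continues from the snippet above and uses standard PyTorch operations:

# Convert logits to class probabilities; index 1 corresponds to Unacceptable
# under the 0/1 labelling scheme described above.
probabilities = torch.softmax(logits, dim=-1)
unacceptable_probability = probabilities[0, 1].item()
print(f"P(Unacceptable) = {unacceptable_probability:.3f}")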

Uses

Direct Use

This model is intended to be used for text classification. You can use the fine-tuned model directly for acceptable/unacceptable text classification, or it can be further fine-tuned for downstream tasks.
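
For direct use, the transformers pipeline API offers a more compact alternative to the snippet above; a minimal sketch (the printed label names come from the model's config):

from transformers import pipeline

# Text-classification pipeline built on the fine-tuned checkpoint
classifier = pipeline(
    "text-classification",
    model="rickSomers/electra-small-QUT-unacceptable-student-feedback",
)
print(classifier("Thank you for your engaging tutorials"))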

Misuse and Out-of-scope Use

This model should not be used without supporting processes and/or human evaluation. It should not be used to intentionally create hostile, harmful, or alienating environments for people.

Risks, Limitations and Biases

Because the training data come from a single large, urban, Australian university, this model could produce biased predictions.

The performance of the model in other contexts is therefore unknown. While we believe it could be beneficial for other institutions to adopt it in their SET screening, nuanced differences in question type, language styles and choices, and local context could lead to varying performance.

We strongly advise users to investigate their needs and use case for this model, and to evaluate its performance thoroughly before formally implementing it in policies and processes. It is not recommended to use this model as the sole screening mechanism for harmful content.
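
One way to carry out such an evaluation is to score a locally labelled sample of comments and inspect per-class precision and recall. The sketch below assumes a local CSV with text and label (0/1) columns; the file name and column names are placeholders, not part of this model's release.

import pandas as pd
import torch
from sklearn.metrics import classification_report
from transformers import AutoTokenizer, ElectraForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("rickSomers/electra-small-QUT-unacceptable-student-feedback")
model = ElectraForSequenceClassification.from_pretrained("rickSomers/electra-small-QUT-unacceptable-student-feedback")
model.eval()

# Placeholder file: a locally labelled sample with "text" and "label" (0/1) columns
sample = pd.read_csv("local_labelled_sample.csv")

predictions = []
with torch.no_grad():
    for text in sample["text"]:
        inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
        logits = model(**inputs).logits
        predictions.append(logits.argmax(dim=-1).item())

# Per-class precision/recall/F1; pay particular attention to recall on class 1
print(classification_report(sample["label"], predictions, target_names=["Acceptable", "Unacceptable"]))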

Training

Training Data

The model was trained on 2021-2022 QUT SVS qualitative survey responses. Each comment was assigned a label of 0 (acceptable) or 1 (unacceptable). Labels were derived from previous keyword-matching algorithms, deprecated unacceptable-comment machine learning models, and staff-raised lists.

Training Procedure

Training was undertaken over several months to achieve optimal performance. Because unacceptable comments are relatively rare, oversampling techniques were used routinely to better balance the dataset.
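
The exact balancing technique is not specified here, but as an illustration, random oversampling of the minority (unacceptable) class with pandas might look like this; the file and column names are placeholders.

import pandas as pd

# Placeholder DataFrame with "text" and "label" columns (1 = Unacceptable)
df = pd.read_csv("training_comments.csv")

majority = df[df["label"] == 0]
minority = df[df["label"] == 1]

# Randomly oversample the minority class (with replacement) to match the
# majority class size, then shuffle the combined dataset
oversampled_minority = minority.sample(n=len(majority), replace=True, random_state=42)
balanced = pd.concat([majority, oversampled_minority]).sample(frac=1, random_state=42)
print(balanced["label"].value_counts())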

Fine-tuning hyper-parameters

  • learning_rate = 5e-5
  • batch_size = 8
  • max_position_embeddings = 512
  • num_train_epochs = 2
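
A minimal Trainer sketch using the hyper-parameters listed above; this is an illustrative reconstruction, not the original training script, and the tiny in-line dataset stands in for the labelled SVS comments.

from datasets import Dataset
from transformers import (
    AutoTokenizer,
    ElectraForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=2
)

# Tiny placeholder dataset; in practice this would be the labelled SVS comments
raw = Dataset.from_dict({
    "text": ["Thank you for your engaging tutorials", "A comment that would be labelled unacceptable"],
    "label": [0, 1],
})
train_dataset = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
)

# Hyper-parameters as listed above; the 512-token limit is applied at tokenisation time
training_args = TrainingArguments(
    output_dir="electra-small-unacceptable-feedback",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()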

Evaluation and Updates

The performance of the model on new SVS responses is continually monitored and improved.

Internally, the model is routinely evaluated on each semester's data. To improve performance, it is often either completely retrained from its base model or further fine-tuned with the inclusion of new text data.
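
In code, those two update paths correspond to two different starting checkpoints; a brief sketch:

from transformers import ElectraForSequenceClassification

# Option (a): full retrain starting from the base discriminator
base_model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=2
)

# Option (b): continued fine-tuning from the existing fine-tuned checkpoint
existing_model = ElectraForSequenceClassification.from_pretrained(
    "rickSomers/electra-small-QUT-unacceptable-student-feedback"
)

# Either model would then be trained on the expanded (old + new semester) data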

Note that the model uploaded here is a deprecated version (trained on only 2021 & 2022 data) of the current model used by QUT.
