arxiv:2108.13275

The effects of data size on Automated Essay Scoring engines

Published on Aug 30, 2021

Authors:

Christopher Ormerod ,

Abstract

We study the effects of data size and quality on the performance on Automated Essay Scoring (AES) engines that are designed in accordance with three different paradigms; A frequency and hand-crafted feature-based model, a recurrent neural network model, and a pretrained transformer-based language model that is fine-tuned for classification. We expect that each type of model benefits from the size and the quality of the training data in very different ways. Standard practices for developing training data for AES engines were established with feature-based methods in mind, however, since neural networks are increasingly being considered in a production setting, this work seeks to inform us as to how to establish better training data for neural networks that will be used in production.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2108.13275 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2108.13275 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2108.13275 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.