Papers
arxiv:2306.07384

Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling

Published on Jun 12, 2023
Abstract

With their increasing size, large language models (LLMs) are becoming increasingly good at language understanding tasks. But even with high performance on specific downstream tasks, LLMs fail at simple linguistic tests for negation or quantifier understanding. Previous work on quantifier understanding in LLMs shows inverse scaling in understanding few-type quantifiers. In this paper, we question the claims of previous work and show that they are a result of inappropriate testing methodology. We also present alternate methods to measure quantifier comprehension in LLMs and show that LLMs are able to better understand the difference between the meaning of few-type and most-type quantifiers as their size increases, although they are not particularly good at it. We also observe inverse scaling for most-type quantifier understanding, where the model's understanding of most-type quantifiers gets worse as the model size increases, which is contrary to human psycholinguistic experiments and previous work. We do this evaluation on models ranging from 125M to 175B parameters, which suggests that LLMs do not do as well as expected with quantifiers. We also discuss the possible reasons for this and the relevance of quantifier understanding in evaluating language understanding in LLMs.
