Papers
arxiv:2306.07384

Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling

Published on Jun 12, 2023
Abstract

With their increasing size, large language models (LLMs) are becoming increasingly good at language understanding tasks. But even with high performance on specific downstream tasks, LLMs fail at simple linguistic tests for negation or quantifier understanding. Previous work on quantifier understanding in LLMs shows inverse scaling in understanding few-type quantifiers. In this paper, we question the claims of previous work and show that they are a result of inappropriate testing methodology. We also present alternate methods to measure quantifier comprehension in LLMs and show that LLMs are able to better understand the difference between the meaning of few-type and most-type quantifiers as their size increases, although they are not particularly good at it. We also observe inverse scaling for most-type quantifier understanding, where the model's understanding of most-type quantifiers gets worse as the model size increases, which is contrary to human psycholinguistic experiments and previous work. We do this evaluation on models ranging from 125M to 175B parameters, which suggests that LLMs do not do as well as expected with quantifiers. We also discuss the possible reasons for this and the relevance of quantifier understanding in evaluating language understanding in LLMs.
