Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You
Abstract
Text-to-image generation models have recently achieved astonishing results in image quality, flexibility, and text alignment and are consequently employed in a fast-growing number of applications. Through improvements in multilingual abilities, a larger community now has access to this kind of technology. Yet, as we will show, multilingual models suffer from (gender) biases just as monolingual models do. Furthermore, one would naturally expect these models to provide similar results across languages, but this is not the case: there are important differences between languages. Thus, we propose MAGBIG, a novel benchmark intended to foster research on multilingual models without gender bias. Using MAGBIG, we investigate whether multilingual T2I models magnify gender bias. To this end, we use multilingual prompts requesting portrait images of persons with a certain occupation or trait (described via adjectives). Our results show not only that models deviate from the normative assumption that each gender should be equally likely to be generated, but also that there are substantial differences across languages. Furthermore, we investigate prompt engineering strategies, i.e. the use of indirect, gender-neutral formulations, as a possible remedy for these biases. Unfortunately, they help only to a limited extent and result in worse text-to-image alignment. Consequently, this work calls for more research into diverse representations across languages in image generators.
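To make the parity assumption concrete, the sketch below illustrates one simple way such a deviation could be measured: given predicted gender labels for generated portraits, compute how far each (language, occupation) cell strays from a 50/50 split. All data, labels, and function names here are illustrative assumptions for exposition and are not taken from the MAGBIG benchmark or the paper's actual metric.

```python
# Hypothetical sketch of a parity measurement, assuming per-image gender labels
# predicted by some classifier. Illustrative data only, not MAGBIG results.

# (language, occupation) -> predicted gender labels for the generated portraits
generations = {
    ("en", "nurse"): ["female"] * 9 + ["male"] * 1,
    ("de", "nurse"): ["female"] * 10,
    ("en", "engineer"): ["male"] * 8 + ["female"] * 2,
}

def parity_deviation(labels):
    """Absolute deviation of the female share from 0.5 (0 = parity, 0.5 = fully skewed)."""
    share_female = sum(label == "female" for label in labels) / len(labels)
    return abs(share_female - 0.5)

for (lang, occupation), labels in generations.items():
    print(f"{lang:>2} {occupation:>9}: deviation from parity = {parity_deviation(labels):.2f}")
```

Comparing such per-language scores side by side is one way to surface the cross-lingual differences the abstract refers to, since the same occupation prompt can yield very different gender distributions depending on the prompt language.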