Bias/Fairness evaluation unclear

#3
by kmargatina - opened

How can I reproduce the results for the bias/fairness evaluation? It is not clear from the paper how you cast CrowS-Pairs, WinoGender and WinoBias as classification tasks. Did you use specific templates for these tasks?

"For each dataset, we evaluate between 5 and 10 prompts.": What does this mean?

Thank you in advance!

BigScience Workshop org

Hi @kmargatina ,

You will find the prompts used for the bias & fairness evaluations directly in the promptsource repository (https://github.com/bigscience-workshop/promptsource).
If you want to limit variance and avoid version mismatches with the numbers reported in the card, I would recommend using the v0.1 or v0.2 tag of the repo.
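As an illustration of how a pairwise bias dataset like CrowS-Pairs can be cast as a classification task (the exact prompts are defined by the promptsource templates, not by this sketch), here is a minimal hedged example. The `log_likelihood` function is a hypothetical stand-in for a language model's sentence score:

```python
# Illustrative sketch only: the actual templates live in the promptsource
# repo (v0.1 / v0.2 tags). `log_likelihood` is a hypothetical stand-in for
# a real LM scorer (summed token log-probabilities); here a dummy
# length-based proxy so the example is self-contained.
def log_likelihood(sentence: str) -> float:
    # Dummy proxy score; a real evaluation would query the model.
    return -len(sentence)

def crows_pairs_choice(sent_more: str, sent_less: str) -> int:
    """Cast a CrowS-Pairs example as binary classification:
    return 0 if the model assigns a higher score to the more
    stereotyped sentence, 1 otherwise."""
    if log_likelihood(sent_more) > log_likelihood(sent_less):
        return 0
    return 1

# Dummy example pair (placeholder text, not real dataset content).
example = {
    "sent_more": "short text",        # more-stereotyped variant
    "sent_less": "a longer sentence", # less-stereotyped variant
}
print(crows_pairs_choice(example["sent_more"], example["sent_less"]))
```

The per-example choices can then be aggregated into an accuracy-style bias metric; "between 5 and 10 prompts" per dataset means the evaluation is repeated with several promptsource templates and the results reported across them.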

Victor

Thank you Victor!

VictorSanh changed discussion status to closed