Different scores from the three examples (Full match, Partial match, No match) on the App page

#7
by wentau - opened

I am using unbabel-comet==2.2.4 with the wmt22-comet-da model, and I get the following scores for the three examples:

Full match [1.0, 1.0]
Partial match [0.84, 0.97]
No match [0.47, 0.43]

Is the example using a different model?
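
For reference, here is a minimal sketch of how scores like these can be computed locally. The helper names `build_samples` and `score_samples` are hypothetical, and the sentences passed in are up to the caller (they are not copied from the App page). It assumes the standard `download_model` / `load_from_checkpoint` / `predict` calls from unbabel-comet and the `Unbabel/wmt22-comet-da` checkpoint name.

```python
from typing import Dict, List


def build_samples(
    sources: List[str], translations: List[str], references: List[str]
) -> List[Dict[str, str]]:
    """Pack parallel lists into the src/mt/ref dicts that COMET expects."""
    if not (len(sources) == len(translations) == len(references)):
        raise ValueError("sources, translations and references must have equal length")
    return [
        {"src": src, "mt": mt, "ref": ref}
        for src, mt, ref in zip(sources, translations, references)
    ]


def score_samples(
    samples: List[Dict[str, str]], model_name: str = "Unbabel/wmt22-comet-da"
) -> List[float]:
    """Score samples with a COMET checkpoint (downloads the model on first use)."""
    # Imported lazily so build_samples works without unbabel-comet installed.
    from comet import download_model, load_from_checkpoint

    model = load_from_checkpoint(download_model(model_name))
    # gpus=0 runs on CPU; raise it if a GPU is available.
    output = model.predict(samples, batch_size=8, gpus=0)
    return [round(score, 2) for score in output.scores]
```

Calling `score_samples(build_samples(sources, translations, references))` with the exact sentences from each App example should make it easy to compare the local output against what the page shows.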
