DPO Version

#5
by KnutJaegersberg - opened

How would this model behave if it were given the UltraLM DPO training?
