When answering questions in Chinese, the model frequently terminates prematurely (outputs the end token). Is this a common problem?

#40
by zhangw355 - opened

As described in the title: I downloaded the model and ran inference with both the Transformers library and the vLLM framework. When asking questions in Chinese, the answers often terminate prematurely (the model emits the end-of-sequence token too early). Is this a known issue? What are the likely causes, and how can it be fixed?
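For reference, one common workaround people suggest for early EOS is to forbid the end token until a minimum number of new tokens has been generated (this is what the `min_new_tokens` generation parameter does in Transformers). A minimal sketch of the idea, independent of any particular model, where `eos_id` and the logit values are placeholder assumptions:

```python
import math

def mask_eos(logits, eos_id, generated_len, min_new_tokens):
    """Force the EOS logit to -inf until min_new_tokens have been produced,
    so sampling cannot pick the end token prematurely."""
    if generated_len < min_new_tokens:
        logits = list(logits)          # copy so the caller's list is untouched
        logits[eos_id] = -math.inf     # EOS becomes unselectable
    return logits

# Hypothetical 4-token vocabulary with EOS at index 3:
early = mask_eos([1.0, 0.5, 0.2, 2.0], eos_id=3, generated_len=2, min_new_tokens=10)
late = mask_eos([1.0, 0.5, 0.2, 2.0], eos_id=3, generated_len=12, min_new_tokens=10)
```

In practice you would not implement this by hand: with Transformers you can pass `min_new_tokens` to `model.generate(...)`, which applies the equivalent logit mask internally.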
