context size / sampling params / reasoning

#1
by owao - opened

Hey! Thank you very much for putting it out there! And thanks for the effort of making it available through llama.cpp!

Some questions:

  • is the context window natively 131072 tokens, without rope scaling?
  • what value would you recommend for max_new_tokens?
  • which sampling params matter, and what values do you recommend for them?
  • is this strictly a non-reasoning model? I didn't find any special tokens like <think> or similar (how I checked is sketched below), but the model's name confused me.
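
For reference, here is roughly how I looked for the native context length and for reasoning tokens; a minimal sketch with transformers, where `org/model-name` stands in for this repo's id:

```python
from transformers import AutoConfig, AutoTokenizer

repo_id = "org/model-name"  # placeholder for this repo's id

# The native context window and any rope scaling live in config.json
config = AutoConfig.from_pretrained(repo_id)
print("max_position_embeddings:", getattr(config, "max_position_embeddings", None))
print("rope_scaling:", getattr(config, "rope_scaling", None))  # None -> no rope scaling configured

# Look for reasoning markers such as <think> among the special tokens
tokenizer = AutoTokenizer.from_pretrained(repo_id)
print("special tokens:", tokenizer.all_special_tokens)
```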

Thanks in advance!
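
In case a concrete target helps: this is where your recommendations would plug in on my end with llama-cpp-python; the model path and every numeric value below are placeholders, not suggested settings.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder path to the GGUF file
    n_ctx=131072,             # assumes the full native window (see question 1)
)

out = llm(
    "Hello!",
    max_tokens=1024,    # i.e. max_new_tokens: what would you recommend here?
    temperature=0.7,    # placeholder sampling values,
    top_p=0.9,          # pending the recommended ones
)
print(out["choices"][0]["text"])
```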

owao changed discussion title from context size and sampling params to context size / sampling params / reasoning

No updates?
