Do you have a training loss log that I can use for reference?
#51
by
junliu44
- opened
I want to run a training test as a benchmark.
junliu44
changed discussion status to
closed
@loubnabnl thank you very much!
BTW, I have another question about a detail:
For the training dataset, does StarCoder use document-level sampling, or sampling based on context-length segmentation?
e.g. | code file example 1 | code file example 2 | ...... | or | ctx_length code snippet 1 | ctx_length code snippet 2 | ...... |
junliu44
changed discussion status to
open
We do sequence packing: tokenized documents are concatenated and separated by eos_token, then split into sequences of 8192 tokens that we sample from.
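For anyone trying to reproduce this, here is a minimal sketch of the packing step as described above. It is not the actual StarCoder preprocessing code: the function name `pack_sequences`, the EOS id `0`, appending EOS after the last document, and dropping the trailing partial chunk are all assumptions made for illustration.

```python
from typing import Iterable, List


def pack_sequences(docs: Iterable[List[int]], eos_id: int, seq_len: int) -> List[List[int]]:
    """Concatenate tokenized documents, separated by eos_id, and cut into fixed-length chunks.

    Assumptions (not confirmed in this thread): an EOS token is also appended
    after the last document, and a trailing chunk shorter than seq_len is dropped.
    """
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc)
        stream.append(eos_id)  # separator between documents

    n_full = len(stream) // seq_len  # number of complete chunks
    return [stream[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]


# Toy example: seq_len=4 stands in for 8192, and 0 is a hypothetical EOS id.
docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
packed = pack_sequences(docs, eos_id=0, seq_len=4)
print(packed)  # [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```

Note that a chunk can start or end in the middle of a file, so this matches the second layout in the question (context-length snippets) rather than the per-file one.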
loubnabnl
changed discussion status to
closed