Commit History

training roberta structure with 4786611 samples, 24054 test samples, 20 vocab size, 3 hidden layers, 256 hidden size, 4 attention heads, 0.15 mlm probability, 10 num process, 512 max length, 0.005 train test split, 50 min sub seq length, 2000 max sub seq length, 42 seed
055353e

LKarlo committed on

training roberta structure with 4786611 samples, 24054 test samples, 20 vocab size, 3 hidden layers, 256 hidden size, 4 attention heads, 0.15 mlm probability, 10 num process, 512 max length, 0.005 train test split, 50 min sub seq length, 2000 max sub seq length, 42 seed
e8054df

LKarlo committed on

initial commit
81e851c

LKarlo committed on
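
The hyperparameters in the two training commit messages can be read back into a dictionary and sanity-checked. The following is a minimal sketch, not the repository's own code: `parse_hyperparameters` is a hypothetical helper, and it assumes that "samples" means training samples, that the test set is the total rounded up after multiplying by the split fraction, and that the hidden size is divided evenly across attention heads.

```python
import math

# Commit message copied verbatim from the log above.
MESSAGE = (
    "training roberta structure with 4786611 samples, 24054 test samples, "
    "20 vocab size, 3 hidden layers, 256 hidden size, 4 attention heads, "
    "0.15 mlm probability, 10 num process, 512 max length, "
    "0.005 train test split, 50 min sub seq length, "
    "2000 max sub seq length, 42 seed"
)


def parse_hyperparameters(message):
    """Hypothetical helper: turn '<value> <name>' pairs into a dict."""
    # Everything after " with " is a comma-separated list of pairs.
    _, _, body = message.partition(" with ")
    params = {}
    for item in body.split(", "):
        value, _, name = item.partition(" ")
        key = name.replace(" ", "_")
        # Values containing a dot are fractions; the rest are integers.
        params[key] = float(value) if "." in value else int(value)
    return params


params = parse_hyperparameters(MESSAGE)

# Assumption: "samples" counts training samples only, so the full
# dataset is the sum of both counts.
total = params["samples"] + params["test_samples"]

# Assumption: the test set size is the total times the split, rounded up.
expected_test = math.ceil(total * params["train_test_split"])

# Per-head dimension, assuming the hidden size is split evenly over heads.
head_dim = params["hidden_size"] // params["attention_heads"]

print(params)
print("total samples:", total)
print("test samples implied by split:", expected_test)
print("dimension per attention head:", head_dim)
```

Under these assumptions the reported test count matches the 0.005 split of the combined dataset, and each of the 4 heads works on a 64-dimensional slice of the 256-dimensional hidden state.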