first
- README.md +3 -1
- events.out.tfevents.1667732309.t1v-n-101cf975-w-0.1839856.0.v2 +3 -0
- flax_model.msgpack +3 -0
- pytorch_model.bin +3 -0
- run.sh +2 -2
README.md
CHANGED
@@ -24,4 +24,6 @@ widget:
 
 # Scandinavian XLM-RoBERTa (base-sized model)
 
-This model is currently being created. Do not use yet.
+This model is currently being created. Do not use yet.
+
+Adjusting down lr from 1e4 to 5e5 since we have some instability. Restarting Nov 6. Training only 500k steps.
events.out.tfevents.1667732309.t1v-n-101cf975-w-0.1839856.0.v2
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:faa4b7b6481395d636dde622fbfede7a6d179ea93e0afdefe38f2c9e95ae8430
+size 28280812
flax_model.msgpack
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4de7fc184dcf3cc1b013651a25806d1417baa8be6140839cd6bdb58b3f0d2bb
+size 1113187999
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:949e34bfaa78f30854bf24df0994074ba8bf900ca0016370c1e54e04120c689c
+size 1113251641
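Note on the three ADDED files above: each is a Git LFS pointer (a `version`, `oid`, and `size` line), not the binary itself. A minimal sketch of reading such a pointer, using the `pytorch_model.bin` values from this commit; the helper name `parse_lfs_pointer` is hypothetical and not part of any tool in this repo:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the pytorch_model.bin hunk in this commit.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:949e34bfaa78f30854bf24df0994074ba8bf900ca0016370c1e54e04120c689c
size 1113251641
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size_bytes"])
```

The `size` field is the byte size of the real file that LFS fetches on checkout, so the two checkpoint files here are each roughly 1.1 GB.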
run.sh
CHANGED
@@ -8,10 +8,10 @@ python run_mlm_flax_stream.py \
 --weight_decay="0.01" \
 --per_device_train_batch_size="62" \
 --per_device_eval_batch_size="62" \
---learning_rate="
+--learning_rate="5e-5" \
 --warmup_steps="10000" \
 --overwrite_output_dir \
---num_train_steps="
+--num_train_steps="500000" \
 --adam_beta1="0.9" \
 --adam_beta2="0.98" \
 --logging_steps="5000" \
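Note on the run.sh change: the README's "1e4 to 5e5" presumably means 1e-4 to 5e-5, and the new value matches `--learning_rate="5e-5"` here. To illustrate how the new flags interact, here is a rough sketch of the learning rate over training. It assumes linear warmup followed by linear decay to zero. That schedule shape is an assumption, since this diff shows only the flags and not how `run_mlm_flax_stream.py` uses them. The function name `learning_rate_at` is hypothetical.

```python
def learning_rate_at(
    step: int,
    peak_lr: float = 5e-5,        # --learning_rate in run.sh
    warmup_steps: int = 10_000,   # --warmup_steps in run.sh
    total_steps: int = 500_000,   # --num_train_steps in run.sh
) -> float:
    """Learning rate at `step` under an ASSUMED warmup-then-linear-decay schedule."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = total_steps - step
    return max(0.0, peak_lr * remaining / (total_steps - warmup_steps))


# Print the learning rate at a few points in the run.
for step in (0, 5_000, 10_000, 255_000, 500_000):
    print(step, learning_rate_at(step))
```

Under this assumption, halving the peak rate halves the rate at every step, which is the intended response to the instability mentioned in the README.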