Commits · Dovakiins/qwerrwe

Add: mlflow for experiment tracking (#1059) [skip ci]

090c24d
unverified

Johan Hansson

winglian committed on Jan 9, 2024

Cosine learning rate schedule - minimum learning rate (#1062)

04b978b
unverified

ricdomolm

winglian committed on Jan 9, 2024

Efficiently get the length of the tokenized docs (#1063)

81d3845
unverified

ricdomolm

winglian committed on Jan 8, 2024

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8, 2024

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93

winglian committed on Jan 6, 2024

feat: always push checkpoint to hub if set (#1049) [skip ci]

cbdbf9e
unverified

Nanobit committed on Jan 5, 2024

RL/DPO (#935)

f243c21

winglian committed on Jan 4, 2024

use recommended setting for use_reentrant w gradient checkpointing (#1021)

4d2e842
unverified

winglian committed on Jan 2, 2024

remove landmark attn and xpos rope implementations (#1010)

70b46ca
unverified

winglian committed on Dec 28, 2023

FEAT: add tagging support to axolotl (#1004)

db9094d
unverified

Younes Belkada

winglian committed on Dec 27, 2023

fix: add lr scheduler kwargs to Trainer (#972)

13e9381
unverified

Nanobit committed on Dec 17, 2023

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)

ef24342
unverified

kallewoof committed on Dec 13, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

feature: loss watchdog for terminating training runs that are failing (#899)

58ec8b1
unverified

user735 Karl-Johan Alm committed on Dec 4, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

don't train if eval split is too small (#873)

797f3dd
unverified

winglian committed on Nov 16, 2023

various bugfixes (#856)

1470650
unverified

winglian committed on Nov 15, 2023

cleanup the old multipack dataloader (#841)

1a6309c
unverified

winglian committed on Nov 12, 2023

multipack w batch sampler (#795)

641e6f7
unverified

winglian committed on Nov 8, 2023

Threaded MultipackDistributedDataloader with prefetched samples (#759)

05bd6f1
unverified

casperhansen committed on Oct 26, 2023

refactor setup trainer so we can add more hooks (#773)

6c81c61
unverified

winglian committed on Oct 23, 2023
