Llama3 - Extra Hyperparameters
warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  pad_token: <|end_of_text|>
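The keys above that are not sharding- or token-related control the training loop itself: warmup_steps: 10 ramps the learning rate over the first 10 optimizer steps (typically a linear warmup), weight_decay: 0.0 disables weight decay, and the eval/save cadence is expressed per epoch. As a rough worked example, assuming an epoch of about 1,000 optimizer steps, evals_per_epoch: 4 evaluates roughly every 250 steps and saves_per_epoch: 1 writes one checkpoint at the end of each epoch. eval_max_new_tokens: 128 caps how many tokens are generated during those evaluations, while leaving eval_table_size and debug blank keeps the optional evaluation-sample table and debug logging switched off.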
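deepspeed is left blank above, so no DeepSpeed sharding is applied. If the model does not fit in memory, it can be enabled by pointing this key at a DeepSpeed JSON config. A minimal sketch, assuming an Axolotl-style setup and the ZeRO-2 config it ships under deepspeed_configs (adjust the path to wherever your JSON actually lives):

deepspeed: deepspeed_configs/zero2.json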
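fsdp and fsdp_config are the alternative to DeepSpeed (use one or the other, not both). The sketch below mirrors the FSDP settings commonly paired with Llama-style models; the key names are forwarded to the Hugging Face Trainer's FSDP options, and LlamaDecoderLayer as the wrap target is an assumption for Llama 3, so verify both against your installed versions:

fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: false
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT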
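The pad_token entry under special_tokens gives the tokenizer a padding token, which Llama 3 does not define out of the box. Because <|end_of_text|> already exists in the Llama 3 vocabulary, reusing it for padding does not add any rows to the token embedding matrix; only genuinely new tokens would require resizing the embeddings.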