Llama2 - Training Configuration
This is the default training configuration, covering batch sizing, the optimizer, and the learning-rate schedule. We will leave it as is for now.
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 4
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
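These values mean each device processes 2 examples per step and accumulates gradients over 4 steps, for an effective batch size of 8 per device, trained for 4 epochs with an 8-bit AdamW optimizer and a cosine learning-rate schedule starting at 0.0002. As a rough illustration only, here is how the same settings might be expressed with Hugging Face's `TrainingArguments`; the trainer, the output path, and the mapping itself are assumptions, not something taken from this page.

```python
# A minimal sketch, assuming a Hugging Face transformers training setup.
# The config above only lists raw values; this mapping is illustrative.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-finetune",     # hypothetical output directory
    per_device_train_batch_size=2,    # micro_batch_size: 2
    gradient_accumulation_steps=4,    # gradient_accumulation_steps: 4
    num_train_epochs=4,               # num_epochs: 4
    learning_rate=2e-4,               # learning_rate: 0.0002
    lr_scheduler_type="cosine",       # lr_scheduler: cosine
    optim="adamw_bnb_8bit",           # optimizer: 8-bit AdamW via bitsandbytes
)
```

Gradient accumulation trades step frequency for memory: the small micro batch keeps per-step GPU memory low, while accumulating over 4 steps recovers a larger effective batch before each optimizer update.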