Llama3 - All Configurations
This is a summary of the entire configuration file for the fine-tuning run of Llama3 using 8-bit LoRA PEFT.
The training configuration file looks mostly fine, but there are a few things that could be adjusted or clarified:
- The `dataset_prepared_path` is empty. If you want to use a prepared dataset, you should provide a path here.
- The `lora_target_modules` parameter is missing. You might want to specify which modules to apply LoRA to, such as `q_proj`, `v_proj`, `k_proj`, `o_proj`, etc. (see the example snippet after this list).
- The `wandb_watch`, `wandb_name`, and `wandb_log_model` parameters are empty. If you want to use Weights & Biases for logging and tracking, you should provide appropriate values for these parameters.
- The `bf16` parameter is set to `auto`. Make sure your hardware supports bfloat16 if you want to use it; otherwise, you can set it to `false`.
- The `early_stopping_patience`, `resume_from_checkpoint`, and `local_rank` parameters are empty. If you want to use early stopping, resume from a checkpoint, or perform distributed training, you should provide appropriate values for these parameters.
- The `eval_table_size` and `eval_max_new_tokens` parameters are present but not typically used in the training configuration; they are usually used during evaluation or inference.
- The `fsdp` and `fsdp_config` parameters are empty. If you want to use Fully Sharded Data Parallel (FSDP) for distributed training, you should provide appropriate configurations.
- The `special_tokens` parameter only specifies the `pad_token`. If your model requires additional special tokens, such as `bos_token`, `eos_token`, or custom tokens, you should add them here.
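For reference, below is a minimal sketch of how these suggestions could look in the YAML configuration. The keys mirror the parameter names discussed above; the specific values (run name, watch/log-model modes, token strings, FSDP options) are illustrative placeholders and should be adapted to your model, tooling, and hardware.

```yaml
# Illustrative sketch only -- values are placeholders, adapt them to your run.

# Apply LoRA to the attention projection modules.
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj

# Weights & Biases logging (assumes you already have a W&B project set up).
wandb_name: llama3-8bit-lora-run1   # hypothetical run name
wandb_watch: gradients              # log gradient histograms
wandb_log_model: checkpoint         # upload model checkpoints to W&B

# Use bfloat16 only if your GPU supports it (e.g. Ampere or newer);
# otherwise set this to false.
bf16: auto

# Special tokens -- pad_token was already set; bos/eos shown as examples.
special_tokens:
  pad_token: "<|end_of_text|>"
  bos_token: "<|begin_of_text|>"
  eos_token: "<|end_of_text|>"

# Optional FSDP settings for multi-GPU sharded training. These keys are an
# assumption -- verify them against your training framework's FSDP docs.
# fsdp:
#   - full_shard
#   - auto_wrap
# fsdp_config:
#   fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
```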
Other than these points, the configuration looks valid. Make sure to double-check the paths to your dataset and output directory, and ensure that the hyperparameters are suitable for your specific training task and hardware setup.