Llama3 - LoRA Configuration
This is the default LoRA configuration for Llama3.
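Taken together, the settings described on this page correspond roughly to the following sketch (Axolotl-style YAML is assumed; the layout is illustrative, not a verbatim copy of the original file):

```yaml
# Sketch of the LoRA settings described below (layout assumed, values from this page).
adapter: lora            # assumption: a plain LoRA adapter is being configured
lora_r: 32               # rank of the low-rank update matrices
lora_alpha: 16           # scaling factor for the LoRA update
lora_dropout: 0.05       # dropout on the LoRA path during training
lora_target_modules:     # attention projections the adapter is attached to
  - q_proj
  - v_proj
  # - k_proj
  # - o_proj
  # - gate_proj
  # - down_proj
  # - up_proj
# lora_target_linear: true      # not set here; would target every linear module
# peft_layers_to_transform:     # not set here; LoRA is applied to all layers
# lora_modules_to_save:         # only needed when new tokens were added
#   - embed_tokens
#   - lm_head
lora_fan_in_fan_out: false
```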
lora_r
This parameter determines the rank of the low-rank matrices used in LoRA. It controls the capacity and expressiveness of the LoRA adaptation. A higher value of lora_r allows for more fine-grained adaptations but also increases the number of trainable parameters. In this configuration, lora_r is set to 32.
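As a rough illustration (the numbers below are an assumption, not part of the original configuration), each adapted weight matrix of shape d_out × d_in gains r × (d_in + d_out) trainable parameters, so lora_r directly sets the adapter's size:

```yaml
# Illustration: with r = 32 on a 4096 x 4096 q_proj matrix (assuming
# Llama 3 8B's hidden size of 4096), LoRA adds
# 32 * (4096 + 4096) = 262,144 trainable parameters for that matrix.
lora_r: 32
```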
lora_alpha
This parameter controls the scaling factor applied to the LoRA adaptation. It determines the contribution of the LoRA matrices to the original model's weights. A higher value of lora_alpha gives more importance to the LoRA adaptation. In this configuration, lora_alpha is set to 16.
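Note that common LoRA implementations (for example, Hugging Face PEFT) scale the low-rank update by lora_alpha / lora_r, so the ratio of the two values is what matters in practice; treat the exact formula as implementation-dependent:

```yaml
# With alpha = 16 and r = 32, an alpha / r scaling rule gives an
# effective scale of 16 / 32 = 0.5 on the LoRA update.
lora_alpha: 16
```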
lora_dropout
This parameter specifies the dropout rate applied to the LoRA layers during training. Dropout is a regularization technique that helps prevent overfitting. A value of 0.05 means that, during training, 5% of the activations passing through the LoRA layers are randomly set to zero.
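In the sketched config above (assumed Axolotl-style syntax), this is a single line:

```yaml
lora_dropout: 0.05   # 5% dropout on the LoRA path during training; inactive at inference
```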
lora_target_modules
This parameter specifies the names of the modules in the model architecture where LoRA will be applied. In this configuration, LoRA is applied to the q_proj and v_proj modules, which are the query and value projection matrices in the attention mechanism. Other potential target modules, such as k_proj, o_proj, gate_proj, down_proj, and up_proj, are commented out.
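A sketch of how this list might look in the config file (assumed Axolotl-style YAML; uncomment entries to adapt additional modules):

```yaml
lora_target_modules:
  - q_proj       # attention query projection
  - v_proj       # attention value projection
  # - k_proj     # attention key projection
  # - o_proj     # attention output projection
  # - gate_proj  # MLP gate projection
  # - down_proj  # MLP down projection
  # - up_proj    # MLP up projection
```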
lora_target_linear
This parameter is not set in this configuration. If set to true, LoRA will be applied to all linear modules in the model.
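If you wanted to adapt every linear module instead of listing them explicitly, the setting might look like this (a sketch, assuming Axolotl-style syntax):

```yaml
lora_target_linear: true   # attach LoRA to all linear modules
# lora_target_modules:     # an explicit module list is then not required
```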
peft_layers_to_transform
This parameter allows you to specify the indices of the layers to which LoRA should be applied.
If not specified, LoRA will be applied to all layers by default.
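As a purely hypothetical example (the layer indices below are illustrative, not from the original configuration), restricting LoRA to the last few transformer blocks could look like:

```yaml
# Hypothetical: apply LoRA only to layers 28-31 of a 32-layer model.
peft_layers_to_transform: [28, 29, 30, 31]
```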
lora_modules_to_save
This parameter is relevant when you have added new tokens to the tokenizer. In such cases, you may need to save certain LoRA modules that are aware of the new tokens.
For LLaMA and Mistral models, you typically need to save the embed_tokens and lm_head modules. embed_tokens converts tokens to embeddings, and lm_head converts embeddings to token probabilities.
In this configuration, these modules are commented out.
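If new tokens had been added to the tokenizer, the corresponding lines could be uncommented like this (a sketch, assuming Axolotl-style syntax):

```yaml
lora_modules_to_save:
  - embed_tokens   # input embedding table, which must learn rows for the new tokens
  - lm_head        # output projection over the extended vocabulary
```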
lora_fan_in_fan_out
This parameter describes how the weights of the target layers are laid out. It should be set to true only when the layer being adapted stores its weight as (fan_in, fan_out), as GPT-2-style Conv1D layers do; for the standard linear layers in LLaMA models it remains false. In this configuration, it is set to false.
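In the sketched config this is a single line, left at its default:

```yaml
lora_fan_in_fan_out: false   # LLaMA's projection layers are standard linear layers
```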
These hyperparameters allow you to control various aspects of the LoRA adaptation during fine-tuning.
The optimal values for these hyperparameters may vary depending on your specific task, dataset, and model architecture. It's recommended to experiment with different configurations and monitor the performance to find the best settings for your use case.