Llama3 - LoRA Configuration

This is the default LoRA configuration for Llama3:

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

lora_r

This parameter determines the rank of the low-rank matrices used in LoRA.

It controls the capacity and expressiveness of the LoRA adaptation. A higher value of lora_r allows for more fine-grained adaptations but also increases the number of trainable parameters.

In this configuration, lora_r is set to 32.
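
As a rough guide (this is standard LoRA arithmetic, not anything Axolotl-specific), for a frozen weight matrix of shape d_out × d_in the adapter learns two small matrices whose size scales linearly with lora_r:

$$
\Delta W = BA, \qquad B \in \mathbb{R}^{d_{\text{out}} \times r}, \qquad A \in \mathbb{R}^{r \times d_{\text{in}}}
$$

Each targeted matrix therefore adds r × (d_out + d_in) trainable parameters, so doubling lora_r roughly doubles the adapter size.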

lora_alpha

This parameter controls the scaling factor applied to the LoRA adaptation.

It determines the contribution of the LoRA matrices to the original model's weights.

A higher value of lora_alpha gives more importance to the LoRA adaptation. In this configuration, lora_alpha is set to 16.
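
In the PEFT implementation Axolotl builds on, the adapter update is scaled by lora_alpha divided by lora_r, so with lora_alpha = 16 and lora_r = 32 the update is effectively multiplied by 0.5:

$$
W' = W + \frac{\texttt{lora\_alpha}}{\texttt{lora\_r}} \, BA
$$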

lora_dropout

This parameter specifies the dropout rate applied to the LoRA matrices during training.

Dropout is a regularization technique that helps prevent overfitting.

A value of 0.05 means that 5% of the activations passing through the LoRA path are randomly zeroed out during training.
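
If the adapter overfits on a small dataset, a common adjustment (an illustrative suggestion, not part of the default config) is to raise this value slightly:

lora_dropout: 0.1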

lora_target_modules

This parameter specifies the names of the modules in the model architecture where LoRA will be applied.

A common choice is to target the q_proj and v_proj modules, the query and value projection matrices in the attention mechanism.

Other candidates in Llama-style models include k_proj, o_proj, gate_proj, down_proj, and up_proj. In this configuration, lora_target_modules is not set at all, because lora_target_linear: true already applies LoRA to every linear module; an explicit list would look like the sketch below.
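
A minimal sketch of what an explicit module list could look like, with the usual extra candidates left commented out (module names assume a Llama-style architecture):

lora_target_modules:
  - q_proj
  - v_proj
  # - k_proj
  # - o_proj
  # - gate_proj
  # - down_proj
  # - up_proj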

lora_target_linear

When set to true, LoRA is applied to all linear modules in the model, so you do not need to list lora_target_modules explicitly.

In this configuration, it is set to true.

peft_layers_to_transform

This parameter allows you to specify the indices of the layers to which LoRA should be applied.

If not specified, LoRA will be applied to all layers by default.
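
For example, a sketch that restricts LoRA to the last eight layers of a 32-layer model (the indices here are an illustrative assumption, not part of the default config):

peft_layers_to_transform: [24, 25, 26, 27, 28, 29, 30, 31]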

lora_modules_to_save

This parameter is relevant when you have added new tokens to the tokenizer. In such cases, you may need to save certain LoRA modules that are aware of the new tokens.

For LLaMA and Mistral models, you typically need to save embed_tokens and lm_head modules. embed_tokens converts tokens to embeddings, and lm_head converts embeddings to token probabilities.

In this configuration, lora_modules_to_save is not set.
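
If you do add new tokens, the setting would look like this for a Llama or Mistral model:

lora_modules_to_save:
  - embed_tokens
  - lm_head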

lora_fan_in_fan_out

This parameter tells LoRA how the target layer stores its weights. Set it to true only when the layer keeps its weight matrix in (fan_in, fan_out) order, as GPT-2-style Conv1D layers do. Llama models use standard linear layers, so in this configuration it is left unset, which behaves the same as false.

These hyperparameters allow you to control various aspects of the LoRA adaptation during fine-tuning.

The optimal values for these hyperparameters may vary depending on your specific task, dataset, and model architecture. It's recommended to experiment with different configurations and monitor the performance to find the best settings for your use case.

This documentation is for the Axolotl community