
Phi 2.0 - Training

Training Phi 2.0

Axolotl provides the following command to run the main training script and begin training Phi 2.0:

accelerate launch -m axolotl.cli.train examples/phi/phi2-ft.yml

If you have not already done so, you will be asked to enter your Weights and Biases API Key.

Enter the key at the command line prompt:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
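
If you would rather not type the key at the prompt, the Weights and Biases client also reads it from the WANDB_API_KEY environment variable. The following is a minimal sketch of a pre-flight check you could run before launching. It is not part of Axolotl, and the helper name wandb_key_is_set is hypothetical.

```python
import os


def wandb_key_is_set(env=None):
    """Return True if a non-empty WANDB_API_KEY is present in the given mapping.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if wandb_key_is_set():
    print("WANDB_API_KEY is set in the environment.")
else:
    # A key saved by an earlier login may still be picked up by the client.
    print("WANDB_API_KEY is not set; you may be prompted for the key.")
```

Exporting the variable in the shell that runs accelerate launch is one way to keep the key out of interactive sessions and scripts.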

An analysis of the axolotl.cli.train module

Analysis of train.py script

The train.py script in Axolotl is the command-line interface (CLI) entry point for training machine learning models.

It loads a training configuration, runs a few environment checks, loads the dataset, and then starts training. The sections below walk through each part of the script.

Script Structure and Functionality

Imports and Logger Setup

  • Essential modules like logging, pathlib.Path, fire, and transformers are imported.

  • The script creates a module-level logger, LOG, with the logging module to record events and status messages during execution.
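
As a rough illustration of this pattern, a module-level logger can be created as shown below. This is a sketch, not a copy of the Axolotl source: the logger name "axolotl.cli.train" is an assumption, and the basicConfig call is added here only so that the message is visible when the snippet is run on its own.

```python
import logging

# Configure the root logger so messages appear on the console.
# (Added for this standalone sketch; the real script may configure logging elsewhere.)
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Module-level logger; the name is an assumption chosen for illustration.
LOG = logging.getLogger("axolotl.cli.train")

LOG.info("starting training run")
```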

do_cli Function

  • Function Definition: The do_cli function is the main entry point of the script. It accepts a config argument (with a default value pointing to an "examples" directory) and **kwargs for additional arguments.

  • ASCII Art Display: print_axolotl_text_art() prints the Axolotl ASCII-art banner at startup. It is cosmetic and has no effect on training.

  • Configuration Loading: load_cfg loads configuration details from the provided config path. These configurations are essential for setting up model training parameters.

  • Accelerator and User Token Checks: The script checks for a default Hugging Face Accelerate configuration (Accelerate is the launcher used in the accelerate launch command) and checks the user token. These checks surface setup and authentication problems before training starts.

  • CLI Arguments Parsing: It uses transformers.HfArgumentParser to parse additional CLI arguments into data classes (TrainerCliArgs). This step allows for dynamic customization of training parameters via the command line.

  • Dataset Loading: load_datasets is called with the parsed configuration and CLI arguments. This function is responsible for loading the dataset as per the configuration, which is a critical step in the training process.

  • Model Training: The train function is invoked with the loaded configuration, the CLI arguments, and the dataset metadata. This call hands control to the training routine itself.
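
The order of steps in do_cli can be sketched as follows. Every helper here is a stub written for illustration, not the Axolotl implementation: print_banner, run_checks, parse_cli_args, and CliArgs are hypothetical stand-ins, the stubs named load_cfg, load_datasets, and train only mimic the call order, and passing the extra keyword arguments to load_cfg as overrides is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path

CALLS = []  # records the order in which the steps run (illustration only)


@dataclass
class CliArgs:
    """Hypothetical stand-in for the TrainerCliArgs data class."""
    debug: bool = False


def print_banner():
    """Stand-in for print_axolotl_text_art()."""
    CALLS.append("banner")


def load_cfg(config, **overrides):
    """Stub: the real function reads the configuration file at `config`."""
    CALLS.append("load_cfg")
    cfg = {"config_path": str(config)}
    cfg.update(overrides)  # assumption: extra keyword arguments override config values
    return cfg


def run_checks():
    """Stand-in for the accelerator and user-token checks."""
    CALLS.append("checks")


def parse_cli_args():
    """Stub: the real script uses transformers.HfArgumentParser here."""
    CALLS.append("parse_cli_args")
    return CliArgs()


def load_datasets(cfg, cli_args):
    """Stub: returns placeholder dataset metadata."""
    CALLS.append("load_datasets")
    return {"source": cfg["config_path"]}


def train(cfg, cli_args, dataset_meta):
    """Stub: the real function runs the training itself."""
    CALLS.append("train")


def do_cli(config=Path("examples/"), **kwargs):
    print_banner()
    parsed_cfg = load_cfg(config, **kwargs)
    run_checks()
    cli_args = parse_cli_args()
    dataset_meta = load_datasets(cfg=parsed_cfg, cli_args=cli_args)
    train(cfg=parsed_cfg, cli_args=cli_args, dataset_meta=dataset_meta)
    return parsed_cfg  # returned only so the sketch is easy to inspect


cfg = do_cli("examples/phi/phi2-ft.yml", learning_rate=0.0002)
print(CALLS)
```

Keeping each step in its own function means the configuration, the parsed CLI arguments, and the dataset metadata are passed explicitly from one stage to the next, which makes the sequence easy to follow and to test.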

Main Block

  • The script checks whether it is being run as the main program (__name__ == "__main__") rather than imported as a module. If so, it calls fire.Fire(do_cli), which exposes do_cli as a command-line command so that its arguments can be passed on the command line.

This documentation is for the Axolotl community