Llama3 - Logging

We will use Weights and Biases (W&B) to log and monitor our fine-tuning runs.

If you are new to Weights and Biases, please take the time to review their quickstart guide.

A run is the basic building block of W&B. You will use runs often to track metrics, create logs, and create jobs.

Once you have created a Weights and Biases account, log in by entering the following command at the command prompt:

wandb login

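You will be prompted to paste the API key (token) for your Weights and Biases account. The key is available in your account settings on the W&B website once you are signed in; the command does not ask for a username or password.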

Configuration of Logging Component

Project

Project names can be simple, but they should be informative. Beyond that, name your project whatever you like.

Entity

The entity is the W&B username or team name that owns the project, typically <your-organisation>.

The W&B section of the configuration file should look like this:

wandb_project: <your project name>
wandb_entity: <your-organisation>
wandb_name: <name of your wandb run>
wandb_run_id: <ID of your wandb run>

Reference: The Weights and Biases Script in the Axolotl Library

src/axolotl/utils/wandb_.py

The script defines a function setup_wandb_env_vars that configures environment variables for wandb based on a given configuration. Here's a detailed explanation of what this function does:

Module Description

The module docstring """Module for wandb utilities""" indicates that this Python file is intended to provide utility functions for working with wandb.

Import Statements

The script imports the os module, which provides functions for interacting with the operating system, including managing environment variables.
It imports DictDefault from axolotl.utils.dict, which is a custom dictionary utility, likely providing some default behavior for dictionary operations.

Function Definition

setup_wandb_env_vars(cfg: DictDefault):

The function takes a single parameter cfg, which is expected to be an instance of DictDefault. This parameter likely contains configuration settings.

Setting Environment Variables for wandb

The function iterates through all keys in the cfg dictionary.
If a key starts with "wandb_", the function retrieves its value.
If the value is a non-empty string, the function sets an environment variable with the name of the key in uppercase and assigns it the value from cfg. For example, if cfg contains {"wandb_api_key": "your_key"}, it sets an environment variable WANDB_API_KEY with the value "your_key".
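The key-to-variable mapping described above can be sketched as follows. This is a minimal illustration, not the Axolotl source: the helper name export_wandb_keys is hypothetical, a plain dict stands in for DictDefault, and the env parameter is an assumption added so the example can run against a scratch dictionary instead of the real process environment.

```python
import os


def export_wandb_keys(cfg: dict, env=os.environ) -> list:
    """Copy non-empty string values stored under `wandb_*` keys into `env`.

    Hypothetical helper for illustration; `cfg` is a plain dict standing in
    for the DictDefault instance described in the text.
    """
    exported = []
    for key, value in cfg.items():
        # Only keys with the "wandb_" prefix and non-empty string values qualify.
        if key.startswith("wandb_") and isinstance(value, str) and len(value) > 0:
            env[key.upper()] = value  # e.g. wandb_project -> WANDB_PROJECT
            exported.append(key.upper())
    return exported


# Scratch environment so the demo does not touch the real os.environ.
env = {}
cfg = {
    "wandb_project": "llama3-finetune",  # exported
    "wandb_entity": "",                  # skipped: empty string
    "wandb_name": None,                  # skipped: not a string
    "learning_rate": "2e-4",             # skipped: no wandb_ prefix
}
exported = export_wandb_keys(cfg, env)
print(env)
```

Only wandb_project survives the filter here: the empty string, the None value, and the key without the wandb_ prefix are all ignored.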

Enabling wandb Integration

The function checks if cfg.wandb_project exists and is a non-empty string.
If cfg.wandb_project is set, cfg.use_wandb is set to True, and any existing WANDB_DISABLED environment variable is removed. This implies that wandb should be enabled and operational for the current session.
If cfg.wandb_project is not set, the function sets the environment variable WANDB_DISABLED to "true", effectively disabling wandb integration.
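The enable/disable decision can be sketched in the same spirit. Again, this is an approximation under stated assumptions rather than the library's code: toggle_wandb is a hypothetical name, cfg is a plain dict instead of DictDefault, and env is an injected mapping used in place of os.environ for the demonstration.

```python
import os


def toggle_wandb(cfg: dict, env=os.environ) -> None:
    """Enable or disable wandb depending on whether a project is configured.

    Hypothetical helper for illustration; mirrors the behaviour described
    in the text using a plain dict for `cfg`.
    """
    project = cfg.get("wandb_project")
    if isinstance(project, str) and len(project) > 0:
        cfg["use_wandb"] = True
        # Drop a stale disable flag if one is present.
        env.pop("WANDB_DISABLED", None)
    else:
        env["WANDB_DISABLED"] = "true"


# Case 1: a project is set, so an existing WANDB_DISABLED flag is removed.
enabled_cfg = {"wandb_project": "llama3-finetune"}
enabled_env = {"WANDB_DISABLED": "true"}
toggle_wandb(enabled_cfg, enabled_env)

# Case 2: no project is set, so wandb is disabled through the environment.
disabled_cfg = {}
disabled_env = {}
toggle_wandb(disabled_cfg, disabled_env)

print(enabled_cfg, enabled_env)
print(disabled_cfg, disabled_env)
```

Passing the environment in as a parameter is a choice made for this sketch only; it keeps the example free of side effects on the running process.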

Usage Context

This script is useful for configuring wandb dynamically from a set of configuration parameters, especially when different runs or experiments need different wandb settings.
It automates setting the wandb environment variables and enables or disables wandb tracking based on the provided configuration, so experiment tracking in a machine learning workflow can be managed programmatically.
