# Platform Installation

With the virtual machine established and optimised, we will now set up the virtual environment for Axolotl.

## <mark style="color:blue;">Download the axolotl library</mark>

```bash
git clone https://github.com/OpenAccess-AI-Collective/axolotl.git
```

Then navigate into the library:

```bash
cd axolotl
```

<details>

<summary>Reference: <mark style="color:green;">What does git clone do?</mark></summary>

The command <mark style="color:yellow;">`git clone`</mark> is a command-line instruction used with Git, a version control system, to create a local copy of a remote repository. Let's break down this command to understand each part:

1. <mark style="color:yellow;">**`git clone`**</mark>:
   * <mark style="color:yellow;">`git`</mark>: This is the command-line tool for using Git.
   * <mark style="color:yellow;">`clone`</mark>: A Git command used to copy a remote repository. It not only downloads the content of the repository but also creates a new local repository with all the history and branch information.
2. **Repository URL**
   * <mark style="color:yellow;">`https://github.com/OpenAccess-AI-Collective/axolotl.git`</mark>: This is the URL of the remote Git repository.
   * In this case, the repository is hosted on GitHub, a web-based platform for version control using Git.
   * The repository belongs to a user or organization named `OpenAccess-AI-Collective`, and the specific repository is named `axolotl`.
3. **Executing the Command**
   * When you execute this command in a terminal or command prompt, Git will:
     * Reach out to the URL provided.
     * Download the contents of the repository, including all the files, folders, branches, and history (commits).
     * Create a new directory in your current working directory with the same name as the repository (`axolotl` in this case).
     * Initialize a local Git repository in that directory and link it to the remote repository. This link is important for future commands like `git pull` (to fetch updates) and `git push` (to upload local changes).
4. **Post-Cloning**
   * After cloning, you can navigate into the newly created directory (`cd axolotl`) and start working with the files. You can also switch between branches, update the local repository with changes from the remote repository, and commit new changes.
5. **Purpose**
   * Cloning is often the first step when you want to work on a project from a remote repository, contribute to it, or just have a local copy for reference or backup.

In summary, `git clone https://github.com/OpenAccess-AI-Collective/axolotl.git` is a Git command to create a local copy of the `axolotl` repository from the `OpenAccess-AI-Collective` GitHub page. It sets up a new folder with all the repository data and establishes a connection between the local and remote repositories for ongoing version control.
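A quick way to confirm the clone worked is to look for the `.git` folder that Git creates and the `setup.py` file used later in this guide. The sketch below is a minimal check in Python; the function name `looks_like_checkout` is illustrative and not part of Git or Axolotl.

```python
from pathlib import Path


def looks_like_checkout(repo_dir: str) -> bool:
    """Return True if repo_dir holds a Git working copy with a setup.py."""
    root = Path(repo_dir).expanduser()
    has_git_metadata = (root / ".git").exists()       # created by `git clone`
    has_setup_script = (root / "setup.py").is_file()  # used by the install step
    return has_git_metadata and has_setup_script


if __name__ == "__main__":
    # Run this from the directory where you executed `git clone`.
    print("axolotl checkout found:", looks_like_checkout("axolotl"))
```

If this prints `False`, check that the clone finished without errors and that you are running the script from the parent directory of `axolotl`.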

</details>

### <mark style="color:blue;">Virtual Environment</mark>

<mark style="color:blue;">**Check for Anaconda**</mark><mark style="color:blue;">:</mark>

* Ensure Anaconda is installed, then create an <mark style="color:yellow;">Anaconda environment</mark> called <mark style="color:yellow;">'axolotl'</mark>
* Type the following command into the terminal to check if Anaconda is installed and press Enter:

```bash
conda --version
```

If Anaconda is installed, this command will return the version number:

```bash
conda 23.10.0
```

* If you get an error or a message saying that <mark style="color:yellow;">`conda`</mark> <mark style="color:yellow;">is not recognised,</mark> it means Anaconda is not installed.
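The same check can be scripted, which is handy when preparing several machines. The sketch below is one possible approach in Python: it looks for `conda` on the `PATH` and parses output of the form `conda 23.10.0`. The helper names are illustrative.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_conda_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract the numeric version from text such as 'conda 23.10.0'."""
    match = re.search(r"conda\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_conda_version() -> Optional[Tuple[int, ...]]:
    """Return the conda version, or None if conda is not on PATH."""
    if shutil.which("conda") is None:
        return None
    result = subprocess.run(
        ["conda", "--version"], capture_output=True, text=True, check=False
    )
    # Check both streams in case the version is not written to stdout.
    return parse_conda_version(result.stdout + result.stderr)


if __name__ == "__main__":
    version = installed_conda_version()
    if version is None:
        print("conda not found or version not recognised")
    else:
        print("conda version:", ".".join(map(str, version)))
```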

If Anaconda is not installed, <mark style="color:blue;">follow the installation instructions below under</mark> <mark style="color:blue;">**"Installing Miniconda":**</mark>

<details>

<summary><mark style="color:green;">Installing Miniconda</mark></summary>

<mark style="color:blue;">**Overview**</mark>

* **Miniconda** is a minimal installer for the Conda package manager. It provides a lightweight alternative to Anaconda, including only Conda, Python, and essential packages they depend on, plus a few other useful packages such as pip and zlib.
* Ideal for users who prefer a smaller footprint or want more control over installed packages.
* Additional packages can be installed using <mark style="color:yellow;">`conda install`</mark> from Anaconda’s public repository or other channels like conda-forge or bioconda.

<mark style="color:blue;">**Choosing Miniconda**</mark>

* The decision between Anaconda and Miniconda depends on user needs. The Anaconda or Miniconda page provides guidance for choosing the most suitable installation.

<mark style="color:blue;">**System Requirements**</mark>

* System requirements are available on the Miniconda documentation page, ensuring compatibility with user systems.

<mark style="color:blue;">**Installation Links and Release Notes**</mark>

* The latest Miniconda installer links are provided for Python 3.11.5 as of November 16, 2023.
* Installers are available for various platforms, including Windows, macOS (Intel and Apple M1), and Linux (with multiple architecture support).
* SHA256 hashes are provided for verifying the integrity of the downloaded files.
* For older Python versions or an archive of Miniconda versions, links are available on the documentation page.
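Once the installer has been downloaded, and before running it, its SHA256 hash can be compared with the value published on the Miniconda page. The sketch below shows one way to do that with Python's standard library. The hash you pass in must be copied from the documentation yourself; none is hard-coded here, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).expanduser().open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hash: str) -> bool:
    """Compare a file's digest with the hash copied from the download page."""
    return sha256_of_file(path) == published_hash.strip().lower()
```

For example, `matches_published_hash("~/miniconda3/miniconda.sh", "<hash from the Miniconda page>")` should return `True` for an intact download.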

<mark style="color:blue;">**Quick Installation (Command Line)**</mark>

* Quick installation instructions are provided for a streamlined setup process.
* For Linux, a typical installation involves moving to your home directory and then:

<mark style="color:green;">**Create a directory**</mark>

```bash
mkdir -p ~/miniconda3
```

<mark style="color:green;">**Downloading the installer:**</mark>

{% code overflow="wrap" %}

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
```

{% endcode %}

<mark style="color:green;">**Running the installer**</mark>

```bash
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
```

<mark style="color:green;">**Cleaning up**</mark>

```bash
rm -rf ~/miniconda3/miniconda.sh
```

* After installation, initialise Miniconda for your shell. The command below covers bash; for zsh, replace `bash` with `zsh`:

<mark style="color:yellow;">`~/miniconda3/bin/conda init bash`</mark>

<mark style="color:green;">Additional Notes</mark>

* Graphical installer instructions and hash-checking guidelines are available in the Miniconda documentation.
* For specific system setups or advanced configurations, refer to the detailed Miniconda documentation.


</details>

<mark style="color:blue;">**Create Environment**</mark>

* Axolotl <mark style="color:blue;">**requires**</mark> <mark style="color:blue;">Python</mark> <mark style="color:yellow;">>=</mark><mark style="color:blue;">3.9</mark> <mark style="color:blue;">and PyTorch</mark> <mark style="color:yellow;">>=</mark><mark style="color:blue;">2.0</mark>
* We are going to use <mark style="color:yellow;">Python 3.10 - which is a compromise between the latest Python and the minimum Python environment required</mark>
* In this case, we want to create a conda environment with a specific version of Python, so <mark style="color:red;">**do not**</mark> use the `conda create --clone` command to <mark style="color:purple;">create a clone of the</mark> <mark style="color:purple;">**`base`**</mark> <mark style="color:purple;">**environment.**</mark>

<details>

<summary>When <mark style="color:yellow;">should</mark> you clone a base environment?</summary>

The `base` environment in Conda is the default environment that gets created when you install Conda. It is a special environment that <mark style="color:blue;">contains the Conda package manager itself, along with a collection of installed packages</mark>.&#x20;

#### <mark style="color:blue;">What is the Base Environment?</mark>

1. **Default Environment**:
   * The `base` environment is the default environment active upon the installation of Anaconda or Miniconda. It is where Conda itself, along with a set of pre-installed packages and Python, are located.
2. **Contains Conda**:
   * Crucially, the `base` environment includes the Conda package manager. This means you can use Conda to create and manage other environments from the `base` environment.
3. **Pre-installed Packages**:
   * When you install Anaconda, the `base` environment comes with a wide array of commonly used data science packages pre-installed. This makes it a ready-to-use environment for a variety of tasks.

#### <mark style="color:blue;">Why Do People Clone the Base Environment?</mark>

1. **Consistency**:
   * Cloning the `base` environment ensures a consistent starting point. You get a new environment with the same set of packages and configurations as the `base`, which is often a known, stable setup.
2. **Safety**:
   * Working directly in the `base` environment is generally discouraged because changes might affect the stability and functionality of the Conda system itself. Cloning it provides a safe playground where one can install, update, or remove packages without risking the integrity of the Conda installation.
3. **Ease of Setup**:
   * For users who want a quick setup that includes a broad range of pre-installed packages (like those found in Anaconda), cloning the `base` environment is a time-saver. It eliminates the need to manually install a long list of common packages.
4. **Replicability**:
   * Cloning can be used to replicate the `base` environment across different machines or for different users, ensuring that everyone is working with the same set of tools and libraries.
5. **Experimentation and Testing**:
   * Cloning the `base` environment to create a new one allows for experimentation and testing with different package versions or configurations. If something goes wrong, the `base` environment remains unaffected.

</details>

* Use the following command to create an environment named 'axolotl' that contains the base libraries:

```bash
conda create -n axolotl python=3.10
```

Once the environment has been created, <mark style="color:yellow;">activate the environment:</mark>

```bash
conda activate axolotl
```
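To confirm that the activated environment is the one you expect, you can check the interpreter from inside Python. The sketch below assumes conda exports the active environment name in the `CONDA_DEFAULT_ENV` variable and treats Python 3.9 as the minimum version; the function names are illustrative.

```python
import os
import sys
from typing import Mapping, Sequence


def python_meets_minimum(
    version: Sequence[int], minimum: Sequence[int] = (3, 9)
) -> bool:
    """Compare (major, minor) of the given version against the minimum."""
    return tuple(version[:2]) >= tuple(minimum)


def active_conda_env(environ: Mapping[str, str] = os.environ) -> str:
    """Return the active conda environment name ('' if none is active)."""
    return environ.get("CONDA_DEFAULT_ENV", "")


if __name__ == "__main__":
    print("active conda environment:", active_conda_env() or "(none)")
    print("Python version:", sys.version.split()[0])
    print("meets minimum (>=3.9):", python_meets_minimum(sys.version_info))
```

With the environment from this guide active, the first line should report `axolotl` and the Python version should start with `3.10`.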

To check that there are <mark style="color:yellow;">no installed packages in the environment</mark> (a new environment should print nothing), enter the command:

```bash
pip3 freeze
```
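If you prefer to inspect the environment from Python, the standard library can list the installed distributions. This is a rough sketch rather than a replacement for `pip3 freeze`: unlike `pip3 freeze`, it also lists tools such as `pip` and `setuptools`, so a fresh environment will not appear completely empty.

```python
from importlib import metadata
from typing import List


def installed_packages() -> List[str]:
    """Return 'name==version' strings for every distribution Python can see."""
    entries = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:  # unreadable metadata; skip this entry
            continue
        if name:
            entries.add(f"{name}=={version}")
    return sorted(entries, key=str.lower)


if __name__ == "__main__":
    packages = installed_packages()
    print(f"{len(packages)} packages visible to this interpreter")
    for line in packages:
        print(line)
```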

To ensure we only download fresh copies of packages and do not reuse packages from the cache, clear pip's cache with this command:

```bash
pip3 cache purge
```

Because the environment may be empty, we need to install `lit` and `cmake` to install from wheels.

The purpose of running `pip install lit cmake` is to <mark style="color:yellow;">set up a development environment with specific tools needed for compiling, building, testing, or packaging software</mark>, particularly in projects that might involve C/C++ code or require complex build configurations.

```bash
pip3 install lit cmake
```

The output should be similar to this:

```bash
Successfully installed cmake-3.29.2 lit-18.1.3
```

<details>

<summary><mark style="color:green;">What is 'wheels'?</mark>   <mark style="color:purple;">Python Package Format</mark></summary>

<mark style="color:blue;">**Definition**</mark><mark style="color:blue;">:</mark> A wheel is a built package format for Python, denoted with a `.whl` file extension. It's a packaging standard introduced by PEP 427, <mark style="color:yellow;">designed to replace the older</mark> <mark style="color:yellow;"></mark><mark style="color:yellow;">`.egg`</mark> <mark style="color:yellow;"></mark><mark style="color:yellow;">format.</mark>

<mark style="color:blue;">**Advantages**</mark><mark style="color:blue;">:</mark> Wheels are more efficient than the traditional <mark style="color:yellow;">**`setup.py`**</mark> script for installing Python packages.&#x20;

The key benefits include:

<mark style="color:blue;">**Faster Installation**</mark><mark style="color:blue;">:</mark> <mark style="color:yellow;">Wheels are pre-built distributions</mark>, meaning <mark style="color:yellow;">the package does not need to be compiled on the end user's machine.</mark> This significantly speeds up the installation process, especially for packages that contain compiled extensions.

<mark style="color:blue;">**Consistency**</mark><mark style="color:blue;">:</mark> They provide a more consistent installation experience since they include pre-built binaries. This reduces issues stemming from variations in build environments.

<mark style="color:blue;">**Usage**</mark><mark style="color:blue;">:</mark> When you run <mark style="color:yellow;">`pip install package_name`</mark>, `pip` tries to find and download a wheel compatible with your platform and Python version. If it can't find a suitable wheel, it falls back to installing from the source distribution, which can be slower as it may require compiling code.
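The way `pip` matches a wheel to your platform is visible in the wheel's filename, which PEP 427 defines as `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a filename into those parts. The filenames in the usage note are made-up examples, not real release files.

```python
from typing import Dict, Optional


def parse_wheel_filename(filename: str) -> Dict[str, Optional[str]]:
    """Split a wheel filename into the components defined by PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    build_tag = parts[2] if len(parts) == 6 else None  # optional component
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }
```

For instance, `parse_wheel_filename("example_pkg-1.2.3-cp310-cp310-manylinux_2_17_x86_64.whl")` reports a build for CPython 3.10 on 64-bit Linux, while a name ending in `py3-none-any.whl` marks a pure-Python wheel that installs on any platform.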

</details>

{% hint style="info" %}
If you <mark style="color:yellow;">make an error</mark> building your conda environment, you can <mark style="color:yellow;">delete it and start again.</mark>  Follow the instructions below to delete your conda environment and start again:
{% endhint %}

<details>

<summary><mark style="color:green;">How to delete a conda environment</mark></summary>

To delete a Conda environment, you can follow these steps:

Open your terminal

First deactivate any conda environment you may already be in:

```bash
conda deactivate
```

Activate the base environment (or any other environment you're not deleting) to ensure you don't accidentally delete the active environment. You can do this by running:

```bash
conda activate base
```

To list all your Conda environments, use the following command:

```bash
conda env list
```

This will display a list of all the environments installed on your system along with their paths.  You should see the axolotl virtual environment.

To <mark style="color:purple;">delete the axolotl environment</mark>, use the following command:

```bash
conda env remove --name axolotl
```

If asked, confirm the deletion by <mark style="color:yellow;">typing</mark> <mark style="color:yellow;"></mark><mark style="color:yellow;">`y`</mark> and pressing Enter when prompted.

The Conda environment will be deleted, and all its packages and dependencies will be removed from your system.

</details>

{% hint style="info" %}
Always make sure the Python interpreter in VS Code is set to the <mark style="color:yellow;">axolotl virtual environment</mark>
{% endhint %}

### <mark style="color:blue;">The Python Interpreter</mark>

Remember, when using Visual Studio Code with virtual environments, make sure that you have <mark style="color:yellow;">set the Python Interpreter to your virtual environment</mark>.

<mark style="color:green;">**Access the Command Palette**</mark>

Open VS Code, and from the menu, go to View > Command Palette, or use the keyboard shortcut Ctrl+Shift+P.

<figure><img src="/files/ss3WVP1BEkEFvRtRWZn2" alt=""><figcaption></figcaption></figure>

In the Command Palette, type <mark style="color:yellow;">"Python: Select Interpreter"</mark> and select this command when it appears in the list.

<mark style="color:yellow;">The</mark> <mark style="color:yellow;">**Python: Select Interpreter**</mark> command displays a list of available global environments, conda environments, and virtual environments.

The following image, for example, shows several Anaconda installations along with a conda environment and a virtual environment (`env`) that's located within the workspace folder:

<figure><img src="/files/oaXQk8vdYq4nfQfIToWY" alt=""><figcaption><p>Python: Select Interpreter</p></figcaption></figure>

{% hint style="warning" %}
Make sure you <mark style="color:yellow;">**select the axolotl virtual environment**</mark> we have created.
{% endhint %}

<details>

<summary>Reference: <mark style="color:green;">Python Interpreter</mark></summary>

To select an interpreter for Python and ensure your Conda virtual environment works properly in Visual Studio Code (VS Code):

**Access the Command Palette**

Open VS Code, and from the menu, go to View > Command Palette, or use the keyboard shortcut Ctrl+Shift+P.

**Use the Python: Select Interpreter Command**

In the Command Palette, type "Python: Select Interpreter" and select this command when it appears in the list.

**Select an Interpreter**

Upon selecting the "Python: Select Interpreter" command, a list of available Python environments will be displayed. This list includes global environments, Conda environments, and virtual environments. You can choose from the listed environments. The list will look similar to the example shown in the documentation.

**Remember Your Selection**

If you have a folder or workspace open in VS Code when you select an interpreter, the Python extension will remember your choice for that specific workspace. This means that the same interpreter will be used when you reopen that workspace in the future.

**Status Bar Confirmation**

After selecting an interpreter, the selected environment version will be <mark style="color:yellow;">displayed on the right side of the Status Bar in VS Code</mark>. This serves as a confirmation that the interpreter has been set.

**Optional**

**Prevent Automatic Activation:** By default, VS Code will <mark style="color:yellow;">automatically activate the selected environment when you open a terminal within VS Code</mark>. If you want to prevent this automatic activation, you can add `"python.terminal.activateEnvironment": false` to your settings.json file.

**Manually Specify an Interpreter (if needed)**

If <mark style="color:yellow;">VS Code doesn't automatically locate an interpreter you want to use, you can manually specify it.</mark> To do this, run the "Python: Select Interpreter" command and choose the "Enter interpreter path..." option from the top of the interpreters list. You can then either enter the full path to the Python interpreter directly or browse your file system to find it.

<mark style="color:yellow;">Selecting the Python interpreter in Visual Studio Code (VS Code) is necessary</mark> because there are <mark style="color:yellow;">various Python environments and installations that can coexist on a system.</mark>&#x20;

The reason why it can't always be done automatically is due to the complexity of managing and identifying Python environments, especially in cases where multiple versions or virtual environments are present. Here are some key reasons:

**User Preferences:** Developers may have their preferred Python interpreter for different types of projects. For example, they may want to use Python 3.8 for one project and Python 3.9 for another. Automatic selection might not align with their preferences.

**Path Variations:** Python interpreters can be located in various directories, and their paths may not always follow a consistent pattern. Manually selecting the interpreter allows users to specify the exact path.

**Conda Environments:** Conda is a package manager that creates isolated environments for projects. Detecting and managing Conda environments automatically can be complex due to the additional layer of isolation.

**Compatibility:** Python extensions and packages can have compatibility issues with certain Python versions. Manually selecting the interpreter allows users to ensure compatibility with their specific project requirements.

</details>

### <mark style="color:blue;">Pytorch Installation</mark>

### <mark style="color:yellow;">--> The environment should contain Pytorch 2.0</mark>

Go to the PyTorch website and select your options: version, operating system, package (Conda), language, and compute platform. The page will then generate the command to run in your activated Conda environment.

<https://pytorch.org/get-started/locally/>

<figure><img src="/files/oLqwrLKSHFOOGDzp8fX6" alt=""><figcaption><p>Enter in the required details</p></figcaption></figure>

Then copy the generated command and run it in your terminal. <mark style="color:yellow;">Ensure that you are within the conda environment 'axolotl'.</mark> If you are not, enter the command:

```bash
conda activate axolotl
```

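The command below installs <mark style="color:yellow;">PyTorch</mark> on Linux using Conda, with a build for <mark style="color:yellow;">CUDA 12.4</mark>. Supported CUDA versions change between PyTorch releases, so use the exact command the selector generates for your system.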

{% code overflow="wrap" fullWidth="false" %}

```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
```

{% endcode %}

<mark style="color:yellow;">Press (y)</mark> when asked to download the suggested libraries.

This is a breakdown of the command:

* <mark style="color:yellow;">**`conda install`**</mark><mark style="color:yellow;">:</mark> This indicates that you're using Conda
* <mark style="color:yellow;">**`pytorch`**</mark><mark style="color:yellow;">:</mark> This is the main PyTorch library
* <mark style="color:yellow;">**`torchvision`**</mark><mark style="color:yellow;">:</mark> A package for PyTorch that provides utilities for image and video processing
* <mark style="color:yellow;">**`torchaudio`**</mark><mark style="color:yellow;">:</mark> A package for PyTorch tailored for audio processing
* <mark style="color:yellow;">**`pytorch-cuda=12.4`**</mark><mark style="color:yellow;">:</mark> This specifies that you want the PyTorch build that is compatible with CUDA 12.4.
* <mark style="color:yellow;">**`-c pytorch`**</mark><mark style="color:yellow;">:</mark> This tells Conda to install from the <mark style="color:yellow;">**`pytorch`**</mark> channel on Anaconda Cloud.
* <mark style="color:yellow;">**`-c nvidia`**</mark><mark style="color:yellow;">:</mark> This specifies to also use the <mark style="color:yellow;">**`nvidia`**</mark> channel, which is necessary for CUDA-related packages.&#x20;

### <mark style="color:green;">Instruction</mark>

When the PyTorch installation command is executed, the terminal will ask whether you want to install all of the PyTorch packages <mark style="color:yellow;">- reply yes.</mark>

By following these steps, you should have PyTorch installed in the `axolotl` environment on your Ubuntu 20.04 system, ready for the Axolotl installation below.
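Once the installation finishes, it is worth confirming that PyTorch imports correctly and can see the GPU. The sketch below is a minimal check using standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`); the function name and the dictionary keys are illustrative.

```python
from typing import Dict, Optional


def describe_torch_install() -> Optional[Dict[str, object]]:
    """Summarise the PyTorch install, or return None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    cuda_ready = torch.cuda.is_available()
    return {
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": cuda_ready,
        "gpu_name": torch.cuda.get_device_name(0) if cuda_ready else None,
    }


if __name__ == "__main__":
    info = describe_torch_install()
    if info is None:
        print("PyTorch is not importable in this environment")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
```

If `cuda_available` is `False` on a GPU machine, the usual suspects are a CPU-only build or a driver that does not support the CUDA version PyTorch was built for.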

<details>

<summary>Reference: <mark style="color:green;">How does Pytorch relate to CUDA?</mark></summary>

* <mark style="color:yellow;">PyTorch</mark> <mark style="color:blue;">**is an open-source machine learning library**</mark> for building and training neural networks.
* <mark style="color:yellow;">PyTorch</mark> <mark style="color:blue;">**uses CUDA to accelerate its operations on NVIDIA GPUs**</mark>. This acceleration is particularly crucial for training large and complex deep learning models, a process that can be computationally intensive and time-consuming.

<mark style="color:green;">**How They Work Together**</mark>

* <mark style="color:blue;">**Acceleration of Tensor Operations**</mark><mark style="color:blue;">:</mark> PyTorch performs a large number of operations on tensors, which are multi-dimensional arrays. *<mark style="color:yellow;">**CUDA accelerates these operations**</mark>* when executed on NVIDIA GPUs.
* <mark style="color:blue;">**Transparent Usage**</mark><mark style="color:blue;">:</mark> PyTorch *<mark style="color:yellow;">**abstracts the complexity of CUDA,**</mark>* making it easier for users. Developers can write standard PyTorch code, and if a CUDA-enabled GPU is available, PyTorch will automatically use CUDA to accelerate operations.
* <mark style="color:blue;">**CUDA Tensors**</mark><mark style="color:blue;">:</mark> PyTorch introduces 'CUDA tensors', similar to normal tensors but located in the GPU's memory. Operations on CUDA tensors are performed on the GPU, offering significant speed improvements.

<mark style="color:green;">**How to Think About Their Interaction**</mark>

* <mark style="color:blue;">**Complementary Roles**</mark><mark style="color:blue;">:</mark> Think of CUDA as an underlying engine that provides horsepower to PyTorch. While PyTorch handles the creation and manipulation of tensors and neural networks, CUDA provides the capability to execute these operations quickly on a GPU.
* <mark style="color:blue;">**Ease of Development**</mark><mark style="color:blue;">:</mark> From a developer’s perspective, CUDA’s complexities are mostly hidden. You write PyTorch code as usual, and PyTorch, coupled with CUDA, takes care of efficiently executing operations on the GPU.
* <mark style="color:blue;">**Scalability and Performance**</mark><mark style="color:blue;">:</mark> In environments where training large models or processing large datasets is required, the CUDA-PyTorch integration becomes critical. It allows leveraging GPU acceleration for better performance, reducing training time from days to hours or hours to minutes.
* <mark style="color:blue;">**Conditional Usage**</mark><mark style="color:blue;">:</mark> CUDA is utilized by PyTorch only if a compatible NVIDIA GPU is available. Otherwise, PyTorch defaults to using the CPU.
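The "conditional usage" point can be seen in a few lines of code. The sketch below picks the GPU when one is available and falls back to the CPU otherwise, then runs a small matrix multiplication on the chosen device. The function names are illustrative and the tensor sizes are arbitrary.

```python
def pick_device() -> str:
    """Return 'cuda' if PyTorch can use a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def matmul_on_best_device() -> str:
    """Multiply two small tensors and report where the result lives."""
    import torch  # raises ImportError if PyTorch is not installed

    device = pick_device()
    a = torch.randn(4, 4, device=device)  # a CUDA tensor when device == "cuda"
    b = torch.randn(4, 4, device=device)
    product = a @ b  # executed on the GPU for CUDA tensors
    return product.device.type


if __name__ == "__main__":
    try:
        print("result computed on:", matmul_on_best_device())
    except ImportError:
        print("PyTorch is not installed in this environment")
```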

</details>

### <mark style="color:blue;">Install "Packaging" Package</mark>

Now install the <mark style="color:yellow;">**`packaging`**</mark> and <mark style="color:yellow;">**`ninja`**</mark> Python packages.

The <mark style="color:yellow;">**`packaging`**</mark> library provides utilities for version handling, specifiers, markers, requirements, tags, and more, which are often used in package management and installation.

<pre class="language-bash"><code class="lang-bash"><a data-footnote-ref href="#user-content-fn-1">pip3</a> install <a data-footnote-ref href="#user-content-fn-2">packaging</a> ninja
</code></pre>
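To give a feel for what `packaging` does, the sketch below uses its `Version` and `SpecifierSet` classes to test a version string against a requirement such as `>=2.0`. The helper `satisfies` is illustrative; it returns `None` when `packaging` is not installed so the snippet can still be run in an empty environment.

```python
from typing import Optional

try:
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version
except ImportError:  # packaging is not installed in this environment
    SpecifierSet = Version = None


def satisfies(installed: str, requirement: str) -> Optional[bool]:
    """Check a version string against a specifier such as '>=2.0'."""
    if Version is None:
        return None
    return Version(installed) in SpecifierSet(requirement)


if __name__ == "__main__":
    print("2.1.0 satisfies >=2.0:", satisfies("2.1.0", ">=2.0"))
    print("1.13.1 satisfies >=2.0:", satisfies("1.13.1", ">=2.0"))
```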

### <mark style="color:blue;">**Run the Axolotl setup.py file**</mark>

* Ensure that you are in the <mark style="color:yellow;">base directory</mark> that contains the [`setup.py`](/creation-of-environment/setup.py-objectives.md) file.
* Run the installation command:

```bash
pip3 install -e '.[flash-attn,deepspeed]'
```

### <mark style="color:blue;">**`-e`**</mark><mark style="color:blue;">**&#x20;**</mark><mark style="color:blue;">**flag**</mark>

Stands for "editable" mode. When you install a package in editable mode, Python installs the package in a way that allows you to *<mark style="color:yellow;">**modify the package source code**</mark> <mark style="color:yellow;"></mark><mark style="color:yellow;">and see the changes directly without needing to reinstall the package</mark>*. This is particularly useful for development purposes.
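One way to see editable mode in action after the install is to ask Python where it loads the package from. The sketch below assumes the import name is `axolotl`; with an editable install the reported path should point inside your cloned repository rather than a copy in `site-packages`. The function name is illustrative.

```python
import importlib.util
from typing import Optional


def package_location(name: str) -> Optional[str]:
    """Return the file or folder a top-level package is imported from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the package is not installed in this environment
    if spec.origin:
        return spec.origin
    # Namespace packages have no single origin file, so use their folder.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    # "axolotl" is the assumed import name of the package installed here.
    print("axolotl is loaded from:", package_location("axolotl"))
```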

<mark style="color:blue;">**`This is important`**</mark>

The <mark style="color:yellow;">**dot `.`**</mark> <mark style="color:yellow;">represents the current directory</mark>. In this context, it indicates that <mark style="color:yellow;">`pip`</mark> <mark style="color:yellow;">should install the package located in the current directory.</mark>

This directory contains the <mark style="color:yellow;">**`setup.py`**</mark> file, which is the build script for setuptools. It tells setuptools about your package (such as the name and version) and the files that belong to it.

<mark style="color:blue;">**`[flash-attn,deepspeed]`**</mark>

These are extras.&#x20;

Extras are *<mark style="color:yellow;">additional dependencies that are relevant for optional features of the package</mark>*.&#x20;

In this case, <mark style="color:yellow;">**`flash-attn`**</mark> and <mark style="color:yellow;">**`deepspeed`**</mark> are optional dependencies. When you specify these extras, <mark style="color:yellow;">**`pip`**</mark> will also install the dependencies associated with these features as defined in the package's <mark style="color:yellow;">**`setup.py`**</mark>.

This command will install the Axolotl package along with its dependencies as specified in <mark style="color:yellow;">**`setup.py`**</mark>.
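The idea behind extras can be modelled in a few lines. The sketch below is a simplified illustration, not Axolotl's actual `setup.py`: the base dependency name is a placeholder, and the real file may list different packages or pin versions. It shows how the extras named in the brackets add their dependencies to the base list. Unlike `pip`, this sketch treats an unknown extra as an error.

```python
from typing import Dict, Iterable, List

# Hypothetical metadata in the shape setuptools accepts; the package names
# and any version pins in Axolotl's real setup.py may differ.
INSTALL_REQUIRES: List[str] = ["example-core-dependency"]
EXTRAS_REQUIRE: Dict[str, List[str]] = {
    "flash-attn": ["flash-attn"],
    "deepspeed": ["deepspeed"],
}


def requirements_for(extras: Iterable[str]) -> List[str]:
    """Return base requirements plus those of each requested extra."""
    selected = list(INSTALL_REQUIRES)
    for extra in extras:
        if extra not in EXTRAS_REQUIRE:
            raise KeyError(f"unknown extra: {extra}")
        selected.extend(EXTRAS_REQUIRE[extra])
    return selected


if __name__ == "__main__":
    # Mirrors the brackets in: pip3 install -e '.[flash-attn,deepspeed]'
    print(requirements_for(["flash-attn", "deepspeed"]))
```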

{% hint style="info" %}
You can find a detailed explanation of the <mark style="color:blue;">**setup.py script**</mark> here:  [<mark style="color:yellow;">**setup.py decomposition**</mark>](/creation-of-environment/setup.py-objectives.md)
{% endhint %}

<details>

<summary><mark style="color:green;">setup.py installation process</mark></summary>

<mark style="color:green;">**Key Actions**</mark>

<mark style="color:blue;">**Installation Process**</mark><mark style="color:blue;">:</mark> The script is executing the installation of the Axolotl package. This involves several steps:

* <mark style="color:blue;">**Running Egg Info**</mark><mark style="color:blue;">:</mark> Gathering package information and writing it to the `axolotl.egg-info` directory.
* <mark style="color:blue;">**Building Distribution**</mark>: Creating a build directory and compiling the package into an 'egg' format, a type of Python distribution.
* <mark style="color:blue;">**Copying and Installing**</mark><mark style="color:blue;">:</mark> The compiled package is then copied to your Miniconda environment's `site-packages` directory.
* <mark style="color:blue;">**Processing Dependencies**</mark><mark style="color:blue;">:</mark> The script is also handling dependencies, downloading, and installing them as required. This includes packages like `gcsfs`, `s3fs`, `tensorboard`, `gradio`, `fschat`, `art`, `pynvml`, `scikit-learn`, `scipy`, `rouge-score`, and `evaluate`.

#### <mark style="color:green;">Warnings and Recommendations</mark>

1. <mark style="color:blue;">**Deprecation of**</mark><mark style="color:blue;">**&#x20;**</mark><mark style="color:blue;">**`setup.py install`**</mark><mark style="color:blue;">:</mark> The script warns that `setup.py install` is deprecated. This is a Setuptools deprecation warning, encouraging the use of more modern, standards-based tools like `pypa/build` and `pypa/installer`.&#x20;
   * **Implication**: While `setup.py install` still works, *<mark style="color:purple;">**it's recommended to use newer tools for package building and installation in future projects**</mark>* to stay aligned with Python packaging standards.
2. <mark style="color:blue;">**Deprecation of**</mark>**&#x20;**<mark style="color:yellow;">**`easy_install`**</mark>: Similar to the previous warning, there’s a notice about the deprecation of <mark style="color:yellow;">`easy_install`</mark>, a legacy package installation method.

#### <mark style="color:green;">Installation Summary</mark>

* The Axolotl library and its dependencies are being installed into your specified conda environment.
* The package is being built and installed using an older method (<mark style="color:yellow;">`setup.py install`</mark>), which, although operational, is not recommended due to deprecation warnings.
* The process involves the creation of a package distribution in the 'egg' format and the installation of various dependencies listed in the package's `requirements.txt` file.

</details>

### <mark style="color:blue;">Install Logging</mark>

We will be using Weights and Biases for training logging. &#x20;

Weights and Biases <mark style="color:yellow;">should already be installed.</mark> If it is not, install it with:

```bash
pip3 install wandb
```

<mark style="color:green;">**Login to Weights and Biases**</mark>

```bash
wandb login
```

When asked, complete the following fields for your Weights and Biases account:

<mark style="color:blue;">Username:</mark>&#x20;

<mark style="color:blue;">Password:</mark> &#x20;

<mark style="color:blue;">API Token:</mark>&#x20;
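
On a remote virtual machine it is often more convenient to log in without the interactive prompt. The sketch below assumes the key has been exported as the `WANDB_API_KEY` environment variable, which W\&B recognises; the two helper functions are illustrative names, not part of the W\&B SDK.

```python
import os


def resolve_api_key(env=None):
    """Read the API key from the environment, ignoring blank values."""
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    return key or None


def login_from_environment(env=None):
    """Log in with the key from the environment; return False if none is set."""
    key = resolve_api_key(env)
    if key is None:
        return False
    import wandb  # imported lazily so the helper above works without wandb installed

    return bool(wandb.login(key=key))
```

Keeping the key in an environment variable rather than in a script avoids committing it to version control by accident.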

<details>

<summary>References: <mark style="color:green;">Weights and Biases</mark></summary>

To set up Weights and Biases (W\&B) for tracking experiments, follow these steps:

<mark style="color:blue;">**Create an Account and Install W\&B**</mark>

* Sign up for a free account at <https://wandb.ai/site> and log in to your W\&B account.
* Install the W\&B library on your machine in a Python 3 environment using pip:

```bash
pip install wandb
```

<mark style="color:blue;">**Log in to W\&B**</mark>

* In your Python script or notebook, import the W\&B Python SDK and log in using <mark style="color:yellow;">`wandb.login()`</mark>. You will be prompted to provide your API key.

```python
import wandb
wandb.login()
```

<mark style="color:blue;">**Start a Run and Track Hyperparameters**</mark>

* Initialise a W\&B Run object with <mark style="color:yellow;">`wandb.init()`</mark><mark style="color:yellow;">.</mark> You can specify project information and track hyperparameters in the configuration.

```python
run = wandb.init(
    project="my-awesome-project",
    config={
        "learning_rate": 0.01,
        "epochs": 10,
    },
)
```

A run is a fundamental unit in W\&B used to track metrics, logs, jobs, and more.

**Example Script**: Below is an example script that demonstrates how to use W\&B for tracking hyperparameters and metrics in a training script.

```python
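import random

import wandb

# Authenticate; prompts for an API key if none is stored locally.
wandb.login()

# Hyperparameters to record with the run.
epochs = 10
lr = 0.01

# Start a run and attach the hyperparameters as its config.
run = wandb.init(
    project="my-awesome-project",
    config={
        "learning_rate": lr,
        "epochs": epochs,
    },
)

# Random offset so each simulated run produces slightly different curves.
offset = random.random() / 5
print(f"lr: {lr}")

# Simulate training. Starting at epoch 2 avoids dividing by zero in the noise term.
for epoch in range(2, epochs):
    acc = 1 - 2**-epoch - random.random() / epoch - offset
    loss = 2**-epoch + random.random() / epoch + offset
    print(f"epoch={epoch}, accuracy={acc}, loss={loss}")
    wandb.log({"accuracy": acc, "loss": loss})

# Optionally upload the source code with the run:
# run.log_code()

# Mark the run as finished so any buffered data is flushed.
run.finish()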
```

<mark style="color:blue;">**View Results**</mark><mark style="color:blue;">:</mark> After running your script, you can view the tracked metrics, including accuracy and loss, in the W\&B App at <https://wandb.ai/home>.

<mark style="color:blue;">**What's Next?**</mark><mark style="color:blue;">:</mark> Explore other features of the W\&B ecosystem, such as integrations, reports, artifacts, sweeps, and more to enhance your machine learning experiments.

The W\&B documentation also answers common questions, such as how to find your API key, use W\&B in automated environments, set up a local installation, and temporarily disable W\&B logging.
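
As an illustration of temporarily disabling logging, W\&B reads the `WANDB_MODE` environment variable. The sketch below sets it before a run is created. `set_wandb_mode` is a hypothetical helper, and the three modes listed are the commonly documented ones; check the W\&B documentation for your installed version.

```python
import os

# Modes assumed here: "online" (default), "offline" (log locally, sync later),
# and "disabled" (turn W&B calls into no-ops).
VALID_MODES = ("online", "offline", "disabled")


def set_wandb_mode(mode, env=None):
    """Set WANDB_MODE in the given mapping (defaults to the process environment)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    env = os.environ if env is None else env
    env["WANDB_MODE"] = mode
    return env


# Example use: call this before wandb.init() so the run picks up the setting.
# set_wandb_mode("disabled")
```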

</details>

<details>

<summary>Reference: <mark style="color:green;">Weights and Biases python libraries</mark></summary>

When you install the Weights and Biases (W\&B) library using the <mark style="color:yellow;">`pip install wandb`</mark> command, pip also installs its dependencies. The exact set depends on the wandb release; the following libraries were installed along with W\&B in this setup:

<mark style="color:blue;">**Click**</mark><mark style="color:blue;">:</mark> A library for creating command-line interfaces.

<mark style="color:blue;">**GitPython**</mark><mark style="color:blue;">:</mark> A Python library to interact with Git repositories.

<mark style="color:blue;">**psutil**</mark><mark style="color:blue;">:</mark> A cross-platform library for retrieving information on system utilization.

<mark style="color:blue;">**sentry\_sdk**</mark><mark style="color:blue;">:</mark> A Python SDK for Sentry, which helps with error tracking and monitoring.

<mark style="color:blue;">**docker-pycreds**</mark><mark style="color:blue;">:</mark> A library for working with Docker credentials.

<mark style="color:blue;">**setproctitle**</mark><mark style="color:blue;">:</mark> A library for setting the process title on Unix platforms.

<mark style="color:blue;">**appdirs**</mark><mark style="color:blue;">:</mark> A library for determining appropriate platform-specific directories.

<mark style="color:blue;">**protobuf**</mark><mark style="color:blue;">:</mark> A library for working with Protocol Buffers, used for efficient data serialization.

<mark style="color:blue;">**six**</mark><mark style="color:blue;">:</mark> A Python 2 and 3 compatibility library.

<mark style="color:blue;">**gitdb**</mark><mark style="color:blue;">:</mark> A library for interacting with Git object databases.

<mark style="color:blue;">**smmap**</mark><mark style="color:blue;">:</mark> A sliding-window memory map manager, used by gitdb to read large files efficiently.

These libraries are dependencies that are required by W\&B to function properly.&#x20;

They help provide various functionalities, including version control, system monitoring, and efficient data serialisation, which are useful for tracking machine learning experiments effectively using W\&B.
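
Because the dependency list changes between releases, you may prefer to read it from the installed package metadata rather than rely on a fixed list. The following is a minimal sketch using the standard library; the function names are illustrative, and the name parsing is deliberately simple rather than a full requirement parser.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the bare project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1) if match else ""


def declared_dependencies(distribution="wandb"):
    """List the core dependencies a distribution declares, skipping optional extras."""
    try:
        requirements = metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []  # the distribution is not installed in this environment
    names = {
        requirement_name(req)
        for req in requirements
        if not re.search(r"extra\s*==", req)  # ignore requirements tied to extras
    }
    return sorted(name for name in names if name)


# Example use:
# print(declared_dependencies("wandb"))
```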

</details>

<details>

<summary><mark style="color:green;">What are the main classes within the Transformers library?</mark></summary>

### <mark style="color:purple;">The main classes within the Hugging Face Transformers library are:</mark>

1. <mark style="color:blue;">**Model Classes**</mark><mark style="color:blue;">:</mark> These classes correspond to specific transformer models. Each model class in the library is usually named after the model it represents, such as `BertModel` for BERT, `GPT2Model` for GPT-2, `T5Model` for T5, etc. These classes are used to <mark style="color:yellow;">instantiate model architectures</mark>, either from pretrained weights or from scratch.
2. <mark style="color:blue;">**Tokenizer Classes**</mark><mark style="color:blue;">:</mark> Tokenizers are responsible for preprocessing text input for transformer models. They <mark style="color:yellow;">convert text into a format that is understandable by the model.</mark> Classes like `BertTokenizer` and `GPT2Tokenizer` handle this task. There are also "fast" tokenizers (e.g., `BertTokenizerFast`), which are optimized for speed and provide additional functionalities.
3. <mark style="color:blue;">**Configuration Classes**</mark>: Configuration classes store the <mark style="color:yellow;">configurations of a model</mark> — for example, `BertConfig`. These classes contain all the settings necessary to build a model, such as the <mark style="color:yellow;">number of layers, hidden units, and attention heads</mark>.
4. <mark style="color:blue;">**Pipeline Classes**</mark><mark style="color:blue;">:</mark> The pipeline class abstracts away much of the <mark style="color:yellow;">preprocessing and postprocessing work involved in using models</mark>. For instance, the <mark style="color:yellow;">`pipeline`</mark> function allows users to perform tasks like text classification, question answering, and text generation using a simple API.
5. <mark style="color:blue;">**Trainer Class**</mark><mark style="color:blue;">:</mark> The `Trainer` class provides an easy-to-use <mark style="color:yellow;">interface for training, evaluating, and fine-tuning transformer models</mark>. It handles many training details like data collation, optimization, and logging.
6. <mark style="color:blue;">**Data Collator Classes**</mark><mark style="color:blue;">:</mark> These are utility classes used to collate batches of data. They are especially useful when dealing with variable-sized input, such as in language modeling tasks.
7. <mark style="color:blue;">**Optimizer and Scheduler Classes**</mark><mark style="color:blue;">:</mark> While not exclusive to Transformers, these classes are often used in conjunction with model training. They define the <mark style="color:yellow;">optimization algorithm and learning rate schedule</mark> used during training.
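
To make the relationship between configuration and model classes concrete, the sketch below builds a very small BERT from scratch. It assumes the `transformers` package is installed (and `torch` for the model itself); the settings are hypothetical values chosen only to keep the example cheap, and the helper names are illustrative.

```python
# Hypothetical tiny settings; hidden_size must be divisible by num_attention_heads.
TINY_BERT_SETTINGS = {
    "num_hidden_layers": 2,
    "hidden_size": 128,
    "num_attention_heads": 4,
    "intermediate_size": 256,
}


def build_tiny_config():
    """Create a BertConfig holding the architecture settings (needs transformers)."""
    from transformers import BertConfig  # imported lazily; optional dependency

    return BertConfig(**TINY_BERT_SETTINGS)


def build_tiny_model():
    """Instantiate an untrained BertModel from the config (also needs torch)."""
    from transformers import BertModel

    return BertModel(build_tiny_config())


# Example use:
# model = build_tiny_model()
# print(model.config.num_hidden_layers)
```

A model built this way has randomly initialised weights. Loading pretrained weights uses the `from_pretrained` class method of the same model class instead.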

</details>

#### <mark style="color:purple;">If you are interested in understanding the libraries and processes installed via the setup.py script, please review the following pages.</mark>

[^1]: `pip3` is specifically designed for managing packages in Python 3 environments. It ensures that any packages you install or manage using it are for Python 3.

[^2]: * The `packaging` library in Python provides utilities for version handling, specifiers, markers, requirements, tags, and other functionalities related to package management.
    * It is often used in projects that need to parse, compare, or handle versions and requirements of Python packages.


