# Downloading the model

### <mark style="color:blue;">**Access HuggingFace Hub**</mark>

* Visit the [HuggingFace hub](https://huggingface.co/models).
* Use the search bar to look for "Llama v3" models.

<figure><img src="https://148429626-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgmygEk0ifLuns7P9aMW%2Fuploads%2FZdZMW5lsfCKlrGbQ24oi%2Fimage.png?alt=media&#x26;token=510872ae-f064-4f41-9da8-0e1f8adc5792" alt=""><figcaption><p>The full suite of Meta models</p></figcaption></figure>

### <mark style="color:blue;">**Identify the Correct Models**</mark>

* Click on the Meta Llama 3 collection.

<figure><img src="https://148429626-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgmygEk0ifLuns7P9aMW%2Fuploads%2FSpbY6K9FvBwRER8xx9VC%2Fimage.png?alt=media&#x26;token=09ad777b-9976-49a5-aa7e-e6b2d51f0638" alt=""><figcaption></figcaption></figure>

We are downloading Meta-Llama-3-8B.

{% hint style="info" %}
If you have not done so already, you will need to request access to the Llama 3 models. It is a simple process.
{% endhint %}

Note that this model has not yet been converted to Hugging Face weights.

* Models with 'hf' in their name are already converted to HuggingFace checkpoints. This means they are ready to use and require no further conversion.

### <mark style="color:blue;">Download the model into your model directory</mark>

First, we must log in to Hugging Face.

If you have not done so already, configure Git's credential helper so your Hugging Face token can be stored:

```bash
git config --global credential.helper store
```
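To double-check that the setting took effect, you can read the value back (a minimal sanity check; note that the `store` helper saves credentials in plain text in `~/.git-credentials`):

```shell
# Set the credential helper (same command as above) and read it back;
# the second command should print "store".
git config --global credential.helper store
git config --global credential.helper
```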

Then log in to Hugging Face:

```bash
huggingface-cli login
```

When prompted for the authentication token, paste:

```
Your Hugging Face token....
```

<details>

<summary><mark style="color:green;">What is a Huggingface User Token</mark></summary>

User Access Tokens on the Hugging Face platform serve as a secure method for authenticating applications and notebooks to access Hugging Face services.&#x20;

#### <mark style="color:green;">Purpose of User Access Tokens</mark>

* **Preferred Authentication Method**: They are the recommended way to authenticate an application or notebook to Hugging Face services.
* **Management**: Managed through the user's settings on the Hugging Face platform.

#### <mark style="color:green;">Scope and Roles</mark>

* **Read Role**: Tokens with this role provide read access to both public and private repositories that the user or their organization owns. They are suitable for tasks like <mark style="color:yellow;">downloading private models or performing inference.</mark>
* **Write Role**: In addition to read privileges, tokens with the write role can modify content in repositories where the user has write access. This includes creating or updating repository content, such as training or modifying a model card<mark style="color:yellow;">.</mark>

#### <mark style="color:green;">Managing User Access Tokens</mark>

* **Creation**: Access tokens are created in the user's settings under the Access Tokens tab.
* **Customization**: Users can select a role and name for each token.
* **Control**: Tokens can be deleted or refreshed for security purposes.

#### <mark style="color:green;">Usage of User Access Tokens</mark>

* **Versatility**: Can be used in various ways, including:
  * <mark style="color:blue;">As a replacement for a password for Git operations or basic authentication.</mark>
  * <mark style="color:blue;">As a bearer token for calling the Inference API.</mark>
  * Within Hugging Face Python libraries, like `transformers` or `datasets`, by passing the token for accessing private models or datasets.
* **Security Warning**: Users are cautioned to safeguard their tokens to prevent unauthorized access to their private repositories.
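As a concrete sketch of the bearer-token usage described above (the token value is an illustrative placeholder, not a real credential):

```shell
# Hypothetical example: build an HTTP Authorization header from a token.
# In practice, export HF_TOKEN from your environment rather than
# hard-coding it; the value below is a placeholder.
HF_TOKEN="hf_xxxxxxxxxxxx"
AUTH_HEADER="Authorization: Bearer $HF_TOKEN"

# An Inference API call would then pass this header (not executed here):
#   curl https://api-inference.huggingface.co/models/<model> -H "$AUTH_HEADER"
echo "$AUTH_HEADER"
```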

#### <mark style="color:green;">Best Practices</mark>

* **Separate Tokens for Different Uses**: Create distinct tokens for different applications or contexts (e.g., local machine, Colab notebook, custom servers).
* **Role Appropriateness**: Assign only the necessary role to each token. If only read access is needed, limit the token to the read role.
* **Token Management**: Regularly rotate and manage tokens to ensure security, especially if a token is suspected to be compromised.

</details>

When asked whether you want to add your Hugging Face token as a <mark style="color:yellow;">**git credential,**</mark> say yes. You should see the following output:

```
Token is valid (permission: read).
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /home/paperspace/.cache/huggingface/token
Login successful
```

Then install Git Large File Storage so we can download the model (which is large):

```bash
git lfs install
```

If successful, the following output will be displayed in your terminal:

```
Updated git hooks.
Git LFS initialized.
```

<details>

<summary><mark style="color:green;"><strong>What is Git large file storage</strong></mark></summary>

`Git Large File Storage (LFS)` is an extension for Git that addresses issues related to handling large files and binary data in Git repositories.

#### <mark style="color:green;">Why Git LFS is Necessary</mark>

1. **Limitations with Large Files in Git**:
   * Standard Git is excellent for handling text files (like code), which are typically small and benefit from Git's delta compression (storing only changes between versions). However, <mark style="color:yellow;">Git isn't optimized for large binary files (like images, videos, datasets, etc.)</mark><mark style="color:blue;">.</mark> These files can dramatically increase the size of the repository and degrade performance.
2. **Efficiency and Performance**:
   * Cloning and pulling from a repository with large files can be slow, consume a lot of bandwidth, and require significant storage space on every developer's machine. This inefficiency can hinder collaboration and development speed.
3. **Repository Bloat**:
   * In standard Git, every version of every file is stored in the repository's history. This means that even if a large file is deleted from the current working directory, its history still resides in the Git database, leading to a bloated repository.

#### <mark style="color:green;">How Git LFS Works</mark>

1. **Pointer Files**:
   * Git LFS replaces large files in your repository with tiny pointer files. When you commit a large file, Git LFS stores a reference (pointer) to that file in your repository, while the actual file data is stored in a separate server-side LFS storage.
2. **LFS Storage**:
   * The large file contents are stored on a remote server configured for LFS, typically alongside your Git repository hosting service (like GitHub, GitLab, Bitbucket, etc.). This storage is separate from your main Git repository storage.
3. **Version Tracking**:
   * Git LFS tracks versions of the large files separately. Every time you push or pull changes, Git LFS uploads or downloads the correct version of the large file from the LFS server, ensuring that you have the correct files in your working copy.
4. **Selective Download**:
   * When cloning or pulling a repository, Git LFS downloads only the versions of large files needed for your current commit, reducing the time and bandwidth compared to downloading the entire history of every file.
5. **Compatibility**:
   * Git LFS is compatible with existing Git services and workflows. It's an extension to Git, so you use the same Git commands. Repositories using Git LFS are still standard Git repos, ensuring backward compatibility.
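The pointer-file mechanism above can be illustrated directly. A tracked file is replaced in the repository by a tiny three-line pointer like the one below (the oid and size values are illustrative placeholders):

```shell
# Write a sample Git LFS pointer file; this three-line format is what
# actually gets committed to the repo. The oid and size are placeholders.
cat > pointer.txt <<'EOF'
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
EOF

# The pointer is only three lines; the real content lives in LFS storage.
wc -l < pointer.txt
```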

</details>

Create a models directory:

```bash
mkdir models
```

Move into the models directory:

```bash
cd models
```

Enter the git clone command:

```bash
git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B
```

The outcome should be a new folder called Meta-Llama-3-8B containing all of the model files.
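One common pitfall: if `git lfs install` was skipped before cloning, the large files in the folder are unfetched LFS pointers rather than real weights. A quick way to tell them apart, sketched here against a simulated file (in practice, point it at a real model file such as one of the `.safetensors` files):

```shell
# Simulate an unfetched LFS pointer; in practice, check a real model file.
printf 'version https://git-lfs.github.com/spec/v1\n' > sample.safetensors

# Real weight files start with binary data; LFS pointers start with "version".
if head -c 7 sample.safetensors | grep -q 'version'; then
  echo "pointer only - run 'git lfs pull' inside the repo to fetch real files"
else
  echo "real file"
fi
```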

Below is a screenshot from VS Code showing all of the model files.

<figure><img src="https://148429626-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgmygEk0ifLuns7P9aMW%2Fuploads%2FtLBAYCaQxdlLQWkyjXGc%2Fimage.png?alt=media&#x26;token=bb149a76-2190-49be-91bc-2e7f6b7acc19" alt=""><figcaption><p>Snippet from VS Code showing the downloaded model files</p></figcaption></figure>
