Llama2


Meta has introduced the Llama 2 family of large language models (LLMs), spanning from 7 billion to 70 billion parameters.

These models, particularly the Llama-2-Chat variants, are fine-tuned for dialogue applications and demonstrate superior performance compared to many open-source chat models, aligning closely with industry leaders like ChatGPT and PaLM in benchmarks of helpfulness and safety.

Model Variations

Llama 2 offers several sizes (7B, 13B, 70B) and includes both pretrained and fine-tuned models, with Llama-2-Chat being the standout for dialogue tasks.

Architecture and Training

Llama 2 employs an auto-regressive transformer architecture. The models are trained using a mix of publicly available data, with the 70B variant incorporating Grouped-Query Attention for enhanced inference scalability. Training occurred between January and July 2023.

Instructions for Accessing and Using Llama v2 Models

Finding the Models

  1. Access HuggingFace Hub:

    • Visit the Hugging Face Hub at https://huggingface.co.

    • Use the search bar to look for "Llama v2" models.

  2. Identify the Correct Models:

    • Look for models that have 'hf' in their name. For example, meta-llama/Llama-2-7b-chat-hf.

    • Models with 'hf' in their name are already converted to HuggingFace checkpoints. This means they are ready to use and require no further conversion.

  3. Download or Use Models:

    • You can download these models directly for offline use, or you can reference them in your code by using their HuggingFace hub path.
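
For instance, here is a minimal sketch of referencing the model by its hub path from Python, assuming the transformers library is installed and your account has been granted access to the gated Llama 2 repositories:

# Minimal sketch: load a Llama 2 chat checkpoint by its Hugging Face hub path.
# Assumes `transformers` is installed and the Llama 2 licence has been accepted
# on the Hub for your account.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # an 'hf' checkpoint, no conversion needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)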

Download the model into your LLama2-recipe directory

First, we must connect to Hugging Face.

Configure git's credential helper so it can store your Hugging Face token:

git config --global credential.helper store

Then log in to Hugging Face:

huggingface-cli login

When asked to enter the authentication token, paste your own User Access Token (it has the form hf_xxxxxxxxxxxxxxxxxxxx); you can create one under Settings → Access Tokens on the Hugging Face website.
What is a Hugging Face User Token?

User Access Tokens on the Hugging Face platform serve as a secure method for authenticating applications and notebooks to access Hugging Face services.

Purpose of User Access Tokens

  • Preferred Authentication Method: They are the recommended way to authenticate an application or notebook to Hugging Face services.

  • Management: Managed through the user's settings on the Hugging Face platform.

Scope and Roles

  • Read Role: Tokens with this role provide read access to both public and private repositories that the user or their organization owns. They are suitable for tasks like downloading private models or performing inference.

  • Write Role: In addition to read privileges, tokens with the write role can modify content in repositories where the user has write access. This includes creating or updating repository content, such as training or modifying a model card.

Organization API Tokens

  • Deprecated: Organization API Tokens have been deprecated. User Access Tokens now encompass permissions based on individual roles within an organization.

Managing User Access Tokens

  • Creation: Access tokens are created in the user's settings under the Access Tokens tab.

  • Customization: Users can select a role and name for each token.

  • Control: Tokens can be deleted or refreshed for security purposes.

Usage of User Access Tokens

  • Versatility: Can be used in various ways, including:

    • As a replacement for a password for Git operations or basic authentication.

    • As a bearer token for calling the Inference API.

    • Within Hugging Face Python libraries, like transformers or datasets, by passing the token when accessing private models or datasets (see the sketch after this list).

  • Security Warning: Users are cautioned to safeguard their tokens to prevent unauthorized access to their private repositories.
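
As a hedged illustration of passing a token directly to the Hugging Face Python libraries, here is a minimal sketch (the token value is a placeholder; older library versions use the use_auth_token argument instead of token):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder token - substitute your own from Settings -> Access Tokens.
hf_token = "hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Pass the token so gated or private repositories can be accessed.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=hf_token)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=hf_token)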

Best Practices

  • Separate Tokens for Different Uses: Create distinct tokens for different applications or contexts (e.g., local machine, Colab notebook, custom servers).

  • Role Appropriateness: Assign only the necessary role to each token. If only read access is needed, limit the token to the read role.

  • Token Management: Regularly rotate and manage tokens to ensure security, especially if a token is suspected to be compromised.

In summary, User Access Tokens are a secure and flexible way to interact with the Hugging Face Hub, allowing for controlled access to resources based on defined roles. Proper management and usage of these tokens are crucial for maintaining security and operational efficiency.

When asked whether you want to add the token as a git credential, say yes. You should see output like this:

Token is valid (permission: read).
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /home/paperspace/.cache/huggingface/token
Login successful
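
If you prefer to authenticate from Python rather than the CLI, the huggingface_hub library provides an equivalent login helper; a minimal sketch (the token value is a placeholder):

from huggingface_hub import login

# Programmatic equivalent of `huggingface-cli login`: stores the token in the
# local Hugging Face cache and, optionally, in your git credential helper.
login(token="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", add_to_git_credential=True)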

Then install Git Large File Storage (LFS) so we can download the model weights, which are large.

git lfs install

If successful, the following output will be displayed in your terminal

Updated git hooks.
Git LFS initialized.
What is Git Large File Storage?

Git Large File Storage (LFS) is an extension for Git that addresses issues related to handling large files and binary data in Git repositories.

Why Git LFS is Necessary

  1. Limitations with Large Files in Git:

    • Standard Git is excellent for handling text files (like code), which are typically small and benefit from Git's delta compression (storing only changes between versions). However, Git isn't optimized for large binary files (like images, videos, datasets, etc.). These files can dramatically increase the size of the repository and degrade performance.

  2. Efficiency and Performance:

    • Cloning and pulling from a repository with large files can be slow, consume a lot of bandwidth, and require significant storage space on every developer's machine. This inefficiency can hinder collaboration and development speed.

  3. Repository Bloat:

    • In standard Git, every version of every file is stored in the repository's history. This means that even if a large file is deleted from the current working directory, its history still resides in the Git database, leading to a bloated repository.

How Git LFS Works

  1. Pointer Files:

    • Git LFS replaces large files in your repository with tiny pointer files. When you commit a large file, Git LFS stores a reference (pointer) to that file in your repository, while the actual file data is stored in separate server-side LFS storage (a schematic pointer file is shown after this list).

  2. LFS Storage:

    • The large file contents are stored on a remote server configured for LFS, typically alongside your Git repository hosting service (like GitHub, GitLab, Bitbucket, etc.). This storage is separate from your main Git repository storage.

  3. Version Tracking:

    • Git LFS tracks versions of the large files separately. Every time you push or pull changes, Git LFS uploads or downloads the correct version of the large file from the LFS server, ensuring that you have the correct files in your working copy.

  4. Selective Download:

    • When cloning or pulling a repository, Git LFS downloads only the versions of large files needed for your current commit, reducing the time and bandwidth compared to downloading the entire history of every file.

  5. Compatibility:

    • Git LFS is compatible with existing Git services and workflows. It's an extension to Git, so you use the same Git commands. Repositories using Git LFS are still standard Git repos, ensuring backward compatibility.
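
To make the pointer-file idea from step 1 concrete, the file committed to the repository is just a small text stub along these lines (the hash and size here are schematic, not taken from the Llama 2 repository):

version https://git-lfs.github.com/spec/v1
oid sha256:<sha-256 hash of the actual file contents>
size <file size in bytes>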

Using Git LFS

  1. Installation:

    • First, install Git LFS on your machine. It's a separate software package that integrates with Git.

  2. Initialization:

    • In your Git repository, initialize Git LFS using git lfs install. This step configures Git to use LFS.

  3. Tracking Large Files:

    • Specify which files to track with LFS by file extension or pattern using git lfs track "*.ext", where *.ext is the pattern for files you want to track (like *.mp4 for video files).

  4. Commit and Push:

    • After tracking, these files are handled by Git LFS. When you commit and push these files, they are stored in the LFS storage.

In Summary

Git LFS is an essential tool for teams dealing with large binary files in their repositories. It maintains Git's efficiency and performance while handling large files in a more optimized manner. By storing the actual file data separately and using pointer files in the repository, Git LFS prevents repository bloat, speeds up operations like clone and pull, and makes handling large files more manageable without altering the standard Git workflow.

Enter the git clone command:

git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

The outcome should be a new folder called Llama-2-7b-chat-hf containing all of the model files.

Cloning Process:

  • Object enumeration: the Git remote server started by enumerating objects and found 88 objects in total.

  • Object download: all 88 objects were downloaded successfully. The delta 0 in the output indicates that these objects had no delta compression (differences between similar objects), which is common for initial clones.

  • Unpacking objects: Git then unpacked these objects, which totalled 499.00 KiB; the download speed was approximately 4.50 MiB/s.

  • Filtering content with Git LFS: the repository uses Git LFS to handle large files. During the cloning process, Git LFS filtered and downloaded the large files (totalling 9.10 GiB) at a rate of about 16.73 MiB/s.
Issues with Git LFS Files on Windows:

After cloning, you may see a note about issues with two files: model-00001-of-00002.safetensors and pytorch_model-00001-of-00002.bin. These large, LFS-managed files may not have been copied correctly. This kind of issue can occur for various reasons, such as LFS quota limits, network problems, or file system limitations, especially on Windows.

Reference to Git LFS Help:

The message See: 'git lfs help smudge' points you to the Git LFS smudge command, which converts Git LFS pointer files into the actual large files.

Overall Time Taken:

The entire cloning process took 9m17s (9 minutes and 17 seconds).

Congratulations, you have now downloaded the models.
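
As an aside, if git clone proves slow or unreliable (for example, the Git LFS issues noted above), the same files can also be fetched with the huggingface_hub Python library; a minimal sketch, assuming you are already logged in and have access to the gated repository:

from huggingface_hub import snapshot_download

# Download the full repository snapshot (config, tokenizer and weight shards)
# into a local folder, as an alternative to `git clone`.
snapshot_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",
    local_dir="Llama-2-7b-chat-hf",
)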
