Meta has introduced the Llama 2 family of large language models (LLMs), spanning from 7 billion to 70 billion parameters.
These models, particularly the Llama-2-Chat variants, are fine-tuned for dialogue applications and outperform many open-source chat models; in human evaluations of helpfulness and safety they are competitive with closed-source models such as ChatGPT and PaLM.
Model Variations
Llama 2 offers several sizes (7B, 13B, 70B) and includes both pretrained and fine-tuned models, with Llama-2-Chat being the standout for dialogue tasks.
Architecture and Training
Llama 2 employs an auto-regressive transformer architecture. The models are trained using a mix of publicly available data, with the 70B variant incorporating Grouped-Query Attention for enhanced inference scalability. Training occurred between January and July 2023.
Instructions for Accessing and Using Llama 2 Models
Look for models that have 'hf' in their name. For example, a model might be named Llama-2-7b-hf.
Models with 'hf' in their name have already been converted to Hugging Face checkpoints. This means they are ready to use and require no further conversion.
Download or Use Models:
You can download these models directly for offline use, or you can reference them in your code by using their HuggingFace hub path.
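For reference, the converted checkpoints on the Hub follow a consistent naming pattern. The repository IDs below assume the official meta-llama organization, which gates access until a request is approved:
meta-llama/Llama-2-7b-hf          # pretrained, 7B
meta-llama/Llama-2-7b-chat-hf     # chat fine-tuned, 7B
meta-llama/Llama-2-13b-chat-hf    # chat fine-tuned, 13B
meta-llama/Llama-2-70b-chat-hf    # chat fine-tuned, 70B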
Download the model into your LLama2-recipe directory
First, we must log in to Hugging Face.
Configure a git credential store so the Hugging Face token can be saved
git config --global credential.helper store
Then log in to Hugging Face
huggingface-cli login
When asked to enter the token, paste your own User Access Token (it starts with hf_), for example:
hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
What is a Hugging Face User Access Token
User Access Tokens on the Hugging Face platform serve as a secure method for authenticating applications and notebooks to access Hugging Face services.
Purpose of User Access Tokens
Preferred Authentication Method: They are the recommended way to authenticate an application or notebook to Hugging Face services.
Management: Managed through the user's settings on the Hugging Face platform.
Scope and Roles
Read Role: Tokens with this role provide read access to both public and private repositories that the user or their organization owns. They are suitable for tasks like downloading private models or performing inference.
Write Role: In addition to read privileges, tokens with the write role can modify content in repositories where the user has write access. This includes creating or updating repository content, such as training or modifying a model card.
Organization API Tokens
Deprecated: Organization API Tokens have been deprecated. User Access Tokens now encompass permissions based on individual roles within an organization.
Managing User Access Tokens
Creation: Access tokens are created in the user's settings under the Access Tokens tab.
Customization: Users can select a role and name for each token.
Control: Tokens can be deleted or refreshed for security purposes.
Usage of User Access Tokens
Versatility: Can be used in various ways, including:
As a replacement for a password for Git operations or basic authentication.
As a bearer token for calling the Inference API (see the example after this list).
Within Hugging Face Python libraries, like transformers or datasets, by passing the token for accessing private models or datasets.
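As an illustration of the bearer-token usage above, here is a minimal sketch of a call to the hosted Inference API; the model path and the hf_xxx placeholder are examples and must be replaced with a model you can access and your own token:
curl https://api-inference.huggingface.co/models/meta-llama/Llama-2-7b-chat-hf \
  -H "Authorization: Bearer hf_xxx" \
  -d '{"inputs": "Hello, Llama!"}'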
Security Warning: Users are cautioned to safeguard their tokens to prevent unauthorized access to their private repositories.
Best Practices
Separate Tokens for Different Uses: Create distinct tokens for different applications or contexts (e.g., local machine, Colab notebook, custom servers).
Role Appropriateness: Assign only the necessary role to each token. If only read access is needed, limit the token to the read role.
Token Management: Regularly rotate and manage tokens to ensure security, especially if a token is suspected to be compromised.
In summary, User Access Tokens are a secure and flexible way to interact with the Hugging Face Hub, allowing for controlled access to resources based on defined roles. Proper management and usage of these tokens are crucial for maintaining security and operational efficiency.
When asked whether you want to add the token as a git credential, say yes. You should see output like this:
Token is valid (permission: read).
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /home/paperspace/.cache/huggingface/token
Login successful
Then set up Git Large File Storage (Git LFS) so we can download the model (which is large).
git lfs install
If successful, the following output will be displayed in your terminal
Updated git hooks.
Git LFS initialized.
What is Git Large File Storage (LFS)
Git Large File Storage (LFS) is an extension for Git that addresses issues related to handling large files and binary data in Git repositories.
Why Git LFS is Necessary
Limitations with Large Files in Git:
Standard Git is excellent for handling text files (like code), which are typically small and benefit from Git's delta compression (storing only changes between versions). However, Git isn't optimized for large binary files (like images, videos, datasets, etc.). These files can dramatically increase the size of the repository and degrade performance.
Efficiency and Performance:
Cloning and pulling from a repository with large files can be slow, consume a lot of bandwidth, and require significant storage space on every developer's machine. This inefficiency can hinder collaboration and development speed.
Repository Bloat:
In standard Git, every version of every file is stored in the repository's history. This means that even if a large file is deleted from the current working directory, its history still resides in the Git database, leading to a bloated repository.
How Git LFS Works
Pointer Files:
Git LFS replaces large files in your repository with tiny pointer files. When you commit a large file, Git LFS stores a reference (pointer) to that file in your repository, while the actual file data is stored in a separate server-side LFS storage.
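For illustration, a pointer file committed in place of a large file is just a few lines of plain text along these lines (the hash and size here are made up):
version https://git-lfs.github.com/spec/v1
oid sha256:ab12cd34ef567890ab12cd34ef567890ab12cd34ef567890ab12cd34ef567890
size 9985774059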
LFS Storage:
The large file contents are stored on a remote server configured for LFS, typically alongside your Git repository hosting service (like GitHub, GitLab, Bitbucket, etc.). This storage is separate from your main Git repository storage.
Version Tracking:
Git LFS tracks versions of the large files separately. Every time you push or pull changes, Git LFS uploads or downloads the correct version of the large file from the LFS server, ensuring that you have the correct files in your working copy.
Selective Download:
When cloning or pulling a repository, Git LFS downloads only the versions of large files needed for your current commit, reducing the time and bandwidth compared to downloading the entire history of every file.
Compatibility:
Git LFS is compatible with existing Git services and workflows. It's an extension to Git, so you use the same Git commands. Repositories using Git LFS are still standard Git repos, ensuring backward compatibility.
Using Git LFS
Installation:
First, install Git LFS on your machine. It's a separate software package that integrates with Git.
Initialization:
In your Git repository, initialize Git LFS using git lfs install. This step configures Git to use LFS.
Tracking Large Files:
Specify which files to track with LFS by file extension or pattern using git lfs track "*.ext", where *.ext is the pattern for files you want to track (like *.mp4 for video files).
Commit and Push:
After tracking, these files are handled by Git LFS. When you commit and push these files, they are stored in the LFS storage (a combined example follows these steps).
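Putting the steps above together, here is a minimal sketch for a repository that stores model weights; the *.safetensors pattern and file name are illustrative:
git lfs install                           # configure Git to use LFS (once per machine)
git lfs track "*.safetensors"             # record the tracking pattern in .gitattributes
git add .gitattributes model.safetensors  # stage the tracking rules and the large file
git commit -m "Add model weights via Git LFS"
git push                                  # pointers go to Git, file contents go to LFS storage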
In Summary
Git LFS is an essential tool for teams dealing with large binary files in their repositories. It maintains Git's efficiency and performance while handling large files in a more optimized manner. By storing the actual file data separately and using pointer files in the repository, Git LFS prevents repository bloat, speeds up operations like clone and pull, and makes handling large files more manageable without altering the standard Git workflow.
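With Git LFS set up, the model repository can now be cloned into the working directory. A minimal sketch, assuming access to the gated meta-llama/Llama-2-7b-chat-hf repository has been granted and that the saved Hugging Face credentials are used by git:
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf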
The outcome should be a new folder called Llama-2-7b-chat-hf with all the appropriate files in it.
Cloning Process:
Object Enumeration: The Git remote server started by enumerating objects.
It found 88 objects in total.
Object Download: All 88 objects were successfully downloaded.
The delta 0 indicates that these objects did not have delta compression (differences between similar objects), which is common for initial clones.
Unpacking Objects: Git then unpacked these objects, which totalled 499.00 KiB in size.
The download speed was approximately 4.50 MiB/s.
Filtering Content with Git LFS: The repository uses Git LFS to handle large files. During the cloning process, Git LFS filtered and downloaded large files (totaling 9.10 GiB) at a rate of about 16.73 MiB/s.
Issues with Git LFS Files on Windows:
After cloning, there is a note about encountering issues with two files: model-00001-of-00002.safetensors and pytorch_model-00001-of-00002.bin. These are large, LFS-managed files that may not have been copied correctly. This kind of warning can occur for reasons such as LFS quota limits, network issues, or file system limitations, especially on Windows.
Reference to Git LFS Help:
The message See: 'git lfs help smudge' for more details points to the Git LFS smudge documentation. The smudge filter is the part of Git LFS that converts pointer files into the actual large files when they are checked out.
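If the weight files turn out to be small pointer files rather than multi-gigabyte weights, one common remedy (not part of the original walkthrough) is to re-fetch the LFS objects from inside the cloned folder:
cd Llama-2-7b-chat-hf
git lfs pull    # re-downloads any LFS objects that were not fully smudged during clone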
Overall Time Taken:
The entire cloning process took 9m17s (9 minutes and 17 seconds).
Congratulations, you have now downloaded the model.