Axolotl Dependencies

The libraries contained within the Axolotl virtual environment

AutoGPTQ

AutoGPTQ is a software package designed to facilitate the quantization of large language models (LLMs), such as those based on the GPT architecture.

It provides a user-friendly API that enables developers and researchers to efficiently apply weight-only quantization to these models using the GPTQ algorithm. Here's a detailed explanation of the key aspects and features of AutoGPTQ:

Key Features

Weight-Only Quantization: AutoGPTQ focuses on quantizing the weights of a model, which can significantly reduce the memory footprint and improve the computational efficiency of the model without requiring extensive changes to the model architecture or training procedures.

User-Friendly APIs: The package is designed to be easy to use, making the process of quantizing models accessible to a wider audience, including those who may not have a deep background in machine learning or optimization.

Integration with Popular Tools: AutoGPTQ is integrated with the transformers library by Hugging Face, along with other optimization tools like optimum and peft. This integration allows for seamless quantization within existing workflows that use these popular libraries.

Performance Improvements

AutoGPTQ includes support for specialized hardware acceleration, such as Marlin int4*fp16 matrix multiplication kernels, which can dramatically increase the speed of inference when using quantized models. The performance comparisons published by the project show that quantized models can achieve higher throughput, in tokens per second, than their unquantized counterparts.
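As a rough sketch of how this looks in practice, GPTQ quantization can be applied through the transformers integration; the model ID and calibration dataset below are illustrative placeholders, not recommendations:

```python
# Minimal sketch: 4-bit weight-only GPTQ quantization via the
# transformers integration (auto-gptq backend). The model ID and
# calibration dataset are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # a small model, for illustration only

tokenizer = AutoTokenizer.from_pretrained(model_id)

# bits=4 quantizes weights to int4; "c4" serves as the calibration
# dataset the GPTQ algorithm needs to minimize quantization error.
quantization_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)

# The quantized weights can be saved and later reloaded directly.
model.save_pretrained("opt-125m-gptq")
tokenizer.save_pretrained("opt-125m-gptq")
```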

Library/Package: Description

auto-gptq: Provides GPTQ weight-only quantization; see the AutoGPTQ section above.
packaging: A library for parsing, comparing, and manipulating version numbers and version specifiers.
peft: A library for parameter-efficient fine-tuning of large language models, using techniques such as LoRA, adapter modules, and prefix tuning (see the LoRA sketch below).
transformers: The Hugging Face Transformers library, providing model implementations, tokenizers, and training utilities.
tokenizers: A fast and efficient tokenization library for NLP tasks.
bitsandbytes: A library for quantization and compression techniques in deep learning models, including 8-bit optimizers and 4-bit/8-bit model loading (see the sketch below).
accelerate: A library for simplifying distributed training and mixed-precision training in PyTorch (see the sketch below).
deepspeed: A deep learning optimization library for training large models efficiently, providing techniques such as ZeRO (the Zero Redundancy Optimizer).
pydantic: A library for data validation and settings management using Python type annotations (see the sketch below).
addict: A dictionary library that allows accessing nested dictionaries using dot notation.
fire: A Python library by Google that automatically generates command-line interfaces (CLIs) from Python code (see the sketch below).
PyYAML: A YAML parser and emitter for Python, used for serializing and deserializing YAML data.
datasets: A library providing a unified interface for loading and manipulating datasets in NLP and machine learning (see the sketch below).
flash-attn: A library implementing FlashAttention, a fast and memory-efficient attention mechanism for transformer models.
sentencepiece: A tokenization library for subword tokenization, including byte-pair encoding (BPE).
wandb: Weights & Biases, a tool for visualizing and tracking machine learning experiments, making it easier to share results and progress.
einops: Provides a more readable and flexible way to write tensor operations in deep learning models (see the sketch below).
xformers: A library of efficient transformer implementations and attention mechanisms.
optimum: A library for optimizing and quantizing transformer models for deployment.
hf_transfer: A Rust-based library that accelerates file downloads and uploads between the local machine and the Hugging Face Hub.
colorama: A Python module for producing colored terminal text and cursor positioning on multiple platforms.
numba: A high-performance Python compiler that accelerates numerical computation by generating native machine code.
numpy: A package for scientific computing with Python, widely used for multi-dimensional array and matrix processing.
bert-score==0.3.13: An evaluation metric for text generation that scores candidate text against references using BERT embeddings (pinned to version 0.3.13).
evaluate==0.4.0: Hugging Face's evaluation library, providing a unified interface to metrics for models and datasets (pinned to version 0.4.0).
rouge-score==0.1.2: A library for evaluating summarization and other text generation tasks by comparing output against reference texts (pinned to version 0.1.2).
scipy: A Python library for scientific and technical computing, offering modules for optimization, integration, and statistics.
scikit-learn: A popular machine learning library in Python, known for its easy-to-use API and its classification, regression, and clustering algorithms.
pynvml: A Python wrapper for the NVIDIA Management Library, used for monitoring and managing NVIDIA GPU devices.
art: A Python library for generating ASCII art and decorative text banners.
fschat: FastChat, a library for building chatbots and conversational AI systems; it also supplies conversation templates for chat-style prompt formatting.
gradio: An easy-to-use library for rapidly creating UIs for machine learning models, enabling quick prototyping and sharing of ML models.
tensorboard: A visualization toolkit for machine learning experimentation, providing metrics and visualizations for training runs.
s3fs: A Pythonic file interface to S3, allowing easy interaction with Amazon S3 buckets as if they were local files.
gcsfs: A Pythonic file-system interface to Google Cloud Storage, similar in functionality to s3fs but for Google's cloud storage solution.
mamba-ssm: The implementation of Mamba, a selective state space model (SSM) architecture for sequence modeling.
trl: Hugging Face's Transformer Reinforcement Learning (TRL) library, used for training language models with techniques such as supervised fine-tuning, PPO, and DPO.
zstandard: Python bindings for the Zstandard compression algorithm, providing fast and efficient compression.
fastcore: A library of core utilities for fast.ai, a deep learning library.
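Several of these libraries are worth a short illustration. First, a minimal sketch of attaching a LoRA adapter with peft; the base model ID, target modules, and hyperparameter values below are illustrative placeholders, not Axolotl's defaults:

```python
# Minimal sketch: wrapping a causal LM with a LoRA adapter via peft.
# Base model and hyperparameter values are illustrative only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the adapter weights
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
# Only the adapter parameters are trainable; the base model stays frozen.
model.print_trainable_parameters()
```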
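bitsandbytes is most often reached through its transformers integration; a minimal sketch of 4-bit (QLoRA-style) model loading, again with an illustrative model ID:

```python
# Minimal sketch: loading a model in 4-bit NF4 precision using the
# bitsandbytes integration in transformers. The model ID is illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    device_map="auto",
    quantization_config=bnb_config,
)
```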
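accelerate wraps an ordinary PyTorch training loop to handle device placement, distribution, and mixed precision; this sketch uses a toy model and dataset purely to show the pattern:

```python
# Minimal sketch: accelerate handles device placement, mixed precision,
# and distributed setup around an ordinary PyTorch training loop.
# The model and data below are toy placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
loader = DataLoader(dataset, batch_size=8)

# prepare() moves everything to the right device(s) and wraps the
# model/optimizer/dataloader for distributed or mixed-precision runs.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for inputs, labels in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```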
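A minimal sketch of pydantic-style settings validation; the field names here are hypothetical, not Axolotl's actual configuration schema:

```python
# Minimal sketch: pydantic validates settings declared with plain
# type annotations. The fields below are hypothetical examples.
from pydantic import BaseModel, ValidationError

class TrainingConfig(BaseModel):
    base_model: str
    learning_rate: float = 2e-4
    num_epochs: int = 1

# Compatible values are coerced: the string "3e-4" becomes a float.
config = TrainingConfig(base_model="microsoft/phi-2", learning_rate="3e-4")
print(config.learning_rate)

# Incompatible values raise a descriptive ValidationError.
try:
    TrainingConfig(base_model="microsoft/phi-2", num_epochs="three")
except ValidationError as err:
    print(err)
```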
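A minimal sketch of fire turning an ordinary function into a CLI; the script name is hypothetical:

```python
# Minimal sketch: fire generates a CLI from an ordinary function.
# Saved as greet.py (a hypothetical name), it can be invoked as:
#   python greet.py --name=Axolotl --repeat=2
import fire

def greet(name: str, repeat: int = 1) -> str:
    """Return a greeting, repeated `repeat` times."""
    return " ".join(f"Hello, {name}!" for _ in range(repeat))

if __name__ == "__main__":
    fire.Fire(greet)
```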
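A minimal sketch of datasets, loading the cleaned Alpaca dataset referred to elsewhere in this guide (the dataset name is illustrative; any Hub dataset works the same way):

```python
# Minimal sketch: loading a Hub dataset and applying a transformation.
from datasets import load_dataset

dataset = load_dataset("yahma/alpaca-cleaned", split="train")

print(dataset[0])            # a single example as a dict
print(dataset.column_names)  # e.g. instruction / input / output columns

# map() applies a function to every example and can add new columns.
dataset = dataset.map(lambda ex: {"n_chars": len(ex["output"])})
```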
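And a minimal sketch of einops, where a single rearrange call replaces a chain of view and permute operations:

```python
# Minimal sketch: splitting a hidden dimension into attention heads
# with einops instead of chained .view()/.permute() calls.
import torch
from einops import rearrange

x = torch.randn(2, 16, 512)  # (batch, sequence, hidden)

# Split the hidden dimension into 8 heads of size 64 and move the
# head axis forward: (batch, heads, sequence, head_dim).
y = rearrange(x, "b s (h d) -> b h s d", h=8)
print(y.shape)  # torch.Size([2, 8, 16, 64])
```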
