After fine-tuning Llama 3

The files you have obtained after fine-tuning the Llama 3 model are the essential components needed to run inference with your fine-tuned model.

Here's what each file represents and how you can use them to run your fine-tuned LLM:

adapter_config.json: This file contains the configuration settings for the adapter (LoRA) used during fine-tuning. It includes information such as the base model path, LoRA hyperparameters, target modules, and more.

adapter_model.bin: This file contains the trained LoRA weights. It represents the learned adaptations to the base model during fine-tuning.

checkpoint-*: These directories (e.g., checkpoint-112, checkpoint-28, etc.) hold model checkpoints saved at different stages of the training run. Each one contains the model's state at a specific step or epoch.

config.json: This file contains the configuration settings for the base model, such as the model architecture, hidden size, number of layers, and other hyperparameters.

README.md: This file provides information about the fine-tuned model, including the training details, evaluation results, and any additional notes.

special_tokens_map.json: This file maps special token roles (such as bos_token, eos_token, and pad_token) to the token strings the tokenizer uses for them.

tokenizer_config.json and tokenizer.json: These files contain the tokenizer's configuration and its vocabulary and merge rules, which together define how the model's tokenizer splits text into tokens.
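
Before loading anything, it can help to sanity-check the adapter configuration. A minimal sketch, assuming the adapter files live in a directory called path/to/adapter_dir (a placeholder, as elsewhere on this page):

import json

# Inspect the LoRA adapter configuration written by PEFT during fine-tuning
with open("path/to/adapter_dir/adapter_config.json") as f:
    adapter_config = json.load(f)

print(adapter_config["base_model_name_or_path"])  # base model the adapter was trained on
print(adapter_config["r"], adapter_config["lora_alpha"])  # LoRA rank and scaling factor
print(adapter_config["target_modules"])  # modules the adapter was applied to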

To run inference with your fine-tuned LLM using these files, you can follow these steps:

Load the base model (Llama 3) using the config.json file:

from transformers import LlamaForCausalLM, AutoConfig

# The config can be loaded from your fine-tuning output directory
# or directly from its config.json file
config = AutoConfig.from_pretrained("path/to/config.json")
model = LlamaForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", config=config)
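
If GPU memory is tight, the base model can usually be loaded in half precision and sharded across available devices instead. A sketch of this variant (torch_dtype and device_map are standard transformers options; device_map="auto" requires the accelerate library):

import torch

model = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,  # halve the memory footprint of the weights
    device_map="auto",           # let accelerate place layers on available devices
)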

Load the LoRA adapter using the adapter_config.json and adapter_model.bin files:

from peft import PeftModel

# Pass the directory that contains adapter_config.json and adapter_model.bin,
# not the path to the .bin file itself
model = PeftModel.from_pretrained(model, "path/to/adapter_dir")
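
Optionally, once the adapter is loaded you can merge the LoRA weights into the base model, so inference no longer goes through the PEFT wrapper. A short sketch using PEFT's merge_and_unload (the output path is a placeholder; see the Merging Model Weights section for details):

# Fold the LoRA weights into the base weights and drop the PEFT wrapper;
# the result behaves like a plain transformers model
merged_model = model.merge_and_unload()
merged_model.save_pretrained("path/to/merged_model")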

Load the tokenizer using the tokenizer_config.json and tokenizer.json files:

from transformers import AutoTokenizer

# Pass the directory that contains tokenizer_config.json and tokenizer.json
tokenizer = AutoTokenizer.from_pretrained("path/to/tokenizer_dir")

Use the loaded model and tokenizer to run inference on your input text:

import torch

input_text = "Your input text here"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(input_ids, max_length=100)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
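
If your fine-tuning data used a chat or instruction format, and a chat template is defined in tokenizer_config.json, the tokenizer can build the prompt for you. A sketch, assuming such a template is present:

messages = [{"role": "user", "content": "Your input text here"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))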

Make sure to replace "path/to/..." with the actual paths to your files and directories.

By following these steps, you can load your fine-tuned LLM and run inference on new input text using the trained LoRA adapter and tokenizer.

Remember to have the necessary dependencies installed, such as the transformers and peft libraries, and ensure that you have the required hardware (GPU) and sufficient memory to run the model.

You can refer to the README.md file for any additional instructions or notes specific to your fine-tuned model.
