After fine-tuning Llama 3
The files you have obtained after fine-tuning the Llama 3 model are the essential components needed to run inference with your fine-tuned model.
Here's what each file represents and how you can use them to run your fine-tuned LLM:
adapter_config.json: This file contains the configuration settings for the LoRA adapter used during fine-tuning. It includes information such as the base model path, LoRA hyperparameters, target modules, and more.
adapter_model.bin: This file contains the trained LoRA weights, i.e. the low-rank adaptations to the base model learned during fine-tuning.
checkpoint-*: These directories (e.g., checkpoint-112, checkpoint-28, etc.) are model checkpoints saved at different stages of the training process. Each contains the model's state at a specific step or epoch.
config.json: This file contains the configuration settings for the base model, such as the model architecture, hidden size, number of layers, and other hyperparameters.
README.md: This file provides information about the fine-tuned model, including the training details, evaluation results, and any additional notes.
special_tokens_map.json: This file maps special token names (e.g., bos_token, eos_token) to their corresponding token strings used by the tokenizer.
tokenizer_config.json and tokenizer.json: These files contain the tokenizer's configuration and its vocabulary and merge rules (the tokenizer itself has no trained neural-network weights).
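For orientation, an adapter_config.json typically looks something like the snippet below. The specific values here (model path, rank, alpha, target modules) are illustrative assumptions, not the contents of your actual file:

```json
{
  "base_model_name_or_path": "meta-llama/Meta-Llama-3-8B",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"]
}
```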
To run inference with your fine-tuned LLM using these files, you can follow these steps:
1. Load the base model (Llama 3) using the config.json file.
2. Load the LoRA adapter using the adapter_config.json and adapter_model.bin files.
3. Load the tokenizer using the tokenizer_config.json and tokenizer.json files.
4. Use the loaded model and tokenizer to run inference on your input text.
Make sure to replace "path/to/..." with the actual paths to your files.
By following these steps, you can load your fine-tuned LLM and run inference on new input text using the trained LoRA adapter and tokenizer.
Remember to have the necessary dependencies installed, such as the transformers and peft libraries, and ensure that you have the required hardware (a GPU) and sufficient memory to run the model.
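If they are not installed yet, the dependencies can be pulled in with pip (accelerate is an assumption here, but it is commonly needed for device placement when loading large models):

```shell
pip install transformers peft accelerate
```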
You can refer to the README.md file for any additional instructions or notes specific to your fine-tuned model.