# After fine-tuning Llama 3

The files you obtain after fine-tuning the Llama 3 model are the essential components needed to run inference with your fine-tuned model.

Here's what each file represents and how you can use them to run your fine-tuned LLM:

<mark style="color:yellow;">**`adapter_config.json`**</mark><mark style="color:yellow;">**:**</mark> This file contains the configuration settings for the adapter (LoRA) used during fine-tuning. It includes information such as the base model path, LoRA hyperparameters, target modules, and more.
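Because it is plain JSON, you can sanity-check these settings with the standard library before loading anything heavy. The field names below follow PEFT's `LoraConfig` schema; the values are illustrative examples, not taken from any particular run:

```python
import json
import os
import tempfile

# Illustrative adapter_config.json contents (example values only; field names
# follow PEFT's LoraConfig schema)
example = {
    "base_model_name_or_path": "meta-llama/Meta-Llama-3-8B",
    "peft_type": "LORA",
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "adapter_config.json")
    with open(path, "w") as f:
        json.dump(example, f, indent=2)

    # Reading the file back tells you which modules were adapted and at what rank
    with open(path) as f:
        cfg = json.load(f)

print(cfg["peft_type"], cfg["r"], cfg["target_modules"])
```

Checking `r` and `target_modules` here is a quick way to confirm the adapter matches the training run you expect before loading it.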

<mark style="color:yellow;">**`adapter_model.bin`**</mark><mark style="color:yellow;">**:**</mark> This file contains the trained LoRA weights. It represents the learned adaptations to the base model during fine-tuning.

<mark style="color:yellow;">**`checkpoint-*`**</mark><mark style="color:yellow;">**:**</mark> These directories (e.g., `checkpoint-28`, `checkpoint-112`) are snapshots saved at specific steps during training. Each contains the adapter weights and training state (optimizer, scheduler) at that point, so you can resume an interrupted run or roll back to an earlier step.
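Note that the numeric suffix is the global training step, so the most recent checkpoint is found by comparing those numbers, not by sorting the names as strings (`checkpoint-28` sorts after `checkpoint-112` lexicographically). A small sketch with hypothetical directory names:

```python
import re

# Hypothetical checkpoint directory names produced during training
# (the numeric suffix is the global step at which each was saved)
checkpoints = ["checkpoint-28", "checkpoint-56", "checkpoint-84", "checkpoint-112"]

def latest_checkpoint(dirs):
    """Return the directory with the highest step number, or None."""
    numbered = []
    for d in dirs:
        m = re.search(r"checkpoint-(\d+)$", d)
        if m:
            numbered.append((int(m.group(1)), d))
    return max(numbered)[1] if numbered else None

print(latest_checkpoint(checkpoints))  # checkpoint-112
```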

<mark style="color:yellow;">**`config.json`**</mark><mark style="color:yellow;">**:**</mark> This file contains the configuration settings for the base model, such as the model architecture, hidden size, number of layers, and other hyperparameters.

<mark style="color:yellow;">**`README.md`**</mark><mark style="color:yellow;">**:**</mark> This file provides information about the fine-tuned model, including the training details, evaluation results, and any additional notes.

<mark style="color:yellow;">**`special_tokens_map.json`**</mark><mark style="color:yellow;">**:**</mark> This file maps special token names (such as `bos_token`, `eos_token`, and `pad_token`) to the token strings the tokenizer uses for them.
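As an illustration, a Llama 3 style map might look roughly like the fragment below; the exact token strings depend on your tokenizer, so check the file itself:

```json
{
  "bos_token": "<|begin_of_text|>",
  "eos_token": "<|end_of_text|>"
}
```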

<mark style="color:yellow;">**`tokenizer_config.json`**</mark> and <mark style="color:yellow;">**`tokenizer.json`**</mark>: These files contain the tokenizer's configuration and its serialized vocabulary and tokenization rules, respectively.

To run inference with your fine-tuned LLM using these files, you can follow these steps:

#### <mark style="color:green;">Load the base model (Llama 3) using the</mark> <mark style="color:green;"></mark><mark style="color:green;">`config.json`</mark> <mark style="color:green;"></mark><mark style="color:green;">file</mark>

```python
from transformers import AutoConfig, LlamaForCausalLM

# from_pretrained accepts either a directory or a direct path to config.json
config = AutoConfig.from_pretrained("path/to/config.json")
model = LlamaForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", config=config)
```

#### <mark style="color:green;">Load the LoRA adapter using the</mark> <mark style="color:green;"></mark><mark style="color:green;">`adapter_config.json`</mark> <mark style="color:green;"></mark><mark style="color:green;">and</mark> <mark style="color:green;"></mark><mark style="color:green;">`adapter_model.bin`</mark> <mark style="color:green;"></mark><mark style="color:green;">files:</mark>

```python
from peft import PeftModel

# Pass the directory that holds adapter_config.json and adapter_model.bin,
# not the .bin file itself
model = PeftModel.from_pretrained(model, "path/to/adapter_directory")
```

#### <mark style="color:green;">Load the tokenizer using the</mark> <mark style="color:green;"></mark><mark style="color:green;">`tokenizer_config.json`</mark> <mark style="color:green;"></mark><mark style="color:green;">and</mark> <mark style="color:green;"></mark><mark style="color:green;">`tokenizer.json`</mark> <mark style="color:green;"></mark><mark style="color:green;">files:</mark>

```python
from transformers import AutoTokenizer

# Pass the directory containing tokenizer_config.json and tokenizer.json,
# not an individual file
tokenizer = AutoTokenizer.from_pretrained("path/to/tokenizer_directory")
```

#### <mark style="color:green;">Use the loaded model and tokenizer to run inference on your input text:</mark>

```python
import torch

input_text = "Your input text here"
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=100)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

Make sure to replace `"path/to/..."` with the actual paths to your files.

By following these steps, you can load your fine-tuned LLM and run inference on new input text using the trained LoRA adapter and tokenizer.
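One detail worth checking before inference: if you fine-tuned on chat-formatted data, the model expects prompts in the same template it was trained on. For Llama 3 Instruct style data, that template looks roughly like the sketch below. The `format_prompt` helper is hypothetical; the authoritative template is stored with your tokenizer, and `tokenizer.apply_chat_template` will apply it for you:

```python
# Sketch of a Llama 3 Instruct style prompt template (illustrative only;
# verify against the chat template in your tokenizer_config.json)
def format_prompt(user_message: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_prompt("Summarize LoRA in one sentence.")
print(prompt)
```

Feeding the model plain text when it was trained on templated conversations is a common cause of incoherent generations.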

Remember to have the necessary dependencies installed, such as the `transformers`, `peft`, and `torch` libraries, and ensure that you have the required hardware (GPU) and sufficient memory to run the model.

You can refer to the `README.md` file for any additional instructions or notes specific to your fine-tuned model.
