Steps to Build and Fine-Tune a Foundational LLM with Mistral on TIR

May 5, 2024

Building a Chatbot

E2E Networks' TIR AI platform integrates with Jupyter Notebook, the data scientist's favorite tool.

To set up E2E Cloud TIR AI Jupyter Notebook, follow this link: https://www.e2enetworks.com/blog/how-to-use-jupyter-notebooks-on-e2e-networks.

Let’s Play

E2E Cloud's Jupyter Notebook makes installation a breeze. Run these notebook commands to remove any earlier clone, download LLaMA Factory, and install it:

%rm -rf LLaMA-Factory

!git clone https://github.com/hiyouga/LLaMA-Factory.git

%cd LLaMA-Factory

%ls

!pip install .

Next, import the necessary packages by executing:


from llmtuner import run_exp, ChatModel, export_model


import torch
torch.cuda.is_available()

E2E Cloud takes care of CUDA setup for you.
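If you want a little more detail than a bare True or False, a minimal sketch like the one below reports which GPU the notebook can see. The helper name describe_gpu is hypothetical; it relies only on standard torch.cuda calls and falls back to a message if PyTorch is not installed.

```python
def describe_gpu() -> str:
    """Return a one-line summary of the GPU visible to PyTorch (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "No CUDA device visible"
    count = torch.cuda.device_count()
    name = torch.cuda.get_device_name(0)
    return f"{count} CUDA device(s); device 0: {name}"


print(describe_gpu())
```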

Fine-Tuning the Mistral Model for Question-Answering

We'll use LLaMA Factory to fine-tune the Mistral model specifically for the task of question-answering.

Dataset Playground

For this task, we'll use three datasets from Hugging Face to fine-tune our model: "identity", "alpaca_gpt4_en", and "alpaca_gpt4_zh". The two alpaca_gpt4 sets hold instruction-following examples in English and Chinese, which gives a solid base for question-answering tasks.
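To make the data shape concrete, here is a minimal sketch of Alpaca-style records, which use the fields instruction, input, and output. The records are invented for illustration, and both helpers are hypothetical: joining instruction and input with a newline, and capping a dataset by taking its first N examples, are assumptions about how such data is commonly handled rather than a description of LLaMA Factory's internals.

```python
# Invented records in the Alpaca layout (instruction / input / output).
records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a prime number.", "input": "", "output": "7"},
    {"instruction": "Summarize the text.", "input": "TIR hosts notebooks.", "output": "TIR runs notebooks."},
]


def build_query(record: dict) -> str:
    """Join the instruction and the optional input into one user query (assumed convention)."""
    if record["input"]:
        return record["instruction"] + "\n" + record["input"]
    return record["instruction"]


def cap_samples(data: list, max_samples: int) -> list:
    """Keep at most max_samples examples, mimicking a per-dataset cap."""
    return data[:max_samples]


for rec in cap_samples(records, max_samples=2):
    print(repr(build_query(rec)), "->", repr(rec["output"]))
```

A cap such as max_samples keeps a trial run short; raising it trades training time for more data.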


# Here we train the model; the LoRA adapter checkpoints are saved to a directory called "E2E_Mistral-7B-ChatBot"
run_exp(dict(
  stage="sft",
  do_train=True,
  model_name_or_path="mistralai/Mistral-7B-Instruct-v0.2",
  dataset="identity,alpaca_gpt4_en,alpaca_gpt4_zh",
  template="mistral",
  finetuning_type="lora",
  lora_target="all",
  output_dir="E2E_Mistral-7B-ChatBot",
  per_device_train_batch_size=4,
  gradient_accumulation_steps=4,
  lr_scheduler_type="cosine",
  logging_steps=10,
  save_steps=100,
  learning_rate=1e-4,
  num_train_epochs=5.0,
  max_samples=500,
  max_grad_norm=1.0,
  fp16=True,
))

Now that we've configured the core elements, let's delve into the world of hyperparameters. These are the control knobs that fine-tune the learning process of your chatbot. Feel free to experiment with them to personalize your chatbot's behavior and optimize its performance.

Key Hyperparameters

model_name_or_path: This hyperparameter sets the base model to fine-tune. In our case, it's "mistralai/Mistral-7B-Instruct-v0.2", but you can explore other options within the Mistral family.
template: This hyperparameter selects the prompt template used to format each conversation before it is passed to the model. It should match the base model; here it is "mistral".
dataset: Here, you specify the datasets you want to use for training, as a comma-separated list of names.
finetuning_type: This hyperparameter selects the fine-tuning method. It is set to "lora" here; LoRA trains small low-rank adapter matrices instead of updating every weight. LLaMA Factory also supports QLoRA, which applies the same idea on top of a quantized base model.
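To illustrate what a prompt template does, the sketch below wraps a list of chat messages in the [INST] ... [/INST] markers that Mistral instruct models expect. The function render_mistral_prompt is a hypothetical approximation written for this article; the real template is applied by LLaMA Factory and the tokenizer, and the exact whitespace around the markers may differ.

```python
def render_mistral_prompt(messages: list) -> str:
    """Approximate the Mistral instruct layout for a list of chat messages."""
    prompt = "<s>"
    for message in messages:
        if message["role"] == "user":
            prompt += "[INST] " + message["content"] + " [/INST]"
        elif message["role"] == "assistant":
            prompt += " " + message["content"] + "</s>"
        else:
            raise ValueError("unsupported role: " + message["role"])
    return prompt


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
    {"role": "user", "content": "What is TIR?"},
]
print(render_mistral_prompt(history))
```

Using a template that does not match the base model tends to hurt answer quality, because the model sees markers it was not trained on.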

Beyond the Basics

We've highlighted some essential hyperparameters, but LLaMA Factory offers a wider range for you to explore. These include:

num_train_epochs: This hyperparameter controls the number of times the model iterates through the training data. Too few passes can leave the model undertrained, while too many can overfit a small dataset.
save_steps: This hyperparameter determines how often the model's progress is saved during training. It allows you to track the training process and potentially revert to earlier stages if needed.
learning_rate: This hyperparameter governs the step size the model takes when updating its internal parameters during training. Setting an appropriate learning rate is crucial for achieving optimal performance.
output_dir: This hyperparameter specifies the directory where the fine-tuned model and training logs will be saved. Keeping track of these outputs is essential for analyzing the training process and evaluating the final model.
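The learning_rate value works together with lr_scheduler_type="cosine" from the training call: the rate starts at 1e-4 and decays along half a cosine wave. The sketch below shows that shape under simplifying assumptions (no warmup, decay all the way to zero); the function cosine_lr and the total of 400 steps are made up for illustration.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 1e-4) -> float:
    """Learning rate at a given optimizer step under plain cosine decay (no warmup)."""
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 400  # made-up value for illustration
for step in (0, 100, 200, 300, 400):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

The rate falls slowly at first, fastest around the midpoint, and flattens out near the end of training.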

By understanding and adjusting these hyperparameters, you can transform your chatbot from a basic model into a powerful and customized AI companion.

The training process can take some time to complete. The exact duration depends on several factors:

GPU Power: The processing muscle of your GPU significantly impacts training speed.
Model Architecture: The complexity of the chosen model architecture also plays a role. Simpler models generally train faster than their more intricate counterparts.
Hyperparameter Tuning: The hyperparameters you've configured can influence training time. Optimizing these settings can sometimes lead to faster training without sacrificing accuracy.
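Training time also scales with the number of optimizer steps, which you can estimate from the values in the training call. The sketch below is a back-of-the-envelope calculation under stated assumptions: a single GPU, and max_samples=500 capping each of the three datasets with every dataset holding at least 500 examples. If a dataset is smaller, the real count is lower, and the trainer's own rounding may differ by a few steps.

```python
import math

per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_train_epochs = 5
save_steps = 100
num_gpus = 1            # assumption: single-GPU notebook
num_examples = 3 * 500  # assumption: upper bound, 500 examples from each dataset

# Examples consumed per optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
checkpoints = total_steps // save_steps

print("effective batch size:", effective_batch)
print("approx. optimizer steps:", total_steps)
print("approx. intermediate checkpoints:", checkpoints)
```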

Once the training is complete, it's time to preserve your creation. LLaMA Factory stores checkpoints of the trained model in the folder named by the output_dir hyperparameter.

The following code snippet allows you to effortlessly export your fine-tuned model to a directory named "E2E-ChatBot" within your E2E Cloud workspace. This way, you have a readily accessible copy of your chatbot, ready for further use and exploration.


export_model(dict(
  model_name_or_path="mistralai/Mistral-7B-Instruct-v0.2",
  adapter_name_or_path="E2E_Mistral-7B-ChatBot",
  finetuning_type="lora",
  template="mistral",
  export_dir="E2E-ChatBot",
))
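Conceptually, exporting a LoRA fine-tune folds the trained adapter back into the base weights: each adapted weight becomes W + (alpha / r) * B * A, where B and A are the low-rank factors and r is the rank. The sketch below works through that arithmetic in plain Python with made-up 2 x 2 numbers and rank 1. It only illustrates the idea and is not how export_model is implemented.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2 x 2), made up
lora_b = [[0.5], [1.0]]            # trained factor B (2 x rank), made up
lora_a = [[2.0, 4.0]]              # trained factor A (rank x 2), made up

merged = merge_lora(weight, lora_b, lora_a, alpha=2, rank=1)
print(merged)
```

After merging, the model can be loaded like any ordinary checkpoint, without the adapter files.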

You've successfully trained and saved your very own chatbot. Now, it's time to unleash its potential.


# Load the base model together with the LoRA adapter trained above
E2E_Chat_Bot = ChatModel(dict(
  model_name_or_path="mistralai/Mistral-7B-Instruct-v0.2",
  adapter_name_or_path="E2E_Mistral-7B-ChatBot",
  finetuning_type="lora",
  template="mistral",
))


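messages = []  # running conversation history, sent to the model on every turn
while True:
  query = input("\nUser: ")
  if query.strip() == "exit":  # end the session
    break
  if query.strip() == "clear":  # forget the conversation so far
    messages = []
    continue
  messages.append({"role": "user", "content": query})
  print("E2E_Chat_Bot: ", end="", flush=True)
  response = ""
  # Collect the streamed pieces into the full reply
  for new_text in E2E_Chat_Bot.stream_chat(messages):
    response += new_text
  # Print one sentence per line (splitting on "." drops the periods)
  for r in response.split('.'):
    print(r.strip(), flush=True)
  print()
  messages.append({"role": "assistant", "content": response})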

The provided code snippet acts as a bridge between you and your creation. Simply type your questions or prompts, and the chatbot will analyze them, using its fine-tuned knowledge to craft informative and engaging responses.
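The command handling in the loop can also be pulled into a small function, which makes it easy to try out without loading a model. This is a minimal sketch: handle_turn and echo_model are hypothetical names, and echo_model is a stub standing in for E2E_Chat_Bot.stream_chat.

```python
def echo_model(messages):
    """Stub generator standing in for a streaming chat model."""
    last = messages[-1]["content"]
    for piece in ("You said: ", last):
        yield piece


def handle_turn(messages, query, stream_fn):
    """Process one user input and return (messages, reply, keep_going)."""
    command = query.strip()
    if command == "exit":
        return messages, None, False
    if command == "clear":
        return [], None, True  # drop the history, keep the session alive
    messages = messages + [{"role": "user", "content": query}]
    reply = "".join(stream_fn(messages))
    messages = messages + [{"role": "assistant", "content": reply}]
    return messages, reply, True


history = []
for text in ("Hi", "clear", "Do you know TIR?", "exit"):
    history, reply, keep_going = handle_turn(history, text, echo_model)
    print(repr(text), "->", reply, "| messages stored:", len(history))
    if not keep_going:
        break
```

Because the model only sees what is in the message list, clearing the list is all it takes to start a new conversation.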

Here are some handy commands to keep in mind:

Chat: Ask questions, give it instructions, or simply have a conversation. Earlier turns stay in the message history, so the chatbot can refer back to them within the same session.
Start fresh (clear): If you feel the conversation has gone off track, type clear. This empties the message history, giving you a clean slate for a new interaction.
Exit the conversation (exit): When you're done exploring your chatbot's capabilities, type exit to conclude the session.

Here is an example conversation with the chatbot we have built:

User: Hi

E2E_Chat_Bot: Hello! How can I help you today?

User: Do you know about E2E cloud network?

E2E_Chat_Bot: Yes, I am familiar with E2E Cloud Network

E2E Cloud Network is a technology that allows companies to host their applications and services in the cloud, allowing them to easily access and manage their resources and applications from anywhere in the world

It is a powerful solution that enables organizations to leverage the capabilities of the cloud to improve their operations and growth

User: can it also be used to train and deploy AI models?

E2E_Chat_Bot: Yes, E2E Cloud Network can be used to train and deploy AI models

In fact, it is a popular choice for organizations that want to deploy AI models in the cloud, as it enables them to easily scale and deploy their models to meet the needs of their users

E2E Cloud Network allows for easy deployment of AI models, with features such as containerization, automatic scaling, and automatic deployment, which can help organizations to reduce the time and effort required to deploy and manage their AI models

User: exit

GitHub

Congratulations! You've successfully built your very own chatbot.

The code used in this article is available on GitHub at: https://github.com/Lord-Axy/Arrticle-Chat-Bot/blob/master/code.ipynb

Akshayraj Madhubalan
