RAG Vs Fine-Tuning: How to Optimize LLM Performance

January 16, 2024

Large language models (LLMs) have revolutionized natural language processing by showcasing impressive capabilities in understanding and generating human-like text. However, information changes constantly, which makes it challenging to adapt these models to new knowledge effectively. Two prominent approaches, Retrieval Augmented Generation (RAG) and fine-tuning, have emerged to address this challenge.

Let’s delve into the differences between these strategies, exploring their strengths, weaknesses, and implications for enhancing the knowledge base of LLMs.

Challenges Faced by LLMs

Before delving into the solutions, it's crucial to understand the challenges LLMs face. Factual errors often arise due to various factors such as domain knowledge deficits, outdated information, disremembering, forgetting, and reasoning failures. The pre-training phase, where models learn from vast datasets, plays a pivotal role in shaping their knowledge foundation. Despite the extensive training, these models may falter when faced with new or specific information that is not present in their training data.

Factual Errors: The Pitfalls of Imperfect Information

Factual errors in language models are inaccuracies or discrepancies between the information generated by the model and the facts. These errors can arise from various sources and contribute to the model's inability to provide reliable and accurate information. Understanding the nature of these errors is essential for mitigating their impact on model performance.

  1. Domain Knowledge Deficits: Language models are trained on vast datasets that cover a broad range of topics. However, they may lack in-depth knowledge in specific domains, which leads to inaccuracies when generating content in those areas. For instance, a model trained on general knowledge may struggle with detailed medical or scientific information.
  2. Outdated Information: Models may not be aware of recent developments or changes, especially if their training data has a cutoff date. This can result in the dissemination of outdated information, which affects the model's reliability in dynamic fields.
  3. Disremembering: Despite extensive training, models may fail to memorize specific facts or details. As a result, the model may generate incorrect information or fail to recall essential details.
  4. Forgetting: Language models may lose information they learned from their training data, for example when additional training overwrites earlier knowledge. This can lead to factual errors, especially when dealing with infrequently encountered facts.
  5. Reasoning Failures: Language models may struggle with complex reasoning tasks that require a deep understanding of context and relationships between different pieces of information. This can result in errors when answering questions or providing explanations.

Knowledge Deficits: Addressing the Gaps in Information

Knowledge deficits go hand-in-hand with factual errors and refer to the areas where language models lack the necessary information. While models may exhibit general knowledge prowess, addressing these deficits is crucial for ensuring their adaptability to new and specialized knowledge domains.

  1. Limited Training Data Coverage: Language models may not have encountered sufficient examples or instances related to specific topics during their training. This limited coverage can lead to knowledge deficits, particularly in niche or emerging fields.
  2. Inadequate Pre-training: The quality of pre-training data significantly influences the knowledge base of language models. Inadequate pre-training on diverse and representative datasets may result in models with inherent knowledge gaps.
  3. Lack of Exposure to Varied Contexts: Models trained on monolithic datasets may lack exposure to diverse contexts and perspectives. This narrow exposure can contribute to knowledge deficits, especially when dealing with multifaceted or culturally nuanced information.
  4. Implicit Bias: Language models may inadvertently carry biases present in their training data. This bias can lead to skewed or incomplete representations of certain topics, which contributes to knowledge deficits and inaccuracies.

Knowledge Injection: Fine-Tuning vs. Retrieval Augmented Generation


Fine-Tuning

Fine-tuning is a process that involves further training a pre-trained language model on a more specific dataset or task to improve its performance within that particular domain. This approach is particularly useful when models need to be adapted to new or specialized knowledge areas. Fine-tuning can be classified into several types, each with its unique characteristics:

  1. Supervised Fine-Tuning: In supervised fine-tuning, the model is trained on labeled input-output pairs. This method often involves presenting the model with task descriptions in natural language and corresponding examples of the desired behavior. While effective for improving overall model quality, supervised fine-tuning may not necessarily impart new knowledge to the model.
  2. Reinforcement Learning Fine-Tuning: Another form of fine-tuning leverages reinforcement learning or RL-inspired optimization strategies. Techniques such as reinforcement learning from human feedback, direct preference optimization, and proximal policy optimization are examples of RL-based fine-tuning. These methods focus on improving the overall quality and expected behavior of the model but may not specifically address knowledge breadth.
  3. Unsupervised Fine-Tuning: Unsupervised fine-tuning, also known as continual pre-training or unstructured fine-tuning, involves continuing the pre-training phase of the language model in a causal auto-regressive manner. This method capitalizes on the vast knowledge stored during the initial pre-training. Unsupervised fine-tuning aims to inject new knowledge into the model and is often preferred for its efficacy in learning new information.

Fine-tuning, while a valuable tool, comes with challenges such as instability and potential impacts on the model's broader capabilities. Careful consideration of the fine-tuning strategy and its alignment with specific goals is crucial to achieving optimal results.

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) represents a paradigm shift in enhancing language models by incorporating external knowledge sources. This technique is particularly beneficial for knowledge-intensive tasks and involves the following key steps:

  1. Knowledge Base Creation: RAG begins with the creation of an auxiliary knowledge base, built from relevant information gathered from the data provided. This knowledge base serves as a repository of information that can augment the model's understanding.
  2. Embedding and Retrieval: Given an input query, RAG employs an embedding model to represent both the query and the documents in the knowledge base. Retrieval involves finding the documents within the knowledge base that are most similar to the input query, typically by comparing their vector representations.
  3. Context Enrichment: Retrieved documents are added to the input query, which enriches the model's context about the subject. This augmented context enhances the model's ability to generate more informed and contextually relevant responses.
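The three steps above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the knowledge base is an in-memory list, and all names here are hypothetical rather than part of any library.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, knowledge_base, k=2):
    """Step 2: rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    """Step 3: prepend the retrieved documents to the query as context."""
    context = "\n".join(f"- {d}" for d in documents)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Step 1: a tiny, hypothetical knowledge base
knowledge_base = [
    "LoRA adds small trainable low-rank matrices to a frozen model.",
    "Pinecone is a managed vector database for similarity search.",
    "Flan-T5 is an instruction-tuned encoder-decoder language model.",
]

query = "What kind of model is Flan-T5?"
top_docs = retrieve(query, knowledge_base, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
```

In a real system, the enriched prompt would then be passed to the language model, and the embedding and search steps would be handled by a dedicated embedding model and a vector store.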

RAG has proven effective in addressing both factual errors and knowledge deficits. It allows language models to go beyond their pre-training knowledge and leverage external information for improved performance in various tasks.

Choosing between Fine-Tuning and RAG: Considerations and Trade-Offs

The decision to employ fine-tuning or RAG depends on the specific goals of a task and the nature of the knowledge required. Here are some considerations and trade-offs:

  1. Fine-tuning Considerations: Fine-tuning is suitable for tasks where specific, task-oriented improvements are needed. It is effective for refining a model's performance in a particular domain. However, fine-tuning may exhibit instability and might not be the optimal choice for addressing broad knowledge deficits.
  2. RAG Considerations: RAG excels in knowledge-intensive tasks where external information, supplied by feeding data into the knowledge base, is valuable. It can address both knowledge deficits and factual errors by incorporating diverse knowledge from external sources. RAG's effectiveness relies on the quality and coverage of the knowledge base.
  3. Trade-offs: Fine-tuning may provide more control over specific task-related improvements, but it might struggle with broader knowledge adaptation. RAG, while powerful in leveraging external knowledge, depends on the availability and reliability of the knowledge base.

Utilizing E2E Cloud GPU for Implementing Knowledge Injection

Knowledge injection is a process that enhances machine learning models by integrating external knowledge, which is advantageous when dealing with limited or unrepresentative training data. Using cloud GPUs for this offers several benefits. Firstly, the scalability of cloud GPUs allows for a seamless transition from a single GPU for development to multiple GPUs for training larger models or handling extensive datasets. Additionally, cloud GPUs offer a more economical alternative to maintaining dedicated hardware, especially for irregular usage patterns, because charges are incurred only for the resources utilized.

Let’s walk through an example of fine-tuning an LLM vs. the RAG approach and compare their performance using evaluation metrics. For this, I used an E2E Cloud A100 80 GB GPU with CUDA 11. To learn more about E2E Cloud GPUs, visit the website.

To get started, add your SSH keys by going into Settings.

After creating SSH keys, create a node by going into ‘Compute’.

Now, open Visual Studio Code and install the ‘Remote Explorer’ and ‘Remote SSH’ extensions. Open a new terminal and log in to the node from your local system.

ssh root@

You’ll now be logged in to the remote node over SSH from your local system.

Fine-Tuning LLM Example: Code Implementation

Let’s work through the scenario in code, using the same dataset for both approaches.

First, install the dependencies that we are going to use in this implementation.

%pip install -q peft==0.4.0
%pip install -q transformers
%pip install -q datasets
%pip install -q huggingface_hub
%pip install -q evaluate
%pip install -q seqeval
%pip install -q langchain
%pip install -q cohere
%pip install -q pinecone-client
%pip install -q ragas

Import the classes and functions that we are going to need in this implementation.

import getpass
import os
import time

import numpy as np
import pinecone
import torch
from datasets import load_dataset
from huggingface_hub import notebook_login
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Pinecone
from langchain_community.document_loaders import HuggingFaceDatasetLoader
from langchain_community.embeddings import CohereEmbeddings
from langchain_community.llms import HuggingFacePipeline
from peft import LoraConfig, PeftModel, TaskType, get_peft_model
from ragas import evaluate
from seqeval.metrics import classification_report, f1_score
from sklearn.metrics import average_precision_score, recall_score
from tqdm import tqdm
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
    pipeline,
)

In this implementation, we will use Flan T5 XXL, which you can find here. We’ll load its tokenizer and use the model as our foundation model.

# Specify the pre-trained model name you want to use
model_name = "google/flan-t5-xxl"

# Load the tokenizer associated with the pre-trained model
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the pre-trained sequence-to-sequence language model using the specified model name
foundation_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

Load the dataset, which can be found here. We’ll tokenize one of its columns, named system_prompt, and then select a small subset of samples to work with.

# Load the "OpenOrca" dataset using the load_dataset function from the datasets library
data = load_dataset("Open-Orca/OpenOrca")

# Tokenize the dataset using the specified tokenizer
data = data.map(lambda samples: tokenizer(samples["system_prompt"]), batched=True)

# Select a subset of the training samples (first 50 samples in this case)
train_sample = data["train"].select(range(50))

# Display the selected subset of training samples
print(train_sample)

Let’s set the LoRA configuration for fine-tuning the foundation model.

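# r=16 is consistent with the trainable-parameter count reported below; task_type is an assumption for a seq2seq model
lora_config = LoraConfig(r=16, target_modules=["q", "v"], task_type=TaskType.SEQ_2_SEQ_LM)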

Using LoRA configuration, we will set the final PEFT model from the foundation model, and see the trainable parameters, all parameters, and the trainable percentage.

peft_model = get_peft_model(foundation_model, lora_config)
peft_model.print_trainable_parameters()

The following will be the result:

trainable params: 18,874,368 || all params: 11,154,206,720 || trainable%: 0.16921300163961817
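The trainable-parameter count above can be reproduced with a little arithmetic. The sketch below assumes the publicly documented Flan-T5 XXL dimensions (hidden size 4096, 24 encoder and 24 decoder layers) and a LoRA rank of 16; the rank is an assumption, chosen because it reproduces the reported number.

```python
# Assumed Flan-T5 XXL dimensions (from its public model configuration)
d_model = 4096        # hidden size; the q and v projections are 4096 x 4096
encoder_layers = 24
decoder_layers = 24
rank = 16             # LoRA rank r (assumption; it reproduces the reported count)

# Each encoder layer has one attention block (self-attention);
# each decoder layer has two (self-attention and cross-attention).
attention_blocks = encoder_layers + 2 * decoder_layers
adapted_modules = attention_blocks * 2  # "q" and "v" in every block

# LoRA adds two small matrices per adapted module: A (r x d_in) and B (d_out x r)
params_per_module = rank * (d_model + d_model)
lora_params = adapted_modules * params_per_module

all_params = 11_154_206_720  # total reported above (base weights plus adapters)
trainable_pct = 100 * lora_params / all_params

print(f"{lora_params:,}")       # 18,874,368
print(f"{trainable_pct:.4f}%")  # 0.1692%
```

Only the adapter matrices are updated during training, which is why well under 1% of the parameters are trainable.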

Now, we will make a directory to store the PEFT model outputs.

%mkdir /root/working_dir

We’ll set the training arguments for our new fine-tuned model.

# Define the output directory for storing PEFT model outputs
output_directory = os.path.join("/root/working_dir", "lora_outputs")

# Create the working and output directories if they don't exist
os.makedirs(output_directory, exist_ok=True)

# Define training arguments for the PEFT model
training_args = TrainingArguments(
    output_dir=output_directory,  # Where the model predictions and checkpoints will be written
    no_cuda=False,  # Train on the GPU; set to True only on CPU-only machines
    auto_find_batch_size=True,  # Find a suitable batch size that will fit into memory automatically
    learning_rate=1e-3,  # Higher learning rate than full fine-tuning
    num_train_epochs=5,  # Number of passes to go through the entire fine-tuning dataset
)

Then, we’ll train the model using the sample data and the Data Collator.

# Enable gradient checkpointing in the Peft model's configuration
peft_model.config.gradient_checkpointing = True

# Create a Trainer instance for training the Peft model
trainer = Trainer(
    model=peft_model,  # We pass in the PEFT version of the foundation model
    args=training_args,  # Training arguments specifying output directory, GPU usage, batch size, etc.
    train_dataset=train_sample,  # Training dataset
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # mlm=False indicates not to use masked language modeling
)

# Start the training process
trainer.train()

We’ll save the fine-tuned model using the output directory and the timestamp.

# Record the current time for creating a unique Peft model path
time_now = time.time()

# Create a path for saving the Peft model using the output directory and timestamp
peft_model_path = os.path.join(output_directory, f"peft_model_{time_now}")

# Save the trained PEFT model to the specified path
trainer.model.save_pretrained(peft_model_path)

Now, we'll load the trained LoRA adapter on top of the foundation model for inference.

# Load the trained PEFT model from the specified path using the PeftModel class
loaded_model = PeftModel.from_pretrained(
    foundation_model,  # The base model that the LoRA adapter is applied to
    peft_model_path,   # The path where the trained PEFT model is saved
    is_trainable=False,  # Indicates that the loaded model should not be trainable
)

We’ll take a test set, and set a metric to evaluate the fine-tuned model.

test_dataset = data['train'].train_test_split(test_size=0.2)['test']

# Metric
metric = f1_score  # Replace with the actual metric you want to use

# Assuming 'loaded_model' and 'tokenizer' are already defined in your code

def get_labels_from_dataset(sample):
    if 'labels' in sample:
        # Replace -100 in the labels as we can't decode them.
        labels = np.array(sample['labels'])
        labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
        return tokenizer.decode(labels, skip_special_tokens=True).split()
    else:
        # Handle the case where 'labels' key is not present in the sample
        # You may need to adjust this based on your dataset structure
        return []

def evaluate_peft_model(sample, max_target_length=50):
    # Generate summary
    input_ids_tensor = torch.tensor([sample["input_ids"]])  # Add a batch dimension
    outputs = loaded_model.generate(input_ids=input_ids_tensor, do_sample=True, top_p=0.9, max_length=max_target_length)
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split()
    # Get labels from the dataset
    labels = get_labels_from_dataset(sample)

    # Some simple post-processing
    return prediction, labels

# Run predictions
# This can take some time
test_sample = test_dataset.select(range(10))
predictions, references = [], []
for sample in tqdm(test_sample):
    p, l = evaluate_peft_model(sample)
    if p and l:  # Skip empty predictions and references
        predictions.append(p)
        references.append([l])  # Wrap the reference in a list

# Check if there are non-empty predictions and references with the same length
if all(predictions) and all(references) and len(predictions) == len(references):
    # Compute metric
    results = metric(references, predictions)
    # Print results
    print(f"seqeval F1 score: {results*100:.2f}%")
else:
    print("Inconsistent number of samples in predictions and references.")

From the evaluation we get the following results:

seqeval F1 score: 67.70%

The evaluation F1 score is 67.7%, which is not particularly good.
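To make the idea of an F1 score concrete, here is a minimal sketch of a token-overlap F1 between a predicted and a reference text. Note that this is a simpler, swapped-in variant for illustration: it is not the entity-level F1 that seqeval computes, and the function name `token_f1` is hypothetical.

```python
from collections import Counter

def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall between two texts."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match; one empty counts as a miss
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)  # share of predicted tokens that are correct
    recall = overlap / len(ref_tokens)      # share of reference tokens that were recovered
    return 2 * precision * recall / (precision + recall)

score = token_f1("the cat sat on the mat", "a cat sat on a mat")
print(f"{score:.2f}")  # 0.67
```

Here four of the six tokens overlap on each side, so precision and recall are both 4/6 and the F1 score is about 0.67.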

RAG Example: Code Implementation

To evaluate the RAG approach, let’s start by setting the Cohere API key, which we’ll use for the embeddings.

os.environ["COHERE_API_KEY"] = "your-api-key"

Let’s load the dataset again.

dataset_name = "Open-Orca/OpenOrca"
page_content_column = "system_prompt"

loader = HuggingFaceDatasetLoader(dataset_name, page_content_column)
data = loader.load()

We’ll take a sample of the data.

data_sample = data[:15]

We’ll split the data into chunks using the text splitter.

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)

docs = text_splitter.split_documents(data_sample)
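To illustrate what chunk_size and chunk_overlap mean, here is a simplified sketch of fixed-size chunking with overlap. It is an assumption-laden simplification: RecursiveCharacterTextSplitter additionally tries to split on separators such as paragraph and line breaks, whereas the hypothetical `chunk_text` helper below just slides a fixed window over the characters.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=150):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks

sample = "x" * 2500
chunks = chunk_text(sample)
print([len(c) for c in chunks])  # [1000, 1000, 800]
```

The overlap means the last 150 characters of one chunk are repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk more often than it would without overlap.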

Then, we’ll create the embeddings using the model “embed-english-light-v3.0”. Cohere has many models for different purposes, but here we are using this one.

embeddings = CohereEmbeddings(model="embed-english-light-v3.0")

Now, we’ll set the Pinecone API key and the environment.

os.environ["PINECONE_API_KEY"] = getpass.getpass("Pinecone API Key:")
os.environ["PINECONE_ENV"] = getpass.getpass("Pinecone Environment:")

After that, we’ll initialize the Pinecone setup.

# initialize pinecone
pinecone.init(
    api_key=os.getenv("PINECONE_API_KEY"),  # find at app.pinecone.io
    environment=os.getenv("PINECONE_ENV"),  # next to api key in console
)

index_name = "my-index"  # Pinecone index names allow lowercase letters, digits, and hyphens only

We’ll create an index to store the embeddings.

# First, check if our index already exists. If it doesn't, we create it
if index_name not in pinecone.list_indexes():
    # we create a new index
    pinecone.create_index(name=index_name, metric="cosine", dimension=384)

Then, we’ll populate the Pinecone index with the document chunks and their embeddings.

docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)

We’ll load the model to pass the query.

# Specify the model name you want to use
model_name = "google/flan-t5-xxl"

# Load the tokenizer associated with the specified model
tokenizer = AutoTokenizer.from_pretrained(model_name, padding=True, truncation=True, max_length=512)

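# Define a text-to-text generation pipeline using the model and tokenizer
# (the task name is an assumption: it is a task that both Flan-T5 and HuggingFacePipeline support)
question_answerer = pipeline(
    "text2text-generation",
    model=model_name,
    tokenizer=tokenizer,
)

# Create an instance of the HuggingFacePipeline, which wraps the pipeline above
# with additional model-specific arguments (temperature and max_length)
llm = HuggingFacePipeline(
    pipeline=question_answerer,
    model_kwargs={"temperature": 0.7, "max_length": 512},
)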

Then, we’ll create a retriever object, and pass the model in the RetrievalQA chain.

retriever = docsearch.as_retriever(search_kwargs={"k": 4})

# Create a question-answering instance (qa) using the RetrievalQA class.
# It's configured with a language model (llm), a chain type "refine," the retriever we created, and an option to not return source documents.
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="refine", retriever=retriever, return_source_documents=False)
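The "refine" chain type builds an initial answer from the first retrieved document and then asks the model to revise that answer with each further document. The sketch below illustrates that control flow only; the prompt wording and the `toy_llm` stand-in are hypothetical and do not reproduce LangChain's actual prompts.

```python
def refine_answer(question, documents, llm):
    """Answer from the first document, then revise the answer with each further one."""
    if not documents:
        return llm(f"Question: {question}\nAnswer:")
    answer = llm(f"Context: {documents[0]}\nQuestion: {question}\nAnswer:")
    for doc in documents[1:]:
        answer = llm(
            f"Question: {question}\n"
            f"Existing answer: {answer}\n"
            f"New context: {doc}\n"
            "Refine the existing answer if the new context helps; otherwise repeat it."
        )
    return answer

calls = []

def toy_llm(prompt):
    """Hypothetical stand-in for a real model: records the prompt, returns a numbered draft."""
    calls.append(prompt)
    return f"draft {len(calls)}"

retrieved = ["chunk A", "chunk B", "chunk C", "chunk D"]  # k=4 retrieved chunks
final = refine_answer("What is RAG?", retrieved, toy_llm)
print(final, len(calls))  # draft 4 4
```

With k=4, the model is called once per retrieved chunk, so this chain type trades extra model calls for the ability to fold every chunk into the final answer.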

Now, we’ll evaluate our RAG approach using RAGAS.

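def map_to_squad_features(example):
    return {
        'question': example['question'],
        'contexts': [example['system_prompt']],  # RAGAS expects a list of context strings
        'answer': example['response'],
        # 'ground_truths' might need to be filled with appropriate data
        'ground_truths': None
    }

# RAGAS works on a Hugging Face Dataset, so take the first 15 samples in that form
eval_sample = load_dataset("Open-Orca/OpenOrca", split="train").select(range(15))

# Apply the function to the dataset
squad_like_data = eval_sample.map(map_to_squad_features)

# When no metrics are passed, RAGAS uses its default set of metrics
result = evaluate(
    dataset=squad_like_data,
)

df = result.to_pandas()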

[Figure: the RAGAS result, with the ragas_score for each question]

You can see from the results that the ragas_score for each question is higher than the F1 score of the fine-tuned model.

Comparative Knowledge Injection Observations

When we compared the evaluation metrics of the fine-tuned LLM and the RAG approach on the text classification use case, the ragas_score was higher than the score of the fine-tuned model. Let’s see how the fine-tuned model and RAG perform with other use cases.

However, RAG can be even more powerful if we combine it with a fine-tuned model.


Conclusion

In the quest to enhance LLMs' knowledge bases, understanding the differences between retrieval augmented generation and fine-tuning is pivotal. While fine-tuning shows promise, especially when combined with RAG, the latter emerges as a more reliable choice for knowledge injection, as we saw in the examples.

Fine-tuning and Retrieval Augmented Generation represent powerful tools for refining language models, each with its unique strengths and considerations. The choice between these approaches often involves a balance between task specificity, model stability, and the need for external knowledge integration.

In many cases, a synergistic approach that combines aspects of both fine-tuning and RAG may offer a comprehensive solution as we saw in the comparative observations. Leveraging the strengths of fine-tuned models along with the context enrichment provided by external knowledge through RAG can contribute to the development of more robust and adaptable language models. 

Latest Blogs
This is a decorative image for: A Complete Guide To Customer Acquisition For Startups
October 18, 2022

A Complete Guide To Customer Acquisition For Startups

Any business is enlivened by its customers. Therefore, a strategy to constantly bring in new clients is an ongoing requirement. In this regard, having a proper customer acquisition strategy can be of great importance.

So, if you are just starting your business, or planning to expand it, read on to learn more about this concept.

The problem with customer acquisition

As an organization, when working in a diverse and competitive market like India, you need to have a well-defined customer acquisition strategy to attain success. However, this is where most startups struggle. Now, you may have a great product or service, but if you are not in the right place targeting the right demographic, you are not likely to get the results you want.

To resolve this, typically, companies invest, but if that is not channelized properly, it will be futile.

So, the best way out of this dilemma is to have a clear customer acquisition strategy in place.

How can you create the ideal customer acquisition strategy for your business?

  • Define what your goals are

You need to define your goals so that you can meet the revenue expectations you have for the current fiscal year. You need to find a value for the metrics –

  • MRR – Monthly recurring revenue, which tells you all the income that can be generated from all your income channels.
  • CLV – Customer lifetime value tells you how much a customer is willing to spend on your business during your mutual relationship duration.  
  • CAC – Customer acquisition costs, which tells how much your organization needs to spend to acquire customers constantly.
  • Churn rate – It tells you the rate at which customers stop doing business.

All these metrics tell you how well you will be able to grow your business and revenue.

  • Identify your ideal customers

You need to understand who your current customers are and who your target customers are. Once you are aware of your customer base, you can focus your energies in that direction and get the maximum sale of your products or services. You can also understand what your customers require through various analytics and markers and address them to leverage your products/services towards them.

  • Choose your channels for customer acquisition

How will you acquire customers who will eventually tell at what scale and at what rate you need to expand your business? You could market and sell your products on social media channels like Instagram, Facebook and YouTube, or invest in paid marketing like Google Ads. You need to develop a unique strategy for each of these channels. 

  • Communicate with your customers

If you know exactly what your customers have in mind, then you will be able to develop your customer strategy with a clear perspective in mind. You can do it through surveys or customer opinion forms, email contact forms, blog posts and social media posts. After that, you just need to measure the analytics, clearly understand the insights, and improve your strategy accordingly.

Combining these strategies with your long-term business plan will bring results. However, there will be challenges on the way, where you need to adapt as per the requirements to make the most of it. At the same time, introducing new technologies like AI and ML can also solve such issues easily. To learn more about the use of AI and ML and how they are transforming businesses, keep referring to the blog section of E2E Networks.

Reference Links




This is a decorative image for: Constructing 3D objects through Deep Learning
October 18, 2022

Image-based 3D Object Reconstruction State-of-the-Art and trends in the Deep Learning Era

3D reconstruction is one of the most complex issues of deep learning systems. There have been multiple types of research in this field, and almost everything has been tried on it — computer vision, computer graphics and machine learning, but to no avail. However, that has resulted in CNN or convolutional neural networks foraying into this field, which has yielded some success.

The Main Objective of the 3D Object Reconstruction

Developing this deep learning technology aims to infer the shape of 3D objects from 2D images. So, to conduct the experiment, you need the following:

  • Highly calibrated cameras that take a photograph of the image from various angles.
  • Large training datasets can predict the geometry of the object whose 3D image reconstruction needs to be done. These datasets can be collected from a database of images, or they can be collected and sampled from a video.

By using the apparatus and datasets, you will be able to proceed with the 3D reconstruction from 2D datasets.

State-of-the-art Technology Used by the Datasets for the Reconstruction of 3D Objects

The technology used for this purpose needs to stick to the following parameters:

  • Input

Training with the help of one or multiple RGB images, where the segmentation of the 3D ground truth needs to be done. It could be one image, multiple images or even a video stream.

The testing will also be done on the same parameters, which will also help to create a uniform, cluttered background, or both.

  • Output

The volumetric output will be done in both high and low resolution, and the surface output will be generated through parameterisation, template deformation and point cloud. Moreover, the direct and intermediate outputs will be calculated this way.

  • Network architecture used

The architecture used in training is 3D-VAE-GAN, which has an encoder and a decoder, with TL-Net and conditional GAN. At the same time, the testing architecture is 3D-VAE, which has an encoder and a decoder.

  • Training used

The degree of supervision used in 2D vs 3D supervision, weak supervision along with loss functions have to be included in this system. The training procedure is adversarial training with joint 2D and 3D embeddings. Also, the network architecture is extremely important for the speed and processing quality of the output images.

  • Practical applications and use cases

Volumetric representations and surface representations can do the reconstruction. Powerful computer systems need to be used for reconstruction.

Given below are some of the places where 3D Object Reconstruction Deep Learning Systems are used:

  • 3D reconstruction technology can be used in the Police Department for drawing the faces of criminals whose images have been procured from a crime site where their faces are not completely revealed.
  • It can be used for re-modelling ruins at ancient architectural sites. The rubble or the debris stubs of structures can be used to recreate the entire building structure and get an idea of how it looked in the past.
  • They can be used in plastic surgery where the organs, face, limbs or any other portion of the body has been damaged and needs to be rebuilt.
  • It can be used in airport security, where concealed shapes can be used for guessing whether a person is armed or is carrying explosives or not.
  • It can also help in completing DNA sequences.

So, if you are planning to implement this technology, then you can rent the required infrastructure from E2E Networks and avoid investing in it. And if you plan to learn more about such topics, then keep a tab on the blog section of the website

Reference Links



This is a decorative image for: Comprehensive Guide to Deep Q-Learning for Data Science Enthusiasts
October 18, 2022

A Comprehensive Guide To Deep Q-Learning For Data Science Enthusiasts

For all data science enthusiasts who would love to dig deep, we have composed a write-up about Q-Learning specifically for you all. Deep Q-Learning and Reinforcement learning (RL) are extremely popular these days. These two data science methodologies use Python libraries like TensorFlow 2 and openAI’s Gym environment.

So, read on to know more.

What is Deep Q-Learning?

Deep Q-Learning utilizes the principles of Q-learning, but instead of using the Q-table, it uses the neural network. The algorithm of deep Q-Learning uses the states as input and the optimal Q-value of every action possible as the output. The agent gathers and stores all the previous experiences in the memory of the trained tuple in the following order:

State> Next state> Action> Reward

The neural network training stability increases using a random batch of previous data by using the experience replay. Experience replay also means the previous experiences stocking, and the target network uses it for training and calculation of the Q-network and the predicted Q-Value. This neural network uses openAI Gym, which is provided by taxi-v3 environments.

Now, any understanding of Deep Q-Learning   is incomplete without talking about Reinforcement Learning.

What is Reinforcement Learning?

Reinforcement is a subsection of ML. This part of ML is related to the action in which an environmental agent participates in a reward-based system and uses Reinforcement Learning to maximize the rewards. Reinforcement Learning is a different technique from unsupervised learning or supervised learning because it does not require a supervised input/output pair. The number of corrections is also less, so it is a highly efficient technique.

Now, the understanding of reinforcement learning is incomplete without knowing about Markov Decision Process (MDP). MDP is involved with each state that has been presented in the results of the environment, derived from the state previously there. The information which composes both states is gathered and transferred to the decision process. The task of the chosen agent is to maximize the awards. The MDP optimizes the actions and helps construct the optimal policy.

For solving the MDP, you can follow the Q-Learning algorithm, which is an extremely important part of data science and machine learning.

What is the Q-Learning Algorithm?

Q-Learning learns from scratch, directly from the data the agent collects. It involves defining the parameters, choosing actions from the current state, observing the resulting reward and next state, and building up a Q-table that is used to maximize the output rewards.
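For reference, the standard Q-learning update rule can be written as below. The symbols are the conventional ones and are not defined in this article: \alpha is the learning rate, \gamma the discount factor, s_t and a_t the current state and action, r_{t+1} the reward received, and s_{t+1} the next state.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The term in square brackets is the difference between the new estimate of the action's value and the value currently stored in the Q-table.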

The four steps involved in Q-Learning are:

  1. Initializing parameters – The RL (reinforcement learning) model defines the set of actions available to the agent, together with the states, the environment and the time horizon.
  2. Identifying current state – The model keeps its prior records so that it can pick the optimal action and maximize the results. To act in the present, the current state must be identified and an action chosen for it.
  3. Choosing the optimal action set and gaining the relevant experience – A Q-table holds a value for each combination of state and action, and the experience gained from each action is used to update the Q-table in the following step.
  4. Updating Q-table rewards and next state determination – After the relevant experience is gained, the agent receives the reward and the next state from the environment. The size of the reward drives the update and helps determine the subsequent step.
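The four steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the corridor environment, the reward values and the hyperparameters (`alpha`, `gamma`, `epsilon`) are invented for illustration and are not Taxi-v3 or any library API.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = [0, 1]      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption for this sketch, not Taxi-v3)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: initialise the parameters and an all-zero Q-table
    q_table = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False            # Step 2: identify the current state
        while not done:
            # Step 3: choose an action (epsilon-greedy) and gain experience
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q_table[state][a])
            next_state, reward, done = step(state, action)
            # Step 4: update the Q-table, then move on to the next state
            best_next = 0.0 if done else max(q_table[next_state])
            td_target = reward + gamma * best_next
            q_table[state][action] += alpha * (td_target - q_table[state][action])
            state = next_state
    return q_table


q_table = train()
# Greedy action for every non-terminal cell
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy read from the Q-table should prefer moving right in every cell, since that is the shortest way to the rewarding end of the corridor.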

If the Q-table is huge, building the model becomes a time-consuming process. This situation calls for Deep Q-learning.
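As a rough sketch of what replaces the table, the code below runs the forward pass of a tiny neural network that takes a state as input and returns one Q-value per action. The layer sizes, the feature vector and the helper names are all hypothetical, the weights are random and untrained, and a real implementation would normally use a library such as TensorFlow instead of hand-written layers.

```python
import random

STATE_DIM, N_HIDDEN, N_ACTIONS = 4, 16, 6   # illustrative sizes, not from the article


def make_layer(n_in, n_out, rng):
    """Small random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer, relu=False):
    """Fully connected layer, optionally followed by a ReLU."""
    weights, biases = layer
    outputs = [sum(w * x for w, x in zip(row, inputs)) + b
               for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in outputs] if relu else outputs


def q_values(state, hidden_layer, output_layer):
    """Forward pass: state features in, one Q-value per action out."""
    return dense(dense(state, hidden_layer, relu=True), output_layer)


rng = random.Random(0)
hidden_layer = make_layer(STATE_DIM, N_HIDDEN, rng)
output_layer = make_layer(N_HIDDEN, N_ACTIONS, rng)

state = [0.2, 0.4, 1.0, 0.0]                 # hypothetical feature vector
q = q_values(state, hidden_layer, output_layer)
greedy_action = max(range(N_ACTIONS), key=lambda a: q[a])
```

The number of weights depends on the layer sizes and not on the number of distinct states, which is why this approach scales to problems where a table would be impractical.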

Hopefully, this write-up has provided an outline of Deep Q-Learning and its related concepts. If you wish to learn more about such topics, then keep a tab on the blog section of the E2E Networks website.

October 13, 2022

GAUDI: A Neural Architect for Immersive 3D Scene Generation

The evolution of artificial intelligence in the past decade has been staggering, and the focus is now shifting towards AI and ML systems that understand and generate 3D spaces. As a result, there has been extensive research on 3D generative models. In this regard, Apple’s AI and ML scientists have developed GAUDI, a method specifically for this job.

An introduction to GAUDI

The creators of GAUDI named it after the famous architect Antoni Gaudí. The model relies on a camera pose decoder, which lets it estimate the plausible camera poses of a scene. This in turn makes it possible to render the 3D scene from almost any angle.

What does GAUDI do?

GAUDI can perform multiple functions –

  • Extending generative models to 3D scenes has a tremendous effect on ML and computer vision. Such models are highly useful in practice: they can be applied as world models in model-based reinforcement learning and planning, in SLAM, or in 3D content creation.
  • Generative modelling of 3D objects and scenes has previously been done with GRAF, pi-GAN and GSN, which incorporate a GAN (Generative Adversarial Network). The generator encodes radiance fields exclusively. Given a point in the 3D space of the scene and a camera pose, the radiance field returns a density scalar and an RGB value for that specific point, and the image seen from that pose is rendered from these values. Such models can be learned from 2D camera views by imposing 3D structure on those 2D shots. They can isolate various objects and scenes and combine them to render a new scene altogether.
  • GAUDI also avoids GAN pathologies such as mode collapse and improves on GAN-based approaches.
  • GAUDI also uses this to train data on a canonical coordinate system. You can compare it by looking at the trajectory of the scenes.
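To illustrate what a radiance field with a density scalar and an RGB value per 3D point means in practice, the sketch below composites the colour seen along a single camera ray using the standard volume rendering quadrature. The toy field (a red sphere), the function names and the sample counts are assumptions for illustration and are not GAUDI's actual model.

```python
import math


def toy_radiance_field(point):
    """Hypothetical field: a dense red sphere of radius 1 at the origin."""
    x, y, z = point
    inside = x * x + y * y + z * z <= 1.0
    density = 5.0 if inside else 0.0
    rgb = (1.0, 0.0, 0.0)
    return density, rgb


def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    """Composite the colour seen along one camera ray."""
    delta = (far - near) / n_samples          # spacing between samples
    transmittance = 1.0                       # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta          # sample at the midpoint of each bin
        point = tuple(o + t * d for o, d in zip(origin, direction))
        density, rgb = toy_radiance_field(point)
        alpha = 1.0 - math.exp(-density * delta)   # opacity of this segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= 1.0 - alpha
    return pixel


# One ray through the sphere and one that misses it
hit = render_ray(origin=(0.0, 0.0, -2.0), direction=(0.0, 0.0, 1.0))
miss = render_ray(origin=(3.0, 0.0, -2.0), direction=(0.0, 0.0, 1.0))
```

The ray that passes through the sphere comes out almost fully red, while the ray that misses it stays black. Changing the camera origin and direction is what produces a view from a different pose.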

How is GAUDI applied to the content?

The steps of application for GAUDI have been given below:

  • Each trajectory, which consists of a sequence of posed images from a 3D scene, is encoded into a latent representation. This representation models the radiance field (what we refer to as the 3D scene) and the camera path in a disentangled way. The latents are treated as free parameters, and the problem is formulated as the optimization of a reconstruction objective.
  • This simple training process is then scaled to thousands of trajectories, creating a large number of views. The model can then sample radiance fields entirely from the prior distribution that it has learned.
  • New scenes can also be synthesized by interpolating within the latent space.
  • At scale, the training data comprises many 3D scenes containing thousands of images. During training, there is no issue related to canonical orientation or mode collapse.
  • A novel denoising optimization objective is used to find latent representations that jointly model the camera poses and the radiance field. With it, GAUDI achieves state-of-the-art performance in generating 3D scenes across multiple datasets and supports conditional generation from images and text.
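The interpolation step mentioned above can be pictured with a small sketch: two latent vectors are blended linearly, and each intermediate vector would then be decoded into a scene. The three-dimensional latents and the function names are hypothetical, and the decoder that turns a latent into a radiance field is not shown.

```python
def lerp(z_start, z_end, t):
    """Linear blend of two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)]


def interpolate_latents(z_start, z_end, n_steps):
    """Evenly spaced latents from z_start to z_end, endpoints included."""
    return [lerp(z_start, z_end, i / (n_steps - 1)) for i in range(n_steps)]


# Hypothetical latents for two scenes
z_a = [0.0, 1.0, -2.0]
z_b = [1.0, 3.0, 2.0]
path = interpolate_latents(z_a, z_b, n_steps=5)
```

The first and last entries of `path` are the two original latents, and the entries in between correspond to scenes that blend the two.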

To conclude, GAUDI has further capabilities and can also be used for sampling from various image and video datasets. Furthermore, it is likely to make a foray into AR (augmented reality) and VR (virtual reality). With GAUDI in hand, the sky is the limit in the field of media creation. So, if you enjoy reading about the latest developments in the field of AI and ML, then keep a tab on the blog section of the E2E Networks website.
