Every state or city has specific laws or regulations that residents of that city may not always be aware of. As we create a smarter future through smart cities, making city laws and regulations accessible to everyone is important. The ability to easily understand local laws and regulations empowers residents and creates transparency.
In this article, we explore how AI can be leveraged to create a chatbot that allows residents to ask questions about their city. This can help them learn about laws and regulations effortlessly, and discover aspects of their city they would otherwise have been unaware of.
While this chatbot has been built around city laws, you can very well use the method described below to do the same for your college, university, or organization.
We will also take this opportunity to explore the powerful new framework DSPy and its capabilities in simplifying prompt engineering for language models like Llama 3.
Approach
The goal, as described above, is to build a chatbot that residents of a city can use to understand local laws and regulations. For this, we will use the cutting-edge open-source LLM Llama 3.
Along with Llama 3, we will use DSPy, a trending framework that many in the AI community are calling the next big thing after LangChain. DSPy is changing the paradigm of how we interact with language models by eliminating the need for manual prompting: using ‘signatures’ and ‘modules’, it auto-generates prompts and ships with built-in modules such as dspy.ChainOfThought, dspy.ProgramOfThought, and dspy.ReAct. We will use dspy.ChainOfThought in this article.
We will, of course, use E2E Cloud’s top AI-first infrastructure to build this.
So, our stack will be the following:
- LLM: Llama3-8B
- Framework: DSPy
- UI: Gradio
- Platform: E2E Cloud
Also, we will use the following dataset, but you can replace it with one relevant to the city you are building this for.
Dataset: https://www.mha.gov.in/sites/default/files/DMC-Act-1957_0.pdf
The PDF above contains laws outlined in ‘The Delhi Municipal Corporation Act, 1957’, and is hosted by the Ministry of Home Affairs.
Guide to Building the Chatbot Using DSPy, Llama 3 and Chroma DB on E2E Cloud
First, register on E2E Cloud and launch a GPU node. You will need to add your SSH key in the process in order to access the node. A V100 node should be good enough, but if you want faster inference, use an A100 or H100.
Let’s install Ollama:
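On a Linux GPU node, Ollama's official install script can be used (this is the standard command at the time of writing; check ollama.com if it changes):

```bash
# Install Ollama using the official install script
curl -fsSL https://ollama.com/install.sh | sh
```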
You can deploy Llama 3 using Ollama easily:
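For example, pull the Llama 3 8B weights locally (the llama3 tag maps to the 8B instruct variant at the time of writing; adjust the tag if needed):

```bash
# Download the Llama 3 8B model weights
ollama pull llama3
```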
Next, set up a Python virtual environment. You can use any method you prefer, or use Conda.
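For example, with the built-in venv module (Conda works just as well):

```bash
# Create and activate a virtual environment
python3 -m venv chatbot-env
source chatbot-env/bin/activate
```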
Once the virtual environment has been initialized, you can install the following libraries:
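A typical set of dependencies for this stack is shown below; package names such as dspy-ai and langchain-community reflect the releases current at the time of writing:

```bash
pip install dspy-ai chromadb langchain langchain-community pypdf sentence-transformers gradio
```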
Next, let’s instantiate the Chroma DB vector store.
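A minimal setup, assuming the pages of the Act will be embedded with the all-MiniLM-L6-v2 sentence-transformers model and stored in a collection named city_laws:

```python
import chromadb
from chromadb.utils import embedding_functions

# Persist the vector store to the db/ folder
client = chromadb.PersistentClient(path="db")

# Sentence-transformers model used to embed the document pages
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

# Collection that will hold the pages of the Act and their embeddings
collection = client.get_or_create_collection(
    name="city_laws", embedding_function=embedding_fn
)
```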
The above code stores the vector database in the db/ folder and uses the all-MiniLM-L6-v2 model to create embeddings. Chroma DB's embedding_functions utility generates the embeddings automatically when documents are inserted into the vector store.
Next, let’s load the document, split it into pages, and then prepare it for insertion in the vector store.
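A sketch using LangChain's PyPDFLoader, assuming the Act has been downloaded locally as DMC-Act-1957_0.pdf:

```python
from langchain_community.document_loaders import PyPDFLoader

# Load the PDF and split it into one document per page
loader = PyPDFLoader("DMC-Act-1957_0.pdf")
pages = loader.load_and_split()

# Parallel lists of page texts and string ids, as expected by Chroma DB
docs = [page.page_content for page in pages]
ids = [f"page-{i}" for i in range(len(pages))]
```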
This uses the LangChain utility function to load and split the document.
We have now created two lists, docs and ids. We can call the Chroma DB function to add the documents to the vector store and generate embeddings on the fly.
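A single call on the collection created earlier does this:

```python
# Embeddings are computed by the collection's embedding function during insertion
collection.add(documents=docs, ids=ids)
```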
We now have a collection named city_laws which contains our document pages and their respective embeddings.
Run local Ollama Llama 3:
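If the Ollama server was not started as a background service by the installer, start it, or simply test the model interactively from another terminal:

```bash
# Start the Ollama server (serves an API on http://localhost:11434)
ollama serve

# Or verify the model responds
ollama run llama3
```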
Let’s set up the DSPy module.
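A minimal configuration sketch, assuming the dspy.OllamaLocal client available in the DSPy releases current when this article was written, and the default Ollama port:

```python
import dspy

# Point DSPy at the locally running Ollama server
llama3 = dspy.OllamaLocal(model="llama3")
dspy.settings.configure(lm=llama3)
```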
Now that DSPy has been configured, you can create a RAG module very easily:
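One possible sketch is shown below. It retrieves context directly from the city_laws collection built earlier (rather than via a DSPy retriever class) and answers with dspy.ChainOfThought:

```python
class RAG(dspy.Module):
    """Retrieve relevant pages from Chroma DB, then answer with Chain of Thought."""

    def __init__(self, num_passages=3):
        super().__init__()
        self.num_passages = num_passages
        # Inline signature: context and question in, answer out
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def retrieve(self, question):
        # Query the Chroma DB collection created earlier
        results = collection.query(query_texts=[question], n_results=self.num_passages)
        return results["documents"][0]

    def forward(self, question):
        context = self.retrieve(question)
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)
```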
In DSPy, you provide a signature, which is then used to build modules. In this case, we are using the inline signature “context, question -> answer”. To read more about DSPy signatures, see the DSPy documentation.
Now, we can already test our RAG() module.
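For example (the question below is just an illustration):

```python
rag = RAG()

response = rag(question="What does the Act say about the levy of property tax?")
print(response.answer)
```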
Running this prints the model's answer, generated from the retrieved context.
Works well! Let’s add a Gradio UI on top:
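A simple wrapper around the RAG module defined above; launching with server_name="0.0.0.0" makes the UI reachable on the node's public IP:

```python
import gradio as gr

rag = RAG()

def answer_question(question):
    # Run the RAG pipeline and return only the final answer text
    return rag(question=question).answer

demo = gr.Interface(
    fn=answer_question,
    inputs=gr.Textbox(label="Ask a question about your city's laws"),
    outputs=gr.Textbox(label="Answer"),
    title="City Laws Chatbot",
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```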
Results
We can now see the results on the Gradio interface:
Optimizing the Program
With DSPy, you can improve (‘optimize’) your program so that it gives more accurate results. To do so, you first need a training dataset, which we will store in a variable named trainset. We will assume that you have already created one; the snippet below only shows its expected shape.
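The entries below are placeholders; replace them with real question/answer pairs drawn from the Act:

```python
# Each training example pairs a question with a known-good answer
trainset = [
    dspy.Example(
        question="<question about the Act>",
        answer="<ground-truth answer>",
    ).with_inputs("question"),
    # ... add more examples ...
]
```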
You should also create a metric function, which will be used to evaluate the program output. It will look something like this:
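A simple containment check is enough to start with; you can swap in a stricter metric such as exact match later:

```python
def validate_answer(example, pred, trace=None):
    # Crude metric: the predicted answer should contain the expected answer text
    return example.answer.lower() in pred.answer.lower()
```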
With the metric in place, you can compile your program using one of DSPy's optimizers. Querying the resulting compiled_rag program returns results that are far more in line with what you expect your application to give.
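A sketch using BootstrapFewShot, one of DSPy's built-in optimizers:

```python
from dspy.teleprompt import BootstrapFewShot

# The optimizer uses the metric to bootstrap good demonstrations from trainset
teleprompter = BootstrapFewShot(metric=validate_answer)
compiled_rag = teleprompter.compile(RAG(), trainset=trainset)

# Query the optimized program just like the original one (illustrative question)
print(compiled_rag(question="What does the Act say about the levy of property tax?").answer)
```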
Conclusion
As demonstrated in this guide, we created a chatbot using Llama 3, DSPy, Gradio and E2E Cloud that lets residents of a city ask questions about and understand its bylaws and regulations. You can use the same approach to build a similar chatbot for your city, college, university, or organization.
Use E2E Cloud’s high-end cloud GPUs to get the best performance out of your chatbot. To talk to our sales team, connect with us at sales@e2enetworks.com.