Introduction#
💡 Quick Links:
- Chinook Database for MySQL: Chinook_MySql.sql
- Chinook Database for SQLite: Chinook_Sqlite.sql
In this tutorial, we will learn how to chat with a MySQL (or SQLite) database using Python and LangChain. We will use the LangChain wrapper of sqlalchemy
to interact with the database. We will also use the langchain
package to create a custom chain that will allow us to chat with the database using natural language.
As you can see in the diagram above, we will first create a SQL chain that will generate SQL queries based on the user’s input. We will then create a LangChain chain that will allow us to chat with the database using natural language. We will use the langchain
package to create both chains.
Prerequisites#
Before we start, make sure you have the following installed:
Setting Up the Test Database#
In this section, we will set up the database with which we will be interacting. If you already have the database that you wish to use, you can skip this section.
First of all, let’s download the Chinook database. This is a sample database that represents a digital media store, including tables for artists, albums, media tracks, invoices, and customers. We will use this database to test our chatbot.
For easy access, you can download the latest SQL file from this link.
Now, let’s set up the database. We will use both SQLite and MySQL to demonstrate how to chat with a database using Python and LangChain. I am including both databases because SQLite is easy to set up and use, while MySQL is widely used in production.
SQLite#
In this section, we will set up the SQLite database. SQLite is a simple and lightweight database that is easy to set up and use. It is very useful for testing and development purposes.
Installing SQLite#
If you are running Mac or Linux, you should already have SQLite installed.
If you are running Windows, you can download the precompiled binaries from the SQLite download page. For you Windows users, you can follow this tutorial by Amit to install it on Windows.
Creating a Database#
To create a new SQLite database, go to your root directory where you will be creating your program and run the following command:
sqlite3 chinook.db
Now you have a new SQLite database called chinook.db
and you are in the SQLite shell. You can run the following commands:
.read Chinook.sql --or the name of your SQL file to load the database
SELECT * FROM albums; --to test the database
You should see a list of albums from the albums
table.
We will use this database to test our LangChain chain in a moment. For now, let’s set up the MySQL database.
MySQL#
In this section, we will set up the MySQL database in your local environment so that we can chat with it using Python and LangChain. MySQL is a popular open-source relational database management system. It is widely used in web development and is known for its speed and reliability.
Installing MySQL#
You can download MySQL from the official website. If you want a detailed tutorial on how to install MySQL on your machine, you can check the following articles:
If you If you are using a Mac, you can also install it using Homebrew:
# to install MySQL
brew install mysql
# to start the MySQL service
brew services start mysql
# to check the status of the MySQL service
brew services list
Now set up the root password using the following command:
mysql_secure_installation
Creating a Database#
Now, go to your terminal and log in to MySQL using the following command:
mysql -u root -p
You will be prompted to enter your password. Once you are logged in, you can create a new database using the following command:
CREATE DATABASE chinook;
USE chinook;
SOURCE chinook.sql; --or the name of your SQL file to load the database
And test the database by running the following command:
SELECT * FROM albums LIMIT 10;
You should see a list of albums from the albums
table.
Create the LangChain Chain#
Now that we have set up the database, let’s create a LangChain chain that will allow us to chat with the database using natural language.
Install the Required Packages#
First, let’s install the packages we need:
pip install langchain mysql-connector-python
Load the Database#
IMPORTANT: When using a real database, you should never use any user with WRITE permissions in an application like this one. Always use a user with READ permissions only and limit its scope. Otherwise, you might expose your database to SQL injection attacks.
Before anything else, we need to load the database. Langchain comes with a convenient loader for SQL databases:
from langchain_community.utilities import SQLDatabase
# if you are using SQLite
sqlite_uri = 'sqlite:///./Chinook.db'
# if you are using MySQL
mysql_uri = 'mysql+mysqlconnector://root:admin@localhost:3306/test_db'
db = SQLDatabase.from_uri(sqlite_uri)
Remeber that if you are using MySQL, you need to add the mysql-connector-python
package to your environment.
Create a SQL Chain#
We will use a LLM to generate the SQL query based on the user’s input. There are two pieces of information that we need to provide to the LLM: the database schema and the user’s input.
Let’s create a prompt template that takes these two variables as input:
from langchain_core.prompts import ChatPromptTemplate
template = """Based on the table schema below, write a SQL query that would answer the user's question:
{schema}
Question: {question}
SQL Query:"""
prompt = ChatPromptTemplate.from_template(template)
Getting the question is straightforward. But we still need to generate the schema from the loaded database. We can do this by using the get_table_info
method of the SQLDatabase
class. Let’s create a function that will work as a tool to generate the schema:
def get_schema(db):
schema = db.get_table_info()
return schema
Now let’s pass this function as runnable to our new SQL chain:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
sql_chain = (
RunnablePassthrough.assign(schema=get_schema)
| prompt
| llm.bind(stop=["\nSQLResult:"])
| StrOutputParser()
)
There you go. Let’s test the SQL chain:
user_question = 'how many albums are there in the database?'
sql_chain.invoke({"question": user_question})
# 'SELECT COUNT(*) AS TotalAlbums\nFROM Album;'
Create the full Chain#
Now that we have the SQL chain, we can create the full chain that will allow us to chat with the database using natural language. We will need to create a new prompt template that takes:
- the SQL query generated by the SQL chain
- the response from the database to the SQL query
- the user’s input
template = """Based on the table schema below, question, sql query, and sql response, write a natural language response:
{schema}
Question: {question}
SQL Query: {query}
SQL Response: {response}"""
prompt_response = ChatPromptTemplate.from_template(template)
And just like before, we need to create a function (a tool) that will run the SQL query and get the response from the database:
def run_query(query):
return db.run(query)
Now let’s pass this function as runnable to our new LangChain chain:
full_chain = (
RunnablePassthrough.assign(query=sql_chain).assign(
schema=get_schema,
response=lambda vars: run_query(vars["query"]),
)
| prompt_response
| model
)
There you go. Let’s test the full chain:
user_question = 'how many albums are there in the database?'
full_chain.invoke({"question": user_question})
# 'There are 347 albums in the database.'
Conclusion#
In this tutorial, we learned how to chat with a MySQL (or SQLite) database using Python and LangChain. We used the LangChain wrapper of sqlalchemy
to interact with the database. We also used the langchain
package to create a custom chain that allowed us to chat with the database using natural language. We hope you found this tutorial helpful. If you have any questions or comments, please let us know. Thank you for reading!