Building a Chatbot in Python
Have you ever thought of building your own chatbot? If you have, this project is a good first step into the field of chatbots. A chatbot is a software application that converses with people in natural language. Chatbots use natural language processing to recognize what a user is saying and respond to their inquiries and requests.
About the Project
We are going to build a retrieval-based chatbot with the help of the Python libraries NLTK and Keras.
Retrieval-based chatbots work on predefined inputs and responses. This chatbot will be a desktop app built with the Tkinter library. We will use a JSON file to store the data (different intents, with their patterns and responses). We will use a recurrent neural network (LSTM) to classify the user’s input message, that is, to identify the class/category it belongs to, and then reply with a random response from that category’s list of responses.
Complete code for this project can be found on this GitHub repository.
These are the files that are required for our complete project:
- intents.json – This JSON file stores the data for our chatbot.
- train_chatbot.py – This is the main Python file where the model is trained.
- words.pkl – This file stores the preprocessed words.
- classes.pkl – This file stores the list of categories.
- chatbot_model.h5 – The model we trained in “train_chatbot.py” is saved here.
- chatbot_gui.py – This file contains the Python script for the GUI of our chatbot.
- Image (optional) for the icon.
Prerequisites
- Good command of Python (functions and loops).
- Knowledge of Keras and NLTK.
- Basic knowledge of the GUI library Tkinter.
Set up the environment:
It is advisable to create a virtual environment before starting any new project.
We need to install some packages for this project. You can install a package by typing:
pip install package_name
The packages to install are TensorFlow, Keras, and nltk. pickle is part of Python’s standard library, so it does not need to be installed.
Note: I prefer using VS Code in a conda environment, but you can use any IDE/software of your choice. You can check the environment setup for VS Code in my last project.
Now that the environment is ready to use, let’s get started.
For any machine learning or deep learning project, the important step is to gather the data. So, our first step is to create a data file. For this project, our data will be stored in an “intents.json” file.
This file will contain the data in the form of:
- Tags: the category or class to which our patterns and responses belong.
- Patterns: example phrases that a user might type to the bot. (Input)
- Responses: the replies we expect from the bot for those patterns. (Output)
So, we first need to decide the purpose of this bot. For example, in this project, I am creating a chatbot that acts as a “Delhi-tourist-guide”. It can suggest tourist places, cafes/restaurants, and places to stay (based on predefined data). I have provided data for five cafes, five hotels, and five tourist places.
Create a new file named “intents.json”. We need to create different tags for this purpose. Let’s start by creating some basic intents with “greeting” and “goodbye” tags.
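As a rough illustration, the file could be laid out as in the sketch below. The top-level "intents" key and all the sample patterns and responses are assumptions made up for this example, not the contents of my actual data file; only the tag/patterns/responses fields come from the description above. The snippet builds the structure in Python and renders it as JSON text.

```python
import json

# Hypothetical sample data: each intent has a tag, example user patterns,
# and candidate bot responses. The "intents" wrapper key is an assumption.
sample_intents = {
    "intents": [
        {
            "tag": "greeting",
            "patterns": ["Hi", "Hello", "Good morning"],
            "responses": ["Hello!", "Hi there, how can I help?"],
        },
        {
            "tag": "goodbye",
            "patterns": ["Bye", "See you later"],
            "responses": ["Goodbye!", "Have a nice day."],
        },
    ]
}

# Render the structure as JSON text; saving this text to disk would
# give a file shaped like intents.json.
intents_text = json.dumps(sample_intents, indent=2)
print(intents_text)
```

Further tags (for cafes, hotels, tourist places, and so on) would follow the same three-field shape.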
Creating an intents file is simple: sketch out which questions you, as a user, would like to ask the bot and what answers you expect from it. By now you should have an idea of how to put data into the intents file. If you already have a purpose for your chatbot, go ahead and fill in your data accordingly; if not, you can refer to my chatbot data. The complete data can be found in the “intents.json” file in the above-mentioned GitHub repository.
Create a new file named “train_chatbot.py”. Refer to the “train_chatbot.py” file from the above-mentioned GitHub repository for complete code.
- Importing necessary libraries
- Loading data
As our data is in JSON format, we need to parse “intents.json” into a Python object. This can be done with the json module (which we have already imported).
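A minimal sketch of this loading step is shown below. The function name load_intents is a hypothetical helper chosen for this example, and the default path assumes the file sits next to the script.

```python
import json

def load_intents(path="intents.json"):
    """Read the intents file and return its contents as a Python dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling load_intents() then gives a dictionary whose nested lists can be looped over like any other Python data.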
- Data preprocessing
Data preprocessing is an important step in building any machine learning or deep learning model. Before going further, let us understand the techniques we are going to use.
Tokenization is the process of breaking sentences into words (referred to as tokens).
Lemmatization is the process of getting the base word (referred to as the lemma) from the given input. For example, the verbs going, goes, and went all return “go” after lemmatization.
- Now we will iterate through the patterns and tokenize them.
- The words (tokens) need to be stored in a list, so we create one. (I created a list named “words”.)
- Each pattern’s “tag” will later be identified from these words, so we also create a list to store the tags. (I created a list named “classes”.)
- Now we will lemmatize the words stored in our “words” list.
- Remove the duplicates from these two lists.
- Store these two lists in pickled form.
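The steps above can be sketched roughly as follows. To keep the example free of downloads, it uses a regex tokenizer and plain lowercasing as stand-ins for the NLTK tokenizer and lemmatizer; both are passed in as parameters so the real ones can be swapped in. The function names, the "intents" key, and the extra documents list (each tokenized pattern paired with its tag) are assumptions made for this sketch.

```python
import pickle
import re

def simple_tokenize(sentence):
    """Stand-in tokenizer: picks out runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)

def simple_lemmatize(word):
    """Stand-in for a real lemmatizer: it only lowercases the word."""
    return word.lower()

def preprocess(intents, tokenize=simple_tokenize, lemmatize=simple_lemmatize):
    """Return (words, classes, documents) built from the intents dictionary."""
    words, classes, documents = [], [], []
    for intent in intents["intents"]:
        for pattern in intent["patterns"]:
            tokens = tokenize(pattern)
            words.extend(tokens)
            # Keep each tokenized pattern paired with its tag for training.
            documents.append((tokens, intent["tag"]))
        classes.append(intent["tag"])
    # Lemmatize, then drop duplicates; sorting keeps the order reproducible.
    words = sorted({lemmatize(w) for w in words})
    classes = sorted(set(classes))
    return words, classes, documents

def save_pickles(words, classes, words_path="words.pkl", classes_path="classes.pkl"):
    """Store the two lists in pickled form."""
    with open(words_path, "wb") as f:
        pickle.dump(words, f)
    with open(classes_path, "wb") as f:
        pickle.dump(classes, f)
```

With NLTK installed and its tokenizer and WordNet data downloaded, the stand-ins could be replaced by passing tokenize=nltk.word_tokenize and lemmatize=lambda w: WordNetLemmatizer().lemmatize(w.lower()).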
Creating training and testing data
A neural network cannot learn from raw text, so we first convert the text data into numbers.
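One common way to do this conversion, sketched below, is to turn each pattern into a bag-of-words vector over the vocabulary and each tag into a one-hot vector over the classes. The one-hot encoding of the tags, the function name, and the shape of the documents argument (a list of (tokens, tag) pairs) are assumptions of this example. No separate test split is made here.

```python
import numpy as np

def build_training_data(documents, words, classes, lemmatize=str.lower):
    """Convert (tokens, tag) pairs into numeric input and output arrays."""
    train_x, train_y = [], []
    for tokens, tag in documents:
        pattern_words = {lemmatize(t) for t in tokens}
        # Bag of words: 1 where a vocabulary word occurs in the pattern, else 0.
        bag = [1 if w in pattern_words else 0 for w in words]
        # One-hot label: 1 at the position of this pattern's tag, else 0.
        label = [0] * len(classes)
        label[classes.index(tag)] = 1
        train_x.append(bag)
        train_y.append(label)
    return (np.array(train_x, dtype=np.float32),
            np.array(train_y, dtype=np.float32))
```

Each row of the first array has one entry per vocabulary word, and each row of the second has one entry per class.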
- We use the Keras Sequential API to build a deep neural network that has 3 layers.
- Compile this Keras model with an SGD optimizer.
- Fit the model. (I trained my model for 200 epochs.)
- Save the model in h5 format.
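A hedged sketch of these four steps is given below. Only the Sequential API, the three layers, the SGD optimizer, the 200 epochs, and the h5 file come from the steps above; the layer sizes, dropout rate, activations, learning rate, momentum, batch size, and loss are assumptions. Note that the introduction mentions an LSTM, while this sketch uses plain fully connected (Dense) layers, since those accept bag-of-words vectors directly.

```python
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD

def build_model(input_size, num_classes):
    """Build and compile a small 3-layer classifier (sizes are assumptions)."""
    model = Sequential([
        Input(shape=(input_size,)),
        Dense(128, activation="relu"),
        Dropout(0.5),
        Dense(64, activation="relu"),
        Dropout(0.5),
        # One output per intent class; softmax turns scores into probabilities.
        Dense(num_classes, activation="softmax"),
    ])
    sgd = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
    model.compile(loss="categorical_crossentropy", optimizer=sgd,
                  metrics=["accuracy"])
    return model

def train_and_save(model, train_x, train_y, path="chatbot_model.h5", epochs=200):
    """Fit the model on the training arrays and save it in h5 format."""
    model.fit(train_x, train_y, epochs=epochs, batch_size=5, verbose=1)
    model.save(path)
```

Here input_size would be the length of the vocabulary list and num_classes the number of tags.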
Now that we have built our model, the next step is prediction. Create a new file named “chatbot_gui.py”. Refer to the “chatbot_gui.py” file from the above-mentioned GitHub repository for complete code.
Importing necessary libraries
Loading our data
- Preprocessing the input
Input given by the user must be processed in the same way as the data our model was trained on. Therefore we apply the same text preprocessing here, tokenization and lemmatization, and wrap it in a function.
We will also create a function that translates the user’s message (a sentence) into a bag of words (an array of 0 and 1 values). When this function finds a word from the sentence in the chatbot’s vocabulary, it sets 1 at the corresponding position in the array. This array is then sent to the model, which classifies the intent it belongs to.
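The two functions described above might look roughly like this. As a simplification, a regex split and lowercasing stand in for the NLTK tokenizer and lemmatizer, and the function names are assumptions of this sketch.

```python
import re

import numpy as np

def clean_up_sentence(sentence):
    """Tokenize and normalize a user message (regex and lowercasing as stand-ins)."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9']+", sentence)]

def bag_of_words(sentence, words):
    """Return a 0/1 array marking which vocabulary words occur in the sentence."""
    sentence_words = set(clean_up_sentence(sentence))
    return np.array([1 if w in sentence_words else 0 for w in words])
```

Words in the message that are not in the vocabulary are simply ignored, so the array always has the same length as the vocabulary list.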
- Getting the response from the intents
Here we create functions that get a random response from the responses list.
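A possible shape for these functions is sketched below. The function names, the 0.25 probability threshold, the fallback message, and the "intents" key are all assumptions made for this example.

```python
import random

def rank_intents(probabilities, classes, threshold=0.25):
    """Return (tag, probability) pairs above the threshold, most likely first."""
    candidates = (
        (classes[i], float(p))
        for i, p in enumerate(probabilities)
        if p > threshold
    )
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

def get_response(ranked, intents, fallback="Sorry, I did not understand that."):
    """Pick a random response for the most likely tag, or the fallback text."""
    if not ranked:
        return fallback
    tag = ranked[0][0]
    for intent in intents["intents"]:
        if intent["tag"] == tag:
            return random.choice(intent["responses"])
    return fallback
```

In the full app, the probabilities would come from the trained model, for example model.predict(np.array([bag]))[0], where bag is the bag-of-words array for the user’s message.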
- Creating a GUI (Graphical User Interface)
We are going to use the Tkinter library to create a GUI for our chatbot. It provides a window that takes input from the user and shows the output on the screen with the help of the functions we have created.
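A bare-bones version of such a window is sketched below: a read-only text log, an entry box, and a Send button. The layout, widget sizes, and the placeholder make_reply function are assumptions; in the real app, the reply function would run the preprocessing, the model prediction, and the response lookup.

```python
def make_reply(message):
    """Placeholder reply logic; the real app would call the trained model here."""
    return "You said: " + message

def run_gui(reply_fn=make_reply):
    """Open a minimal chat window and block until it is closed."""
    import tkinter as tk  # imported here so this module loads without a display

    root = tk.Tk()
    root.title("Chatbot")

    # Read-only log that shows the conversation so far.
    chat_log = tk.Text(root, height=15, width=50, state=tk.DISABLED)
    chat_log.pack(padx=5, pady=5)

    entry = tk.Entry(root, width=40)
    entry.pack(side=tk.LEFT, padx=5, pady=5)

    def send(event=None):
        message = entry.get().strip()
        if not message:
            return
        entry.delete(0, tk.END)
        # Temporarily enable the log so text can be inserted.
        chat_log.config(state=tk.NORMAL)
        chat_log.insert(tk.END, "You: " + message + "\n")
        chat_log.insert(tk.END, "Bot: " + reply_fn(message) + "\n\n")
        chat_log.config(state=tk.DISABLED)
        chat_log.see(tk.END)

    tk.Button(root, text="Send", command=send).pack(side=tk.LEFT, padx=5, pady=5)
    entry.bind("<Return>", send)  # pressing Enter also sends the message
    root.mainloop()
```

Calling run_gui() opens the window; passing a different reply_fn connects it to the chatbot logic.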
Run the Chatbot
As our chatbot is ready, let us run it.
- Train the model by running the “train_chatbot.py” script (for example, python train_chatbot.py).
- You will see the model training progress in the terminal.
- If no error occurs in the process, you have successfully created the chatbot model.
- Run the “chatbot_gui.py” script (for example, python chatbot_gui.py) to see the chatbot on your screen.
- The GUI window will appear on the screen, and you can now chat with your bot.
Congratulations! You have created your first chatbot. The main purpose of this article was to show how chatbots can be created with Python libraries. Try building chatbots for other purposes, as I have done for the Delhi tourist guide. You can make your chatbot look more attractive by making changes to the GUI. That is all for this article. If you found it helpful, do upvote.
Thanks for reading!