How can you use LSTM for Stock Market Prediction?
The Long Short-Term Memory (LSTM) model has a number of advantages that make it one of the most widely preferred models for sequence prediction. In this article, I hope to help you understand how a company’s stock market data can be predicted using a few simple lines of code.
Stock market prediction refers to the analysis of what a company’s future stock market standing will look like based on the data for that company to date.
Stock market prediction is not an easy task, because it is impossible to know whether the market will behave in the future the way it has until now.
It can be affected by natural factors, the rise of a competitor, or any unforeseen event. A model that tries to account for every such possibility will predict very poorly.
On the other hand, performing this task manually is exhausting and time-consuming. Thus, the data science industry relies heavily on prediction models that help analyze company risks, market trends, etc.
LSTM (Long Short-Term Memory) is a recurrent neural network model that captures long-term dependencies and learns which information in the available sequence is worth retaining. It is generally used for sequence-based tasks such as sentiment analysis and time-series tasks such as stock market prediction.
Now, let’s dive straight into predicting the stock prices of Tata Beverages. To do this, you can use either Google’s cloud platform, Google Colaboratory, or a local tool like Jupyter Notebook. So, let’s get started.
Step 1: Import the libraries
Just like in any other Python program, we first import all the necessary libraries: NumPy, Pandas, scikit-learn, Matplotlib, and Keras. These libraries help us read our dataset, visualize output as graphs, build the LSTM model, and perform other similar operations.
Step 2: Read the dataset
Now, we start building our model by first loading the dataset that the model will train on. We use the Tata Global dataset provided by the National Stock Exchange of India, and read it using the read_csv function of the Pandas library. To understand the structure of our dataset, we output its first 5 rows using the head() method.
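A minimal sketch of this step could look like the following. The column names and the rows are assumptions made up for illustration, not real NSE data, and they are read from an in-memory string so the example runs without the actual file.

```python
import io

import pandas as pd

# Placeholder rows, NOT real NSE data; the column names are assumed.
SAMPLE_CSV = """Date,Open,High,Low,Close
2018-01-03,101.0,103.5,100.2,102.8
2018-01-01,100.0,101.5,99.0,100.9
2018-01-02,100.9,102.0,100.1,101.0
2018-01-04,102.8,104.0,102.0,103.6
2018-01-05,103.6,105.1,103.0,104.9
2018-01-08,104.9,106.0,104.2,105.5
"""

# With the real dataset, pass the path of the CSV file instead of the StringIO object.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Date"])

# head() returns the first 5 rows by default.
print(df.head())
```

Parsing the date column while reading (parse_dates) is a convenience chosen here; the conversion can equally be done afterwards with pd.to_datetime.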
The output looks like this:
Step 3: Plot close data
To visualize how the stock has performed to date, we plot the closing price against the date.
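A minimal sketch of such a plot, assuming a DataFrame with "Date" and "Close" columns (the values below are placeholders, not real prices):

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data standing in for the loaded dataset.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"]),
    "Close": [100.9, 101.0, 102.8, 103.6],
})

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(df["Date"], df["Close"], label="Close price")
ax.set_xlabel("Date")
ax.set_ylabel("Close")
ax.set_title("Closing price history")
ax.legend()
# In a notebook the figure is displayed automatically; in a script, call plt.show().
```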
The output obtained from it looks like this:
Step 4: Sort date and close
The next thing we need to do is sort the required data, that is, the dates and their corresponding closing prices. We perform an ascending sort on the date and arrange the corresponding closing data accordingly. These are the only two pieces of information necessary for the actual prediction, so we discard the rest.
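One way this could be done with Pandas is sketched below. The column names "Date" and "Close" are assumed, and the rows are placeholders.

```python
import pandas as pd

# Placeholder rows in arbitrary order, with one extra column to discard.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-03", "2018-01-01", "2018-01-02"]),
    "Open": [101.0, 100.0, 100.9],
    "Close": [102.8, 100.9, 101.0],
})

# Ascending sort on the date; each closing price moves together with its row.
# Only the two columns needed for prediction are kept.
data = (
    df.sort_values("Date", ascending=True)
      .loc[:, ["Date", "Close"]]
      .reset_index(drop=True)
)
print(data)
```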
Step 5: Datatype conversion
In this step, we convert the data type of the information stored in the date variable from ‘timestamp’ to ‘float’, because we cannot perform prediction on timestamp data.
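There are several ways to turn timestamps into floats. The sketch below shows one of them, expressing each date as the number of days since the Unix epoch. This is an illustrative choice and not necessarily the conversion used for the results in this article.

```python
import pandas as pd

# Placeholder dates standing in for the sorted date column.
dates = pd.Series(pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-05"]))

# Subtracting a reference timestamp gives timedeltas; dividing by one day
# turns them into float64 values (days elapsed since 1970-01-01).
epoch = pd.Timestamp("1970-01-01")
date_as_float = (dates - epoch) / pd.Timedelta(days=1)

print(date_as_float.dtype)
print(date_as_float.tolist())
```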
Step 6: Normalization
This is a necessary step for many machine learning algorithms, which train poorly when the input values span different or large scales. Using normalization, the entire range of values is rescaled to lie between 0 and 1. We then divide the dataset in an 80:20 ratio into train and test sets respectively.
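The scaling and the split can be sketched with plain NumPy as follows. The prices are placeholders, and the min-max formula used here is the same one that scikit-learn's MinMaxScaler applies with its default range of 0 to 1.

```python
import numpy as np

# Placeholder closing prices, already sorted by date.
close = np.array([100.0, 102.0, 101.0, 105.0, 110.0,
                  108.0, 112.0, 115.0, 120.0, 118.0])

# Min-max scaling: the smallest value maps to 0 and the largest to 1.
lo, hi = close.min(), close.max()
scaled = (close - lo) / (hi - lo)

# Chronological 80:20 split; the data is not shuffled, so the test set
# holds the most recent 20% of the values.
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]

print(len(train), len(test))
```

Keeping lo and hi around is useful later, since they are needed to convert scaled predictions back into prices.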
Step 7: LSTM build
With the dataset now uniform and clean, we move on to building and training the actual LSTM model that will perform the predictions. For this, we use the Adam optimizer, chosen over others for its ability to converge towards a minimum of the loss easily and efficiently, and mean squared error as the error function, a standard choice for regression that penalizes large errors heavily and is therefore sensitive to outliers.
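A minimal sketch of building and training such a model with Keras is shown below. The window length, the layer sizes, the number of epochs, and the helper make_windows are all assumptions chosen for illustration, and a sine wave stands in for the scaled training prices.

```python
import numpy as np
import tensorflow as tf


def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    x = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]  # y[i] is the value right after window i
    return x[..., np.newaxis], y


# Placeholder for the scaled training series (values between 0 and 1).
train = ((np.sin(np.linspace(0, 6 * np.pi, 120)) + 1) / 2).astype("float32")

WINDOW = 10  # assumed look-back length
x_train, y_train = make_windows(train, WINDOW)

# One LSTM layer followed by a single-output dense layer (sizes are assumed).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")

# Two epochs keep the example fast; real training would need more.
history = model.fit(x_train, y_train, epochs=2, batch_size=16, verbose=0)
print(history.history["loss"])
```

Each training sample is a window of consecutive values, and its target is the value that immediately follows, which is what lets the model learn one-step-ahead prediction.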
Step 8: Testing a sample
On successful completion of the previous step, our model has been trained on the dataset. A sample can now be tested on the model. For this, we first take the test dataset and extract an instance from it.
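Extracting one instance can be sketched as follows, assuming the model was trained on windows of consecutive scaled values. The window length and the test series below are placeholders.

```python
import numpy as np

WINDOW = 10  # assumed look-back length; it must match the one used in training

# Placeholder for the scaled test series.
test = np.linspace(0.5, 0.9, 25).astype("float32")

# One instance is WINDOW consecutive values, shaped (1, WINDOW, 1) as
# (batch, time steps, features). The value that follows is the expected answer.
start = 0
sample = test[start:start + WINDOW].reshape(1, WINDOW, 1)
expected = test[start + WINDOW]

# With a trained Keras model, the prediction would be: model.predict(sample)
print(sample.shape, expected)
```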
Step 9: Name the model
The model built in the previous steps is saved with the name ‘saved_model.h5’.
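Saving and reloading can be sketched as below. The tiny untrained model is only a stand-in, and the file is written to a temporary directory so the example does not touch the working directory.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Tiny untrained stand-in for the trained model (layer sizes are assumed).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 1)),
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")

# The .h5 extension selects the HDF5 file format.
path = os.path.join(tempfile.mkdtemp(), "saved_model.h5")
model.save(path)

# compile=False loads the architecture and weights only, which is enough
# for prediction.
restored = tf.keras.models.load_model(path, compile=False)

x = np.ones((1, 10, 1), dtype="float32")
print(restored.predict(x, verbose=0))
```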
Step 10: Perform prediction
Now that our model is ready, we test it on our test dataset. Here I have plotted the original values and the predicted stock market values together for the period covered by the 20% test split, which in my case was the data after 2018 that the model was not trained on.
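Before plotting, the scaled predictions have to be converted back into prices. The sketch below shows that inverse scaling together with a simple error measure. The arrays are placeholders standing in for the scaled test targets and the output of model.predict, and inverse_scale is a hypothetical helper.

```python
import numpy as np

# Minimum and maximum price from the normalization step (placeholder values).
lo, hi = 100.0, 120.0

# Placeholders for the scaled test targets and the model's scaled predictions.
actual_scaled = np.array([0.50, 0.60, 0.75, 0.90])
predicted_scaled = np.array([0.55, 0.60, 0.70, 0.90])


def inverse_scale(values, lo, hi):
    """Undo min-max scaling: map values from the 0..1 range back to prices."""
    return values * (hi - lo) + lo


actual = inverse_scale(actual_scaled, lo, hi)
predicted = inverse_scale(predicted_scaled, lo, hi)

# Root mean squared error, expressed in the same units as the price.
rmse = float(np.sqrt(np.mean((predicted - actual) ** 2)))
print(f"RMSE in price units: {rmse:.3f}")
```

The two arrays actual and predicted are what would then be drawn on the same axes against the test dates.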
The result obtained looks like this:
Fig: Predicted value compared with the original values.
From the output, it can be seen that the predicted values and original values (shown as the green and orange lines) almost overlap each other. This indicates that the model built using the above steps tracks the held-out data closely.
We have seen how an LSTM from the Keras library can be used to predict the future standing of a company’s stock market values. Such predictions are made for a wide variety of reasons, including strategy formulation for a company, risk analysis, identifying a target audience, etc.
This is so cool 💯✅
Nice tutorial, however I am confused at step 8. How did you get this test data, and how can I change it to predict the future values for 2020 and 2021 up to now, so as to compare them with the original data?
I really want to know. Please help