Creating the Chrome Dinosaur Game Using Your Body & Computer Vision

Sometimes, when the internet goes out, a dinosaur game appears on Google Chrome's offline page. The game can feel a bit boring, because all you have to do is press the spacebar to make the dinosaur jump. But what if we could control the dinosaur using only our body? That’s what I did using Python, and I’ll show you how in this post!

Recreating the Chrome dinosaur game in Python from scratch

The first thing I did was recreate the dinosaur game using the pygame library for Python. Having our own version of the game makes it much easier to modify, especially when we add the machine learning model and the computer vision features.

Comparison between the original game and its clone made in Python

Collecting the images to train the machine learning algorithm

As we are going to create a Machine Learning (ML from now on) model that classifies images, we need to collect some images to train it. In other words, we need images to teach the model how to classify an image.

To collect the images, I created a Python script that captures an image (technically, a frame) from my webcam every 250 milliseconds. The script also processes each image before saving it in the chosen directory: it converts the image to grayscale and applies an edge-detection filter called the Canny edge detector, available in the OpenCV library.

Result of the image processing

Image processing is important because it reduces the amount of data in each image (the three color channels become a single one), which helps the ML algorithm learn much faster. It also gets rid of unnecessary information that the images may contain.

I used the OpenCV library for both image capture and processing. The script is shown below:

While the images were being collected, processed, and saved in a directory, I jumped in front of the webcam every 2 seconds. After approximately 2 minutes, I stopped the image capture.

Images being saved during the image capture

As you can imagine, two types of images were saved in the directory: those in which I am jumping and those in which I am not. Since the ML model we will use is supervised (which means the algorithm needs previously labeled examples to learn from), we need to separate the images into two directories, one called “jump” and another called “no_jump”.

The “jump” directory receives only the images in which I’m jumping, while the “no_jump” directory receives only the images in which I’m not. The separation is done manually.

Separation of the images into 2 different directories

Preparing the data

Before we train the ML model, let’s apply one final processing step. I chose to reduce the width and height of all images from 640 and 480 pixels to 128 and 96 pixels, respectively, in order to speed up the training of the ML algorithm.

As the images are represented as matrices in OpenCV, our resized images have 128 columns and 96 rows. To pass an image to the ML algorithm, we need to reshape its matrix to 1 row and 12288 columns, where 12288 is the result of 128 * 96. This change is necessary because the ML algorithm expects each image as a single row of values (I used the NumPy module to make this change).

Preparing the data

With everything ready, we just need to choose which ML algorithm to use. As we have only a few images and many columns per image (technically called “attributes” in ML), the Support Vector Machine (SVM) algorithm seemed the most suitable.

To train the ML model, I used the scikit-learn library. All we have to do is pass all the images and their respective labels (hence the importance of separating the images into different directories) to the SVM training method.

The code for data preparation and ML model creation is shown below:
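A minimal sketch of the training step, not the original code: it assumes scikit-learn's `SVC` with a linear kernel (the post does not say which kernel or parameters were used) and uses synthetic arrays in place of the flattened “jump” and “no_jump” images. The helper name `train_svm` is hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def train_svm(X, y):
    """Fit an SVM on flattened images X (n_samples, 12288) and labels y."""
    model = SVC(kernel="linear")  # kernel choice is an assumption
    model.fit(X, y)
    return model


# Synthetic stand-in data; in practice each row would be one flattened image.
rng = np.random.default_rng(0)
no_jump = rng.integers(0, 50, size=(10, 12288))
jump = rng.integers(200, 256, size=(10, 12288))
X = np.vstack([no_jump, jump]) / 255.0  # scale pixel values to [0, 1]
y = np.array([0] * 10 + [1] * 10)       # 0 = no_jump, 1 = jump

model = train_svm(X, y)
print(model.predict(X[:1]), model.predict(X[-1:]))
```

With real data, the labels would come from the directory each image was loaded from: 1 for images in “jump” and 0 for images in “no_jump”.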


Saving the trained model and exporting it inside the game script

Now that we have our ML model, we must save it in the same directory as the game script. To save the model, we import the joblib library and call its dump function. The file that contains the model has the .pkl extension.
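As a small illustration of the save step, together with the matching load call, here is a round trip with `joblib.dump` and `joblib.load`. The tiny two-point model and the file name `model.pkl` are placeholders, and the file is written to a temporary directory only to keep the example self-contained:

```python
import os
import tempfile

import joblib
from sklearn.svm import SVC

# Placeholder model trained on two points; the real one is the image classifier.
model = SVC(kernel="linear").fit([[0.0, 0.0], [1.0, 1.0]], [0, 1])

# Hypothetical file name; the post stores the file next to the game script.
model_path = os.path.join(tempfile.gettempdir(), "model.pkl")
joblib.dump(model, model_path)      # save the trained model to disk
restored = joblib.load(model_path)  # read it back

print(restored.predict([[1.0, 1.0]]))
```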

Loading the trained model into the game

In the game script, we load the trained ML model using the joblib library again, but this time we use the load function.

Adding real-time frame capture to the game script

Still in the game script, since we need to capture images from the webcam in real time during the game, I created a function that captures each frame from the webcam, processes it, and passes it to the trained ML model. If the model decides, based on the frame received, that I’m jumping, it returns 1 as an answer; otherwise, it returns 0.

Modifying the dinosaur’s jumping condition

Instead of pressing the space key on the keyboard to make the dinosaur jump, I used the ML model’s answer: every time the model returns 1, the dinosaur jumps; otherwise, it doesn’t.
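The idea can be illustrated with a simplified stand-in for the dinosaur. The `Dino` class below, including its constants and jump physics, is hypothetical and is not taken from the game code; it only shows the model's answer taking the place of the key press:

```python
class Dino:
    """Minimal stand-in for the game's dinosaur (hypothetical)."""

    GROUND_Y = 0.0
    JUMP_VELOCITY = 12.0
    GRAVITY = 1.0

    def __init__(self):
        self.y = self.GROUND_Y  # height above the ground
        self.velocity = 0.0
        self.jumping = False

    def update(self, prediction):
        # The keyboard check is replaced by the model's answer: 1 means jump.
        if prediction == 1 and not self.jumping:
            self.jumping = True
            self.velocity = self.JUMP_VELOCITY
        if self.jumping:
            self.y += self.velocity
            self.velocity -= self.GRAVITY
            if self.y <= self.GROUND_Y:  # landed
                self.y = self.GROUND_Y
                self.velocity = 0.0
                self.jumping = False


dino = Dino()
dino.update(0)           # model says "no jump": the dinosaur stays on the ground
grounded = dino.jumping
dino.update(1)           # model says "jump": the dinosaur leaves the ground
airborne = dino.jumping
print(grounded, airborne)
```

Ignoring new predictions while the dinosaur is already in the air keeps a long run of 1s, which is what a single real jump produces across several frames, from retriggering the jump.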

The full game code is available in the GitHub repository linked at the end of this post.

And Voilà! Here’s the final result:

[GIF and video link: final result]

Although it seems accurate, the ML model was trained only on images of me, so when someone else plays the game, the model may get confused and return wrong values. Another possible problem is the background: I used a single-color background with no objects in it, so the model will probably be confused if there are things behind the player.

These problems could be overcome by collecting more images, with different people jumping in front of different backgrounds.

The GitHub repository can be accessed through this link:

https://github.com/joaotinti75/Pygame/tree/master/Chrome_Dinosaur_Game

And that’s it. Thank you very much for reading! If you have any questions, send me a connection request on LinkedIn and I’ll be happy to answer ☺

My LinkedIn: https://www.linkedin.com/in/joaotinti75/

My GitHub: https://github.com/joaotinti75
