The classic dino game, except it's controlled by your hand!
This project highlights the intersection of Game Development and Computer Vision technologies, specifically real-time Object Detection.
The game uses your webcam to capture your gestures in real time. An object detection model classifies each gesture, and the result makes the dino jump.
- Pygame (to build the game interface)
- TensorFlow (to train the object detection model)
- OpenCV (to process and label images)
I made a YouTube video showing how the game works. Check it out 👇
- The Object Detection model is trained on two classes: a closed hand and an open palm.
- When you open your palm, this triggers the dino to jump. To jump again, you need to first reset the dino by closing your hand.
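The jump-and-reset rule above can be read as a small two-state machine. The following is only a minimal sketch of that idea, not the code used in the game: the `Gesture` labels and the `JumpTrigger` class are hypothetical names chosen for illustration.

```python
from enum import Enum


class Gesture(Enum):
    # Hypothetical label names; the real model's class names may differ.
    OPEN = "open_palm"
    CLOSED = "closed_hand"


class JumpTrigger:
    """Fires one jump per open palm; a closed hand re-arms the trigger."""

    def __init__(self):
        self.armed = True

    def update(self, gesture):
        """Return True if this detection should make the dino jump."""
        if gesture is Gesture.OPEN and self.armed:
            self.armed = False  # block further jumps until the hand closes
            return True
        if gesture is Gesture.CLOSED:
            self.armed = True
        return False  # repeated open palm, closed hand, or no detection


trigger = JumpTrigger()
events = [Gesture.OPEN, Gesture.OPEN, Gesture.CLOSED, Gesture.OPEN]
print([trigger.update(g) for g in events])  # [True, False, False, True]
```

Holding the palm open therefore produces a single jump, which matches the behaviour described above.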
The project comprised three main processes:
- Image collection and annotation (labelling)
- Object Detection Model Training
- Game Integration and Testing
I think I repeated these processes around 5 times before I eventually found the optimal solution. The most difficult part of this project was finding the right balance between speed and accuracy.
- Initially, training images were scraped from the web; however, these were often of poor quality.
- Labelling these images was an arduous task, although the LabelImg tool made it much easier.
- After determining that the project was feasible, I wrote an image collection script to collect images from my webcam. This was largely facilitated by the OpenCV library.
- Eventually I curated a dataset of high-quality webcam images, which is ultimately what drove the performance of the Object Detection Model.
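For readers who want to build a similar dataset, here is a rough sketch of what a webcam collection script could look like. It is not the script from this repository: the function names, folder layout, label names and image counts are all assumptions. Only `cv2.VideoCapture`, `read`, `release` and `cv2.imwrite` are actual OpenCV calls.

```python
import time
from pathlib import Path


def collect_images(capture, out_dir, label, count, save_frame, delay_s=0.0):
    """Grab up to `count` frames and save them under out_dir/label.

    `capture` needs a read() method returning (ok, frame), as
    cv2.VideoCapture does; `save_frame(path, frame)` writes one image,
    for example cv2.imwrite.
    """
    target = Path(out_dir) / label
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    while len(saved) < count:
        ok, frame = capture.read()
        if not ok:  # camera unavailable or stream ended
            break
        path = target / f"{label}_{len(saved):04d}.jpg"
        save_frame(str(path), frame)
        saved.append(path)
        time.sleep(delay_s)  # gives you time to move your hand between shots
    return saved


def main():
    import cv2  # imported here so the helper above works without OpenCV

    cam = cv2.VideoCapture(0)  # 0 selects the default webcam
    try:
        # Hypothetical label names and counts, chosen for illustration.
        for label in ("open_palm", "closed_hand"):
            collect_images(cam, "images", label, count=20,
                           save_frame=cv2.imwrite, delay_s=1.0)
    finally:
        cam.release()
```

Calling `main()` would capture 20 images per class, one per second. The saved images would still need to be annotated (for example with LabelImg) before training.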
Training the Object Detection Model was undoubtedly the crux of the project. Each training session took around 3-5 hours.
This part of the project was particularly challenging and insightful, as the model's performance depends on so many factors:
- Data quality
- Model architecture
- Number of training epochs
Model architectures were sourced from the TensorFlow 2 Detection Model Zoo, which provides a large range of open-source, pretrained object detection models ready to be fine-tuned.
I tested 2 model architectures for this project:
- CenterNet MobileNetV2 FPN 512x512 (fast, but extremely inaccurate)
- SSD MobileNet V2 FPNLite 320x320 (relatively slow, decent real-time accuracy)
After 5 trials, I settled on the SSD MobileNet architecture. Although it was slower than the CenterNet architecture, the CenterNet architecture was too simple to fulfil the object detection task.
The game interface was built using Python (it's incredible that I was able to use a single programming language for the entire project) and the amazing game library PyGame.
The game consists of two components:
- Camera feed, which is handled by OpenCV.
- Game assets, which are rendered by PyGame.
Integrating the model into the game proved to be rather difficult. The FPS had to be carefully tuned so that:
- The game was smooth and playable.
- The model wasn't overburdened, which would have slowed down the game even further.
Also, it was completely infeasible to load every single frame into the object detection model. Thus, the camera feed input was fed periodically into the Object Detection Model, ensuring that the game didn't become slow.
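One common way to feed frames to a model only periodically is to run detection on every Nth frame and reuse the last result in between. The sketch below illustrates that pattern only. `PeriodicDetector`, the `detect` callable and the interval of 3 are assumptions, not names or values taken from this project.

```python
class PeriodicDetector:
    """Runs an expensive detector on every `interval`-th frame only."""

    def __init__(self, detect, interval=5):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.detect = detect        # any callable: frame -> detection result
        self.interval = interval
        self.frame_count = 0
        self.last_result = None

    def process(self, frame):
        """Return a fresh detection when one is due, else the cached one."""
        if self.frame_count % self.interval == 0:
            self.last_result = self.detect(frame)
        self.frame_count += 1
        return self.last_result


calls = []


def fake_detect(frame):
    # Stand-in for the real model call, which is not shown here.
    calls.append(frame)
    return f"gesture@{frame}"


detector = PeriodicDetector(fake_detect, interval=3)
results = [detector.process(i) for i in range(7)]
print(calls)  # [0, 3, 6]
```

A larger interval keeps the game loop smoother but delays the reaction to a gesture, so the value has to be tuned alongside the FPS.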
I highly encourage you to dive into the code. Here's a quick navigation guide.
The two main folders are `game` and `Tensorflow`:
- `game` contains the files relating to the dino game implementation such as scripts, classes and assets.
- `Tensorflow` contains all the code relating to image annotation and model development.
- `game/assets`: Contains the images used in the game such as the dino sprite, cactus sprite and restart button icon.
- `game/utilities`: Contains various helper functions used in the game.
- `game/Button.py`, `game/Obstacle.py`, `game/Player.py`: PyGame Classes.
- `game/main.py`: Runs the game.
- `TensorFlow/models`: Cloned repository which contains all the code relating to TensorFlow model development, including the Object Detection API.
- `TensorFlow/scripts`: Various scripts used in the model development process.
- `TensorFlow/workspace`: Model development zone, organised into iterations. Each iteration contains a README.md file which explains the contents of the iteration folder.