MediaPipe is a cross-platform (Android, iOS, web) framework for building machine-learning pipelines over audio, video, time-series data, and more. It is used by many Google products and teams, including Nest, Gmail, Lens, Maps, Android Auto, Photos, Google Home, and YouTube.
From wired to wireless, keyboards to touch screens, offline to online, we have come a long way. The way we communicate with computing devices has changed drastically with face recognition, speech recognition, touch screens, and more, all thanks to rapid advances in technology. Today, we use many AI and machine-learning technologies in our daily lives.
Likewise, hand gestures are another way to communicate with computers. Applications span many fields: augmented reality, assistive technology for people with disabilities, PlayStation games, car dashboards, and smart TVs, many of which can already be operated by gestures.
MediaPipe Hands is a high-fidelity hand and finger tracking solution powered by machine learning. It detects 21 landmark points (as shown in the figure) from a hand in a single frame, with the help of multiple models working simultaneously.
MediaPipe Hands consists of two models working together: a Palm Detection Model, which scans the full image and draws a bounding box around the hand, and a Hand Landmark Model, which operates on the cropped region produced by the palm detector and returns high-fidelity 2D hand keypoint coordinates (as shown in the figure above).
- mediapipe 0.8.1
- OpenCV 3.4.2 or Later
- Tensorflow 2.3.0 or Later
- tf-nightly 2.5.0.dev or later
- scikit-learn 0.23.2 or Later
- matplotlib 3.3.2 or Later
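Assuming a pip-based environment, the requirements above can be installed roughly as follows (version pins mirror the minimums listed and may need adjusting for your platform):

```shell
# Install the dependencies from the requirements list above.
pip install "mediapipe>=0.8.1" "opencv-python>=3.4.2" \
            "tensorflow>=2.3.0" "scikit-learn>=0.23.2" \
            "matplotlib>=3.3.2"
```

Note that tf-nightly is a separate package that replaces the stable TensorFlow build; install it instead of `tensorflow` only if you specifically need nightly features.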
Let’s jump into building the model.
The goal is to recognize all 26 letters of the alphabet from hand gestures captured through a web camera.
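One common way to make the 21 landmarks usable for such a classifier is to convert them into a position- and scale-invariant feature vector. The sketch below shows one such scheme; the function name and normalization choices are illustrative, not part of MediaPipe itself:

```python
# A sketch of turning 21 hand landmarks into a feature vector for a
# letter classifier. normalize_landmarks is an illustrative helper,
# not a MediaPipe API.
import numpy as np

def normalize_landmarks(landmarks):
    """landmarks: array-like of shape (21, 2) with (x, y) per keypoint.

    Translates the wrist (landmark 0) to the origin and scales by the
    largest absolute coordinate, so the features are invariant to the
    hand's position and rough size in the frame.
    """
    pts = np.asarray(landmarks, dtype=np.float32)
    pts = pts - pts[0]                 # wrist-relative coordinates
    scale = np.abs(pts).max() or 1.0   # guard against division by zero
    return (pts / scale).flatten()     # 42-dimensional feature vector

# Example with a fake hand of 21 random keypoints.
rng = np.random.default_rng(0)
features = normalize_landmarks(rng.random((21, 2)))
print(features.shape)  # (42,)
```

Vectors like this could then be fed to, for example, a scikit-learn classifier or a small Keras network that predicts one of the 26 letter classes.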