# ASL Recognition App: Image Preprocessing and Feature Extraction
This document explains the image preprocessing and feature extraction techniques used in our ASL Recognition App.
## Image Preprocessing
- Image Loading: We use OpenCV (`cv2`) to load and process images.
- Color Conversion: Images are converted from BGR to RGB color space for compatibility with MediaPipe.
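A minimal sketch of these two steps, assuming OpenCV is installed and using a hypothetical image path:

```python
import cv2

# Load an image from disk; OpenCV reads images in BGR channel order.
# "sign.jpg" is a placeholder path used only for illustration.
image_bgr = cv2.imread("sign.jpg")
if image_bgr is None:
    raise FileNotFoundError("Could not read sign.jpg")

# Convert BGR -> RGB, the channel order MediaPipe expects.
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
```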
## Hand Landmark Detection
We use MediaPipe Hands for detecting hand landmarks in the images:
- MediaPipe Hands: Initialized with `static_image_mode=True`, `max_num_hands=1`, and `min_detection_confidence=0.5`.
- Landmark Extraction: 21 hand landmarks are extracted from each image.
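A sketch of the detection step with those settings, reusing `image_rgb` from the loading example above:

```python
import mediapipe as mp

mp_hands = mp.solutions.hands

# Configure MediaPipe Hands as described above.
with mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=1,
    min_detection_confidence=0.5,
) as hands:
    results = hands.process(image_rgb)

landmarks = None
if results.multi_hand_landmarks:
    hand = results.multi_hand_landmarks[0]
    # 21 landmarks per hand, each with normalized x, y (and z) coordinates;
    # we keep x and y here, matching the 2-D angle features described below.
    landmarks = [(lm.x, lm.y) for lm in hand.landmark]
```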
## Feature Extraction
### Landmark Normalization
- Centering: Landmarks are centered by subtracting the mean position.
- Scaling: Centered landmarks are scaled to unit variance.
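A sketch of the normalization; using the overall standard deviation of the centered coordinates as the scale factor is our assumption about the exact convention:

```python
import numpy as np

def normalize_landmarks(landmarks):
    """Center landmarks at the origin and scale them to unit variance."""
    points = np.asarray(landmarks, dtype=np.float64)  # shape (21, 2)
    centered = points - points.mean(axis=0)           # subtract mean position
    scale = centered.std()                            # overall std (assumed convention)
    return centered / scale if scale > 0 else centered
```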
### Angle Calculation
We calculate angles between all pairs of landmarks:
- Vector Calculation: For each pair of landmarks (i, j), we calculate the vector from i to j.
- Angle Computation: We compute the arccosine of the x and y components of the normalized vector.
- Feature Vector: The angles form a 420-dimensional feature vector (21 choose 2 = 210 pairs, 2 angles per pair).
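A sketch of these three steps, taking the (21, 2) array returned by `normalize_landmarks` above:

```python
import numpy as np
from itertools import combinations

def pairwise_angles(points):
    """Compute two angle features for every pair of landmarks."""
    features = []
    for i, j in combinations(range(len(points)), 2):  # 21 choose 2 = 210 pairs
        v = points[j] - points[i]                     # vector from landmark i to j
        norm = np.linalg.norm(v)
        if norm > 0:
            v = v / norm                              # normalize to unit length
        # Arccosine of the x and y components; the clip guards against
        # floating-point drift just outside [-1, 1].
        features.append(np.arccos(np.clip(v[0], -1.0, 1.0)))
        features.append(np.arccos(np.clip(v[1], -1.0, 1.0)))
    return np.asarray(features)                       # 420-dimensional vector
```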
## Model Input
The preprocessed features (angles) are used as input to our Random Forest model for ASL sign classification.
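A hedged sketch of how the 420-dimensional vectors feed the classifier; the hyperparameters and the stand-in training data below are illustrative, not the app's actual configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: in the real app each row of X_train is the angle vector
# computed from one training image, and y_train holds its sign label.
rng = np.random.default_rng(0)
X_train = rng.random((100, 420))
y_train = rng.choice(list("ABC"), size=100)

clf = RandomForestClassifier(n_estimators=100, random_state=42)  # assumed settings
clf.fit(X_train, y_train)

# Classify one preprocessed image (features = pairwise_angles(...) output).
features = rng.random(420)  # placeholder for a real feature vector
predicted_sign = clf.predict(features.reshape(1, -1))[0]
```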
This preprocessing pipeline ensures that our model receives consistent and informative features, regardless of the hand's position or size in the original image.