# ASL Recognition App: Image Preprocessing and Feature Extraction
This document explains the image preprocessing and feature extraction techniques used in our ASL Recognition App.
## Image Preprocessing
- Image Loading: We use OpenCV (`cv2`) to load and process images.
- Color Conversion: Images are converted from BGR to RGB color space for compatibility with MediaPipe.
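A minimal sketch of these two steps, assuming OpenCV is installed and using a hypothetical image path:

```python
import cv2

# Load an image from disk; OpenCV reads images in BGR channel order.
# "sign.jpg" is a placeholder path used only for illustration.
image_bgr = cv2.imread("sign.jpg")
if image_bgr is None:
    raise FileNotFoundError("Could not read sign.jpg")

# Convert BGR -> RGB, the channel order MediaPipe expects.
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
```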
## Hand Landmark Detection
We use MediaPipe Hands for detecting hand landmarks in the images:
- MediaPipe Hands: Initialized with `static_image_mode=True`, `max_num_hands=1`, and `min_detection_confidence=0.5`.
- Landmark Extraction: 21 hand landmarks are extracted from each image.
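A sketch of the detection step with those settings, reusing `image_rgb` from the loading example above:

```python
import mediapipe as mp

mp_hands = mp.solutions.hands

# Configure MediaPipe Hands as described above.
with mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=1,
    min_detection_confidence=0.5,
) as hands:
    results = hands.process(image_rgb)

landmarks = None
if results.multi_hand_landmarks:
    hand = results.multi_hand_landmarks[0]
    # 21 landmarks per hand, each with normalized x, y (and z) coordinates;
    # we keep x and y here, matching the 2-D angle features described below.
    landmarks = [(lm.x, lm.y) for lm in hand.landmark]
```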
## Feature Extraction
### Landmark Normalization
- Centering: Landmarks are centered by subtracting the mean position.
- Scaling: Centered landmarks are scaled to unit variance.
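A sketch of the normalization; using the overall standard deviation of the centered coordinates as the scale factor is our assumption about the exact convention:

```python
import numpy as np

def normalize_landmarks(landmarks):
    """Center landmarks at the origin and scale them to unit variance."""
    points = np.asarray(landmarks, dtype=np.float64)  # shape (21, 2)
    centered = points - points.mean(axis=0)           # subtract mean position
    scale = centered.std()                            # overall std (assumed convention)
    return centered / scale if scale > 0 else centered
```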
### Angle Calculation
We calculate angles between all pairs of landmarks:
- Vector Calculation: For each pair of landmarks (i, j), we calculate the vector from i to j.
- Angle Computation: We compute the arccosine of the x and y components of the normalized vector.
- Feature Vector: The angles form a 420-dimensional feature vector (21 choose 2 = 210 pairs, 2 angles per pair).
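A sketch of these three steps, taking the (21, 2) array returned by `normalize_landmarks` above:

```python
import numpy as np
from itertools import combinations

def pairwise_angles(points):
    """Compute two angle features for every pair of landmarks."""
    features = []
    for i, j in combinations(range(len(points)), 2):  # 21 choose 2 = 210 pairs
        v = points[j] - points[i]                     # vector from landmark i to j
        norm = np.linalg.norm(v)
        if norm > 0:
            v = v / norm                              # normalize to unit length
        # Arccosine of the x and y components; the clip guards against
        # floating-point drift just outside [-1, 1].
        features.append(np.arccos(np.clip(v[0], -1.0, 1.0)))
        features.append(np.arccos(np.clip(v[1], -1.0, 1.0)))
    return np.asarray(features)                       # 420-dimensional vector
```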
## Model Input
The preprocessed features (angles) are used as input to our Random Forest model for ASL sign classification.
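A hedged sketch of how the 420-dimensional vectors feed the classifier; the hyperparameters and the stand-in training data below are illustrative, not the app's actual configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: in the real app each row of X_train is the angle vector
# computed from one training image, and y_train holds its sign label.
rng = np.random.default_rng(0)
X_train = rng.random((100, 420))
y_train = rng.choice(list("ABC"), size=100)

clf = RandomForestClassifier(n_estimators=100, random_state=42)  # assumed settings
clf.fit(X_train, y_train)

# Classify one preprocessed image (features = pairwise_angles(...) output).
features = rng.random(420)  # placeholder for a real feature vector
predicted_sign = clf.predict(features.reshape(1, -1))[0]
```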
This preprocessing pipeline ensures that our model receives consistent and informative features, regardless of the hand's position or size in the original image.