---
license: apache-2.0
---

# SynthPose (MMPose HRNet48+DarkPose variant)

The SynthPose model was proposed in [OpenCapBench: A Benchmark to Bridge Pose Estimation and Biomechanics](https://arxiv.org/abs/2406.09788) by Yoni Gozlan, Antoine Falisse, Scott Uhlrich, Anthony Gatti, Michael Black, Akshay Chaudhari. 

# Intended use cases

This model uses DarkPose with an HRNet backbone.
SynthPose is a new approach that uses synthetic data to finetune pre-trained 2D human pose models so that they predict an arbitrarily denser set of keypoints for accurate kinematic analysis.
More details are available in [OpenCapBench: A Benchmark to Bridge Pose Estimation and Biomechanics](https://arxiv.org/abs/2406.09788).  
This particular variant was finetuned on a set of keypoints commonly used in motion capture setups, and it includes the COCO keypoints as well.

The model predicts the following 52 markers:

```
[
    'nose',
    'left_eye',
    'right_eye',
    'left_ear',
    'right_ear',
    'left_shoulder',
    'right_shoulder',
    'left_elbow',
    'right_elbow',
    'left_wrist',
    'right_wrist',
    'left_hip',
    'right_hip',
    'left_knee',
    'right_knee',
    'left_ankle',
    'right_ankle',
    'sternum',
    'rshoulder',
    'lshoulder',
    'r_lelbow',
    'l_lelbow',
    'r_melbow',
    'l_melbow',
    'r_lwrist',
    'l_lwrist',
    'r_mwrist',
    'l_mwrist',
    'r_ASIS',
    'l_ASIS',
    'r_PSIS',
    'l_PSIS',
    'r_knee',
    'l_knee',
    'r_mknee',
    'l_mknee',
    'r_ankle',
    'l_ankle',
    'r_mankle',
    'l_mankle',
    'r_5meta',
    'l_5meta',
    'r_toe',
    'l_toe',
    'r_big_toe',
    'l_big_toe',
    'l_calc',
    'r_calc',
    'C7',
    'L2',
    'T11',
    'T6',
]
```
The first 17 keypoints are the COCO keypoints; the remaining 35 are anatomical markers.
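
Since the marker order is fixed, the two groups can be separated by index. A minimal sketch (the names `COCO_IDX`, `ANATOMICAL_IDX`, and `split_markers` are only for this example):

```python
# The 52 predicted markers are ordered with the 17 COCO keypoints first,
# followed by the 35 anatomical markers.
COCO_IDX = list(range(17))
ANATOMICAL_IDX = list(range(17, 52))

def split_markers(keypoints):
    """Split a length-52 sequence of (x, y) markers into COCO and anatomical subsets."""
    coco = [keypoints[i] for i in COCO_IDX]
    anatomical = [keypoints[i] for i in ANATOMICAL_IDX]
    return coco, anatomical
```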

# Usage

## Installation
This implementation is based on [MMPose](https://mmpose.readthedocs.io/en/latest/).
MMPose requires PyTorch; with PyTorch installed, the rest of the stack can be installed with:
```bash
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
```
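
A quick, optional sanity check that the stack imports correctly (not part of the official installation instructions):

```python
# Print the installed versions of the MMPose stack
import mmcv
import mmdet
import mmengine
import mmpose

print(mmengine.__version__, mmcv.__version__, mmdet.__version__, mmpose.__version__)
```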

## Image inference

Here's how to load the model and run inference on an image:

```python
from huggingface_hub import snapshot_download
from mmpose.apis import MMPoseInferencer

# Download the config and checkpoint from the Hugging Face Hub
snapshot_download(repo_id="yonigozlan/synthpose-hrnet-48-mmpose", local_dir="./synthpose-hrnet-48-mmpose")

# Build the inferencer from the downloaded config and weights
inferencer = MMPoseInferencer(
    pose2d='./synthpose-hrnet-48-mmpose/td-hm_hrnet-w48_dark-8xb32-210e_synthpose_inference.py',
    pose2d_weights='./synthpose-hrnet-48-mmpose/hrnet-w48_dark.pth'
)

# Run inference on an image URL; predictions (JSON) and visualizations are saved to disk
url = "http://farm4.staticflickr.com/3300/3416216247_f9c6dfc939_z.jpg"
result_generator = inferencer([url], pred_out_dir='predictions', vis_out_dir='visualizations')
results = next(result_generator)
```

The following visualization will be saved:
<p>
<img src="inference_example.jpg" width=375>
</p>
The keypoints connected by the skeleton are the COCO keypoints, and the pink ones are the anatomical markers.
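
Besides the files written to `predictions` and `visualizations`, the numeric predictions can be read from `results` directly. The sketch below assumes the usual `MMPoseInferencer` output layout, with a `'predictions'` entry holding per-instance `'keypoints'` and `'keypoint_scores'`; the exact nesting can vary between MMPose versions, so inspect `results` if it does not match:

```python
# Predictions for the first (and only) image in the batch:
# a list with one dict per detected person.
instances = results['predictions'][0]
for person in instances:
    keypoints = person['keypoints']        # 52 (x, y) pixel coordinates
    scores = person['keypoint_scores']     # one confidence score per marker
    print(len(keypoints), scores[:5])
```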

## Video inference

To run inference on a video, simply replace the last two lines with:

```python
result_generator = inferencer("football.mp4", pred_out_dir='predictions', vis_out_dir='visualizations')
results = [result for result in result_generator]  # one result per frame
```
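
For kinematic analysis it is often convenient to stack the per-frame keypoints into a single array. The sketch below keeps only the first detected person in each frame and assumes the same output layout as above; adapt it to your own tracking or filtering logic:

```python
import numpy as np

# Stack the first detected person's markers across all frames with a detection.
trajectory = np.array([
    frame['predictions'][0][0]['keypoints']   # first image in batch, first instance
    for frame in results
    if frame['predictions'][0]                # skip frames with no detection
])
print(trajectory.shape)  # expected: (num_frames_with_detection, 52, 2)
```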

## Training

Finetuning a model with SynthPose can be done by adapting the `td-hm_hrnet-w48_dark-8xb32-210e_merge_bedlam_infinity_coco_3DPW_eval_rich-384x288_pretrained.py` config from the following [MMPose fork](https://github.com/yonigozlan/mmpose).  
To create SynthPose annotations for a synthetic dataset (such as BEDLAM), the tools provided in [this repository](https://github.com/yonigozlan/OpenCapBench/tree/main/synthpose) can be used (better documentation to come).
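
As an illustration only, an adapted config could inherit from the SynthPose config and override the training dataset and the initial checkpoint. The file names, dataset paths, and annotation settings below are placeholders, not files shipped with this repository:

```python
# Hypothetical finetuning config sketch (MMEngine-style config inheritance).
_base_ = [
    './td-hm_hrnet-w48_dark-8xb32-210e_merge_bedlam_infinity_coco_3DPW_eval_rich-384x288_pretrained.py'
]

# Point the training dataloader at your own COCO-style dense keypoint annotations
train_dataloader = dict(
    dataset=dict(
        data_root='data/my_dense_keypoint_dataset/',
        ann_file='annotations/train.json',
    )
)

# Start finetuning from the released SynthPose checkpoint instead of training from scratch
load_from = './synthpose-hrnet-48-mmpose/hrnet-w48_dark.pth'
```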