josedolot committed
Commit 66ad178 · 1 Parent(s): 6aa1df0

Update README.md

Files changed (1)
  1. README.md +8 -399
README.md CHANGED
@@ -1,404 +1,13 @@
1
  ---
2
- title: {{HybridNet-Demo}}
3
- emoji: {{emoji}}
4
- colorFrom: {{colorFrom}}
5
- colorTo: {{colorTo}}
6
- sdk: {{sdk}}
7
- sdk_version: {{sdkVersion}}
8
  app_file: app.py
9
  pinned: false
 
10
  ---
11
 
12
- # HybridNets: End2End Perception Network
13
-
14
-
15
- <div align="center">
16
-
17
- ![logo](images/hybridnets.jpg)
18
- **HybridNets Network Architecture.**
19
-
20
- [![Generic badge](https://img.shields.io/badge/License-MIT-<COLOR>.svg?style=for-the-badge)](https://github.com/datvuthanh/HybridNets/blob/main/LICENSE)
21
- [![PyTorch - Version](https://img.shields.io/badge/PYTORCH-1.10+-red?style=for-the-badge&logo=pytorch)](https://pytorch.org/get-started/locally/)
22
- [![Python - Version](https://img.shields.io/badge/PYTHON-3.7+-red?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/downloads/)
23
- <br>
24
- <!-- [![Contributors][contributors-shield]][contributors-url]
25
- [![Forks][forks-shield]][forks-url]
26
- [![Stargazers][stars-shield]][stars-url]
27
- [![Issues][issues-shield]][issues-url] -->
28
-
29
- </div>
30
-
31
- > [**HybridNets: End-to-End Perception Network**](https://arxiv.org/abs/2203.09035)
32
- >
33
- > by Dat Vu, Bao Ngo, [Hung Phan](https://scholar.google.com/citations?user=V3paQH8AAAAJ&hl=vi&oi=ao)<sup> :email:</sup> [*FPT University*](https://uni.fpt.edu.vn/en-US/Default.aspx)
34
- >
35
- > (<sup>:email:</sup>) corresponding author.
36
- >
37
- > *arXiv technical report ([arXiv 2203.09035](https://arxiv.org/abs/2203.09035))*
38
-
39
- [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hybridnets-end-to-end-perception-network-1/traffic-object-detection-on-bdd100k)](https://paperswithcode.com/sota/traffic-object-detection-on-bdd100k?p=hybridnets-end-to-end-perception-network-1)
40
- [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hybridnets-end-to-end-perception-network-1/lane-detection-on-bdd100k)](https://paperswithcode.com/sota/lane-detection-on-bdd100k?p=hybridnets-end-to-end-perception-network-1)
41
-
42
- <!-- TABLE OF CONTENTS -->
43
- <details>
44
- <summary>Table of Contents</summary>
45
- <ol>
46
- <li>
47
- <a href="#about-the-project">About The Project</a>
48
- <ul>
49
- <li><a href="#project-structure">Project Structure</a></li>
50
- </ul>
51
- </li>
52
- <li>
53
- <a href="#getting-started">Getting Started</a>
54
- <ul>
55
- <li><a href="#installation">Installation</a></li>
56
- <li><a href="#demo">Demo</a></li>
57
- </ul>
58
- </li>
59
- <li>
60
- <a href="#usage">Usage</a>
61
- <ul>
62
- <li><a href="#data-preparation">Data Preparation</a></li>
63
- <li><a href="#training">Training</a></li>
64
- </ul>
65
- </li>
66
- <li><a href="#training-tips">Training Tips</a></li>
67
- <li><a href="#results">Results</a></li>
68
- <li><a href="#license">License</a></li>
69
- <li><a href="#acknowledgements">Acknowledgements</a></li>
70
- <li><a href="#citation">Citation</a></li>
71
- </ol>
72
- </details>
73
-
74
-
75
- ## About The Project
76
- <!-- #### <div align=center> **HybridNets** = **real-time** :stopwatch: * **state-of-the-art** :1st_place_medal: * (traffic object detection + drivable area segmentation + lane line detection) :motorway: </div> -->
77
- HybridNets is an end-to-end perception network for multiple tasks. Our work focuses on traffic object detection, drivable area segmentation and lane detection. HybridNets runs in real time on embedded systems and achieves state-of-the-art object detection and lane detection results on the BDD100K dataset.
78
- ![intro](images/intro.jpg)
79
-
80
- ### Project Structure
81
- ```bash
82
- HybridNets
83
- │ backbone.py # Model configuration
84
- │ hubconf.py # Pytorch Hub entrypoint
85
- │ hybridnets_test.py # Image inference
86
- │ hybridnets_test_videos.py # Video inference
87
- │ train.py # Train script
88
- │ val.py # Validate script
89
-
90
- ├───encoders # https://github.com/qubvel/segmentation_models.pytorch/tree/master/segmentation_models_pytorch/encoders
91
- │ ...
92
-
93
- ├───hybridnets
94
- │ autoanchor.py # Generate new anchors by k-means
95
- │ dataset.py # BDD100K dataset
96
- │ loss.py # Focal, tversky (dice)
97
- │ model.py # Model blocks
98
-
99
- ├───projects
100
- │ bdd100k.yml # Project configuration
101
-
102
- └───utils
103
- │ plot.py # Draw bounding box
104
- │ smp_metrics.py # https://github.com/qubvel/segmentation_models.pytorch/blob/master/segmentation_models_pytorch/metrics/functional.py
105
- │ utils.py # Various helper functions (preprocess, postprocess, eval...)
106
-
107
- └───sync_batchnorm # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch/tree/master/sync_batchnorm
108
- ...
109
- ```
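Since `hubconf.py` is listed above as a PyTorch Hub entrypoint, the model can presumably also be loaded without running the repo scripts directly. A minimal sketch, assuming the Hub entrypoint is named `hybridnets` and accepts a `pretrained` flag (neither is confirmed by this README):

```python
import torch

# Hypothetical hubconf.py usage; entrypoint name and arguments are assumptions.
model = torch.hub.load('datvuthanh/HybridNets', 'hybridnets', pretrained=True)
model.eval()

# Dummy RGB batch at the 640x384 resolution used in the project configuration.
img = torch.randn(1, 3, 384, 640)
with torch.no_grad():
    outputs = model(img)  # detection + segmentation outputs; exact structure depends on the model's forward()
```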
110
-
111
- ## Getting Started [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Uc1ZPoPeh-lAhPQ1CloiVUsOIRAVOGWA?usp=sharing)
112
- ### Installation
113
- The project was developed with [**Python>=3.7**](https://www.python.org/downloads/) and [**Pytorch>=1.10**](https://pytorch.org/get-started/locally/).
114
- ```bash
115
- git clone https://github.com/datvuthanh/HybridNets
116
- cd HybridNets
117
- pip install -r requirements.txt
118
- ```
119
-
120
- ### Demo
121
- ```bash
122
- # Download end-to-end weights
123
- mkdir weights
124
- curl -L -o weights/hybridnets.pth https://github.com/datvuthanh/HybridNets/releases/download/v1.0/hybridnets.pth
125
-
126
- # Image inference
127
- python hybridnets_test.py -w weights/hybridnets.pth --source demo/image --output demo_result --imshow False --imwrite True
128
-
129
- # Video inference
130
- python hybridnets_test_videos.py -w weights/hybridnets.pth --source demo/video --output demo_result
131
-
132
- # Result is saved in a new folder called demo_result
133
- ```
134
-
135
- ## Usage
136
- ### Data Preparation
137
- Recommended dataset structure:
138
- ```bash
139
- HybridNets
140
- └───datasets
141
- ├───imgs
142
- │ ├───train
143
- │ └───val
144
- ├───det_annot
145
- │ ├───train
146
- │ └───val
147
- ├───da_seg_annot
148
- │ ├───train
149
- │ └───val
150
- └───ll_seg_annot
151
- ├───train
152
- └───val
153
- ```
154
- Update your dataset paths in `projects/your_project_name.yml`.
155
-
156
- For BDD100K: [imgs](https://bdd-data.berkeley.edu/), [det_annot](https://drive.google.com/file/d/19CEnZzgLXNNYh1wCvUlNi8UfiBkxVRH0/view), [da_seg_annot](https://drive.google.com/file/d/1NZM-xqJJYZ3bADgLCdrFOa5Vlen3JlkZ/view), [ll_seg_annot](https://drive.google.com/file/d/1o-XpIvHJq0TVUrwlwiMGzwP1CtFsfQ6t/view)
157
-
158
- ### Training
159
- #### 1) Edit or create a new project configuration, using bdd100k.yml as a template
160
- ```yaml
161
- # mean and std of dataset in RGB order
162
- mean: [0.485, 0.456, 0.406]
163
- std: [0.229, 0.224, 0.225]
164
-
165
- # bdd100k anchors
166
- anchors_scales: '[2**0, 2**0.70, 2**1.32]'
167
- anchors_ratios: '[(0.62, 1.58), (1.0, 1.0), (1.58, 0.62)]'
168
-
169
- # must match your dataset's category_id.
170
- # category_id is one-indexed,
171
- # for example, index of 'car' here is 0, while category_id is 1
172
- obj_list: ['car']
173
-
174
- seg_list: ['road',
175
- 'lane']
176
-
177
- dataset:
178
- color_rgb: false
179
- dataroot: path/to/imgs
180
- labelroot: path/to/det_annot
181
- laneroot: path/to/ll_seg_annot
182
- maskroot: path/to/da_seg_annot
183
- ...
184
- ```
185
-
186
- #### 2) Train
187
- ```bash
188
- python train.py -p bdd100k # your_project_name
189
- -c 3 # coefficient of effnet backbone, result from paper is 3
190
- -n 4 # num_workers
191
- -b 8 # batch_size per gpu
192
- -w path/to/weight # use 'last' to resume training from previous session
193
- --freeze_det # freeze detection head, others: --freeze_backbone, --freeze_seg
194
- --lr 1e-5 # learning rate
195
- --optim adamw # adamw | sgd
196
- --num_epochs 200
197
- ```
198
- Please run `python train.py --help` to see all available arguments.
199
-
200
- #### 3) Evaluate
201
- ```bash
202
- python val.py -p bdd100k -c 3 -w checkpoints/weight.pth
203
- ```
204
-
205
- ## Training Tips
206
- ### Anchors :anchor:
207
- If your dataset differs substantially from COCO or BDD100K, or the detection metrics after training are not as high as expected, try enabling autoanchor in `project.yml`:
208
- ```yaml
209
- ...
210
- model:
211
- image_size:
212
- - 640
213
- - 384
214
- need_autoanchor: true # set to true to run autoanchor
215
- pin_memory: false
216
- ...
217
- ```
218
- This automatically finds the best combination of anchor scales and anchor ratios for your dataset. You can then manually edit them in `project.yml` and disable autoanchor.
219
-
220
- If you're feeling lucky, maybe mess around with base_anchor_scale in `backbone.py`:
221
- ```python
222
- class HybridNetsBackbone(nn.Module):
223
- ...
224
- self.pyramid_levels = [5, 5, 5, 5, 5, 5, 5, 5, 6]
225
- self.anchor_scale = [1.25,1.25,1.25,1.25,1.25,1.25,1.25,1.25,1.25,]
226
- self.aspect_ratios = kwargs.get('ratios', [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)])
227
- ...
228
- ```
229
- and `model.py`:
230
- ```python
231
- class Anchors(nn.Module):
232
- ...
233
- for scale, ratio in itertools.product(self.scales, self.ratios):
234
- base_anchor_size = self.anchor_scale * stride * scale
235
- anchor_size_x_2 = base_anchor_size * ratio[0] / 2.0
236
- anchor_size_y_2 = base_anchor_size * ratio[1] / 2.0
237
- ...
238
- ```
239
- to get a grasp on how anchor boxes work.
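As a quick worked example with illustrative numbers (a feature-map stride of 32, `anchor_scale = 1.25` as in `backbone.py`, `scale = 2**0`, `ratio = (1.4, 0.7)`; these particular values are chosen only for illustration), the formula above yields a 56x28 px anchor:

```python
# Worked example of the anchor sizing shown above, with illustrative values.
anchor_scale, stride, scale = 1.25, 32, 2 ** 0
ratio = (1.4, 0.7)

base_anchor_size = anchor_scale * stride * scale      # 40.0
anchor_size_x_2 = base_anchor_size * ratio[0] / 2.0   # 28.0 -> half-width
anchor_size_y_2 = base_anchor_size * ratio[1] / 2.0   # 14.0 -> half-height

print(2 * anchor_size_x_2, 2 * anchor_size_y_2)       # 56.0 28.0: a 56x28 px anchor box
```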
240
-
241
- And because a picture is worth a thousand words, you can visualize your anchor boxes in [Anchor Computation Tool](https://github.com/Cli98/anchor_computation_tool).
242
- ### Training stages
243
- We experimented with training stages and found that these settings achieved the best results:
244
-
245
- 1. `--freeze_seg True` ~ 100 epochs
246
- 2. `--freeze_backbone True --freeze_det True` ~ 50 epochs
247
- 3. Train end-to-end ~ 50 epochs
248
-
249
- The reasoning is that the detection head is harder to converge early on, so we essentially skip the segmentation head at first to focus on detection.
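A sketch of this schedule using the flags documented in the Training section (weight paths and exact epoch counts are illustrative, not prescribed):

```bash
# Stage 1: train with the segmentation head frozen (~100 epochs)
python train.py -p bdd100k -c 3 -b 8 --freeze_seg True --num_epochs 100

# Stage 2: freeze backbone and detection head, train segmentation, resuming from stage 1 (~50 epochs)
python train.py -p bdd100k -c 3 -b 8 -w last --freeze_backbone True --freeze_det True --num_epochs 50

# Stage 3: fine-tune everything end-to-end (~50 epochs)
python train.py -p bdd100k -c 3 -b 8 -w last --num_epochs 50
```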
250
-
251
- ## Results
252
- ### Traffic Object Detection
253
-
254
- <table>
255
- <tr><th>Result </th><th>Visualization</th></tr>
256
- <tr><td>
257
-
258
- | Model | Recall (%) | mAP@0.5 (%) |
259
- |:------------------:|:------------:|:---------------:|
260
- | `MultiNet` | 81.3 | 60.2 |
261
- | `DLT-Net` | 89.4 | 68.4 |
262
- | `Faster R-CNN` | 77.2 | 55.6 |
263
- | `YOLOv5s` | 86.8 | 77.2 |
264
- | `YOLOP` | 89.2 | 76.5 |
265
- | **`HybridNets`** | **92.8** | **77.3** |
266
-
267
- </td><td>
268
-
269
- <img src="images/det1.jpg" width="50%" /><img src="images/det2.jpg" width="50%" />
270
-
271
- </td></tr> </table>
272
-
273
- <!--
274
- | Model | Recall (%) | mAP@0.5 (%) |
275
- |:------------------:|:------------:|:---------------:|
276
- | `MultiNet` | 81.3 | 60.2 |
277
- | `DLT-Net` | 89.4 | 68.4 |
278
- | `Faster R-CNN` | 77.2 | 55.6 |
279
- | `YOLOv5s` | 86.8 | 77.2 |
280
- | `YOLOP` | 89.2 | 76.5 |
281
- | **`HybridNets`** | **92.8** | **77.3** |
282
-
283
- <p align="middle">
284
- <img src="images/det1.jpg" width="49%" />
285
- <img src="images/det2.jpg" width="49%" />
286
- </p>
287
-
288
- -->
289
-
290
- ### Drivable Area Segmentation
291
-
292
- <table>
293
- <tr><th>Result </th><th>Visualization</th></tr>
294
- <tr><td>
295
-
296
- | Model | Drivable mIoU (%) |
297
- |:----------------:|:-----------------:|
298
- | `MultiNet` | 71.6 |
299
- | `DLT-Net` | 71.3 |
300
- | `PSPNet` | 89.6 |
301
- | `YOLOP` | 91.5 |
302
- | **`HybridNets`** | **90.5** |
303
-
304
- </td><td>
305
-
306
- <img src="images/road1.jpg" width="50%" /><img src="images/road2.jpg" width="50%" />
307
-
308
- </td></tr> </table>
309
-
310
- <!--
311
- | Model | Drivable mIoU (%) |
312
- |:----------------:|:-----------------:|
313
- | `MultiNet` | 71.6 |
314
- | `DLT-Net` | 71.3 |
315
- | `PSPNet` | 89.6 |
316
- | `YOLOP` | 91.5 |
317
- | **`HybridNets`** | **90.5** |
318
- <p align="middle">
319
- <img src="images/road1.jpg" width="49%" />
320
- <img src="images/road2.jpg" width="49%" />
321
- </p>
322
- -->
323
-
324
- ### Lane Line Detection
325
-
326
- <table>
327
- <tr><th>Result </th><th>Visualization</th></tr>
328
- <tr><td>
329
-
330
- | Model | Accuracy (%) | Lane Line IoU (%) |
331
- |:----------------:|:------------:|:-----------------:|
332
- | `Enet` | 34.12 | 14.64 |
333
- | `SCNN` | 35.79 | 15.84 |
334
- | `Enet-SAD` | 36.56 | 16.02 |
335
- | `YOLOP` | 70.5 | 26.2 |
336
- | **`HybridNets`** | **85.4** | **31.6** |
337
-
338
- </td><td>
339
-
340
- <img src="images/lane1.jpg" width="50%" /><img src="images/lane2.jpg" width="50%" />
341
-
342
- </td></tr> </table>
343
-
344
- <!--
345
- | Model | Accuracy (%) | Lane Line IoU (%) |
346
- |:----------------:|:------------:|:-----------------:|
347
- | `Enet` | 34.12 | 14.64 |
348
- | `SCNN` | 35.79 | 15.84 |
349
- | `Enet-SAD` | 36.56 | 16.02 |
350
- | `YOLOP` | 70.5 | 26.2 |
351
- | **`HybridNets`** | **85.4** | **31.6** |
352
-
353
- <p align="middle">
354
- <img src="images/lane1.jpg" width="49%" />
355
- <img src="images/lane2.jpg" width="49%" />
356
- </p>
357
- -->
358
- <div align="center">
359
-
360
- ![](images/full_video.gif)
361
-
362
- [Original footage](https://www.youtube.com/watch?v=lx4yA1LEi9c) courtesy of [Hanoi Life](https://www.youtube.com/channel/UChT1Cpf_URepCpsdIqjsDHQ)
363
-
364
- </div>
365
-
366
- ## License
367
-
368
- Distributed under the MIT License. See `LICENSE` for more information.
369
-
370
- ## Acknowledgements
371
-
372
- Our work would not be complete without the wonderful work of the following authors:
373
-
374
- * [EfficientDet](https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch)
375
- * [YOLOv5](https://github.com/ultralytics/yolov5)
376
- * [YOLOP](https://github.com/hustvl/YOLOP)
377
- * [KMeans Anchors Ratios](https://github.com/mnslarcher/kmeans-anchors-ratios)
378
- * [Anchor Computation Tool](https://github.com/Cli98/anchor_computation_tool)
379
-
380
- ## Citation
381
-
382
- If you find our paper and code useful for your research, please consider giving a star :star: and citation :pencil: :
383
-
384
- ```BibTeX
385
- @misc{vu2022hybridnets,
386
- title={HybridNets: End-to-End Perception Network},
387
- author={Dat Vu and Bao Ngo and Hung Phan},
388
- year={2022},
389
- eprint={2203.09035},
390
- archivePrefix={arXiv},
391
- primaryClass={cs.CV}
392
- }
393
- ```
394
-
395
- <!-- MARKDOWN LINKS & IMAGES -->
396
- <!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
397
- [contributors-shield]: https://img.shields.io/github/contributors/othneildrew/Best-README-Template.svg?style=for-the-badge
398
- [contributors-url]: https://github.com/datvuthanh/HybridNets/graphs/contributors
399
- [forks-shield]: https://img.shields.io/github/forks/othneildrew/Best-README-Template.svg?style=for-the-badge
400
- [forks-url]: https://github.com/datvuthanh/HybridNets/network/members
401
- [stars-shield]: https://img.shields.io/github/stars/othneildrew/Best-README-Template.svg?style=for-the-badge
402
- [stars-url]: https://github.com/datvuthanh/HybridNets/stargazers
403
- [issues-shield]: https://img.shields.io/github/issues/othneildrew/Best-README-Template.svg?style=for-the-badge
404
- [issues-url]: https://github.com/datvuthanh/HybridNets/issues
 
1
  ---
2
+ title: HybridNet_Demo
3
+ emoji: 💩
4
+ colorFrom: yellow
5
+ colorTo: red
6
+ sdk: gradio
7
+ sdk_version: 2.8.14
8
  app_file: app.py
9
  pinned: false
10
+ license: mit
11
  ---
12
 
13
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference