# Tutorial 7: Finetuning Models

Detectors pre-trained on the COCO dataset can serve as good pre-trained models for other datasets, e.g., the Cityscapes and KITTI datasets.
This tutorial shows how to use the models provided in the [Model Zoo](../model_zoo.md) on other datasets to obtain better performance.

There are two steps to finetune a model on a new dataset.

- Add support for the new dataset following [Tutorial 2: Customize Datasets](customize_dataset.md).
- Modify the configs as discussed in this tutorial.

Taking finetuning on the Cityscapes dataset as an example, users need to modify five parts of the config.

## Inherit base configs

To reduce the burden of writing whole configs and the bugs that come with it, MMDetection V2.0 supports inheriting from multiple existing configs. To finetune a Mask R-CNN model, the new config needs to inherit `_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. To use the Cityscapes dataset, the new config can simply inherit `_base_/datasets/cityscapes_instance.py`. For runtime settings such as training schedules, the new config needs to inherit `_base_/default_runtime.py`. These configs are in the `configs` directory; users can also write out the whole contents rather than use inheritance.

```python
_base_ = [
    '../_base_/models/mask_rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
]
```

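The idea behind inheritance can be illustrated with a small recursive dictionary merge. This is only a sketch of the concept, not the config loader's actual implementation: `merge_config`, `base_model`, and `child_model` are hypothetical names, and the base config is trimmed down to two fields for illustration.

```python
import copy


def merge_config(base, override):
    """Return a copy of `base` with `override` merged in recursively.

    Hypothetical helper for illustration only; the real config loader
    may behave differently in edge cases.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the child value replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Heavily trimmed stand-in for a base model config (assumed values).
base_model = dict(
    roi_head=dict(
        bbox_head=dict(type='Shared2FCBBoxHead', num_classes=80),
        mask_head=dict(type='FCNMaskHead', num_classes=80)))

# The child config lists only the fields that differ from the base.
child_model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=8),
        mask_head=dict(num_classes=8)))

merged_model = merge_config(base_model, child_model)
```

Under this sketch, the merged config keeps every inherited field (such as `type`) and takes only `num_classes` from the child, which is why a child config can stay short.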
## Modify head

Then the new config needs to modify the head according to the number of classes in the new dataset. By changing only `num_classes` in the `roi_head`, the weights of the pre-trained model are mostly reused, except for the final prediction head.

```python
model = dict(
    pretrained=None,
    roi_head=dict(
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=8,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
        mask_head=dict(
            type='FCNMaskHead',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            num_classes=8,
            loss_mask=dict(
                type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
```

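To see why only the final prediction layers lose their pre-trained weights, it can help to compare parameter shapes between the checkpoint and the new model. The sketch below assumes that parameters whose shapes do not match are skipped and keep their fresh initialization. The parameter names and shapes are illustrative assumptions, not values read from a real checkpoint, and `split_reusable` is a hypothetical helper.

```python
def split_reusable(checkpoint_shapes, model_shapes):
    """Split model parameters into those a checkpoint can fill and the rest.

    A parameter counts as reusable only if the checkpoint holds an entry
    with the same name and exactly the same shape.
    """
    reused, skipped = [], []
    for name, shape in model_shapes.items():
        if checkpoint_shapes.get(name) == shape:
            reused.append(name)
        else:
            skipped.append(name)
    return reused, skipped


# Illustrative shapes for a detector trained on 80 classes (assumed).
coco_shapes = {
    'backbone.conv1.weight': (64, 3, 7, 7),
    'roi_head.bbox_head.fc_cls.weight': (81, 1024),
    'roi_head.bbox_head.fc_reg.weight': (320, 1024),
    'roi_head.mask_head.conv_logits.weight': (80, 256, 1, 1),
}

# The same layers after setting num_classes=8 (assumed).
cityscapes_shapes = {
    'backbone.conv1.weight': (64, 3, 7, 7),
    'roi_head.bbox_head.fc_cls.weight': (9, 1024),
    'roi_head.bbox_head.fc_reg.weight': (32, 1024),
    'roi_head.mask_head.conv_logits.weight': (8, 256, 1, 1),
}

reused, skipped = split_reusable(coco_shapes, cityscapes_shapes)
```

In this toy setup the backbone layer is reused, while the class-dependent classification, regression, and mask layers are skipped and must be learned during finetuning.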
## Modify dataset

Users may also need to prepare the dataset and write the dataset configs. MMDetection V2.0 already supports the VOC, WIDER FACE, COCO and Cityscapes datasets.

## Modify training schedule

The finetuning hyperparameters differ from the default schedule. Finetuning usually requires a smaller learning rate and fewer training epochs.

```python
# optimizer
# lr is set for a batch size of 8
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    # [7] yields higher performance than [6]
    step=[7])
total_epochs = 8  # actual epoch = 8 * 8 = 64
log_config = dict(interval=100)
```

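The schedule above can be sanity-checked with a few lines of arithmetic. The sketch below makes three assumptions that this tutorial does not state: the learning rate scales linearly with batch size, the step policy multiplies the rate by `gamma=0.1` at each milestone, and linear warmup ramps from `warmup_ratio * lr` up to `lr`. The helpers `scaled_lr` and `lr_at` are hypothetical and not part of any library.

```python
def scaled_lr(base_lr, base_batch_size, batch_size):
    """Scale the learning rate in proportion to the batch size (assumed rule)."""
    return base_lr * batch_size / base_batch_size


def lr_at(epoch, cur_iter, base_lr=0.01, steps=(7,), gamma=0.1,
          warmup_iters=500, warmup_ratio=0.001):
    """Approximate learning rate at a 0-based epoch and global iteration.

    `gamma` and the warmup formula are assumptions made for illustration.
    """
    # Step decay: one factor of gamma for every milestone already reached.
    regular = base_lr * gamma ** sum(epoch >= s for s in steps)
    if cur_iter < warmup_iters:
        # Linear warmup from warmup_ratio * lr up to the regular lr.
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        return regular * (1 - k)
    return regular


lr_for_batch_16 = scaled_lr(0.01, base_batch_size=8, batch_size=16)
lr_first_iter = lr_at(epoch=0, cur_iter=0)
lr_after_warmup = lr_at(epoch=0, cur_iter=500)
lr_last_epoch = lr_at(epoch=7, cur_iter=10000)
```

Under these assumptions, training starts at roughly `1e-5`, reaches `0.01` after 500 iterations, and drops to about `0.001` for the final epoch.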
## Use pre-trained model

To use the pre-trained model, the new config adds the link to the pre-trained model in `load_from`. Users may want to download the model weights before training to avoid download time during training.

```python
load_from = 'http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'  # noqa
```

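One way to download the weights ahead of time is a small standard-library helper that fetches the file once and returns a local path. This is a sketch under assumptions: `local_checkpoint_path` and `fetch_checkpoint` are hypothetical helpers, and the `checkpoints` directory name is an arbitrary choice.

```python
import os
import urllib.request


def local_checkpoint_path(url, cache_dir='checkpoints'):
    """Map a checkpoint URL to a file path inside `cache_dir`."""
    return os.path.join(cache_dir, os.path.basename(url))


def fetch_checkpoint(url, cache_dir='checkpoints'):
    """Download `url` into `cache_dir` unless the file already exists."""
    path = local_checkpoint_path(url, cache_dir)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        urllib.request.urlretrieve(url, path)
    return path
```

Calling `fetch_checkpoint` with the URL above before training returns a local file path, which could then be assigned to `load_from` in place of the URL.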