# Useful Tools
## Visualization Tools
### Dataset Visualization Tool
MMOCR provides a dataset visualization tool `tools/visualizations/browse_datasets.py` to help users troubleshoot possible dataset-related problems. You just need to specify the path to the training config (usually stored in `configs/textdet/dbnet/xxx.py`) or the dataset config (usually stored in `configs/textdet/_base_/datasets/xxx.py`), and the tool will automatically plots the transformed (or original) images and labels.
#### Usage
```bash
python tools/visualizations/browse_dataset.py \
${CONFIG_FILE} \
[-o, --output-dir ${OUTPUT_DIR}] \
[-p, --phase ${DATASET_PHASE}] \
[-m, --mode ${DISPLAY_MODE}] \
[-t, --task ${DATASET_TASK}] \
[-n, --show-number ${NUMBER_IMAGES_DISPLAY}] \
[-i, --show-interval ${SHOW_INTERRVAL}] \
[--cfg-options ${CFG_OPTIONS}]
```
| ARGS | Type | Description |
| ------------------- | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| config | str | (required) Path to the config. |
| -o, --output-dir | str | If GUI is not available, specifying an output path to save the visualization results. |
| -p, --phase | str | Phase of dataset to visualize. Use "train", "test" or "val" if you just want to visualize the default split. It's also possible to be a dataset variable name, which might be useful when a dataset split has multiple variants in the config. |
| -m, --mode | `original`, `transformed`, `pipeline` | Display mode: display original pictures or transformed pictures or comparison pictures.`original` only visualizes the original dataset & annotations; `transformed` shows the resulting images processed through all the transforms; `pipeline` shows all the intermediate images. Defaults to "transformed". |
| -t, --task | `auto`, `textdet`, `textrecog` | Specify the task type of the dataset. If `auto`, the task type will be inferred from the config. If the script is unable to infer the task type, you need to specify it manually. Defaults to `auto`. |
| -n, --show-number | int | The number of samples to visualized. If not specified, display all images in the dataset. |
| -i, --show-interval | float | Interval of visualization (s), defaults to 2. |
| --cfg-options | float | Override configs.[Example](./config.md#command-line-modification) |
#### Examples
The following example demonstrates how to use the tool to visualize the training data used by the "DBNet_R50_icdar2015" model.
```Bash
# Example: Visualizing the training data used by dbnet_r50dcn_v2_fpnc_1200e_icadr2015 model
python tools/visualizations/browse_dataset.py configs/textdet/dbnet/dbnet_resnet50-dcnv2_fpnc_1200e_icdar2015.py
```
By default, the visualization mode is "transformed", and you will see the images & annotations being transformed by the pipeline:
If you just want to visualize the original dataset, simply set the mode to "original":
```Bash
python tools/visualizations/browse_dataset.py configs/textdet/dbnet/dbnet_resnet50-dcnv2_fpnc_1200e_icdar2015.py -m original
```
Or, to visualize the entire pipeline:
```Bash
python tools/visualizations/browse_dataset.py configs/textdet/dbnet/dbnet_resnet50-dcnv2_fpnc_1200e_icdar2015.py -m pipeline
```
In addition, users can also visualize the original images and their corresponding labels of the dataset by specifying the path to the dataset config file, for example:
```Bash
python tools/visualizations/browse_dataset.py configs/textrecog/_base_/datasets/icdar2015.py
```
Some datasets might have multiple variants. For example, the test split of `icdar2015` textrecog dataset has two variants, which the [base dataset config](/configs/textrecog/_base_/datasets/icdar2015.py) defines as follows:
```python
icdar2015_textrecog_test = dict(
ann_file='textrecog_test.json',
# ...
)
icdar2015_1811_textrecog_test = dict(
ann_file='textrecog_test_1811.json',
# ...
)
```
In this case, you can specify the variant name to visualize the corresponding dataset:
```Bash
python tools/visualizations/browse_dataset.py configs/textrecog/_base_/datasets/icdar2015.py -p icdar2015_1811_textrecog_test
```
Based on this tool, users can easily verify if the annotation of a custom dataset is correct.
### Hyper-parameter Scheduler Visualization
This tool aims to help the user to check the hyper-parameter scheduler of the optimizer (without training), which support the "learning rate" or "momentum"
#### Introduce the scheduler visualization tool
```bash
python tools/visualizations/vis_scheduler.py \
${CONFIG_FILE} \
[-p, --parameter ${PARAMETER_NAME}] \
[-d, --dataset-size ${DATASET_SIZE}] \
[-n, --ngpus ${NUM_GPUs}] \
[-s, --save-path ${SAVE_PATH}] \
[--title ${TITLE}] \
[--style ${STYLE}] \
[--window-size ${WINDOW_SIZE}] \
[--cfg-options]
```
**Description of all arguments**:
- `config`: The path of a model config file.
- **`-p, --parameter`**: The param to visualize its change curve, choose from "lr" and "momentum". Default to use "lr".
- **`-d, --dataset-size`**: The size of the datasets. If set,`build_dataset` will be skipped and `${DATASET_SIZE}` will be used as the size. Default to use the function `build_dataset`.
- **`-n, --ngpus`**: The number of GPUs used in training, default to be 1.
- **`-s, --save-path`**: The learning rate curve plot save path, default not to save.
- `--title`: Title of figure. If not set, default to be config file name.
- `--style`: Style of plt. If not set, default to be `whitegrid`.
- `--window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
- `--cfg-options`: Modifications to the configuration file, refer to [Learn about Configs](../user_guides/config.md).
```{note}
Loading annotations maybe consume much time, you can directly specify the size of the dataset with `-d, dataset-size` to save time.
```
#### How to plot the learning rate curve without training
You can use the following command to plot the step learning rate schedule used in the config `configs/textdet/dbnet/dbnet_resnet50-dcnv2_fpnc_1200e_icdar2015.py`:
```bash
python tools/visualizations/vis_scheduler.py configs/textdet/dbnet/dbnet_resnet50-dcnv2_fpnc_1200e_icdar2015.py -d 100
```
## Analysis Tools
### Offline Evaluation Tool
For saved prediction results, we provide an offline evaluation script `tools/analysis_tools/offline_eval.py`. The following example demonstrates how to use this tool to evaluate the output of the "PSENet" model offline.
```Bash
# When running the test script for the first time, you can save the output of the model by specifying the --save-preds parameter
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --save-preds
# Example: Testing on PSENet
python tools/test.py configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py epoch_600.pth --save-preds
# Then, using the saved outputs for offline evaluation
python tools/analysis_tool/offline_eval.py ${CONFIG_FILE} ${PRED_FILE}
# Example: Offline evaluation of saved PSENet results
python tools/analysis_tools/offline_eval.py configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py work_dirs/psenet_r50_fpnf_600e_icdar2015/epoch_600.pth_predictions.pkl
```
`-save-preds` saves the output to `work_dir/CONFIG_NAME/MODEL_NAME_predictions.pkl` by default
In addition, based on this tool, users can also convert predictions obtained from other libraries into MMOCR-supported formats, then use MMOCR's built-in metrics to evaluate them.
| ARGS | Type | Description |
| ------------- | ----- | ----------------------------------------------------------------- |
| config | str | (required) Path to the config. |
| pkl_results | str | (required) The saved predictions. |
| --cfg-options | float | Override configs.[Example](./config.md#command-line-modification) |
### Calculate FLOPs and the Number of Parameters
We provide a method to calculate the FLOPs and the number of parameters, first we install the dependencies using the following command.
```shell
pip install fvcore
```
The usage of the script to calculate FLOPs and the number of parameters is as follows.
```shell
python tools/analysis_tools/get_flops.py ${config} --shape ${IMAGE_SHAPE}
```
| ARGS | Type | Description |
| ------- | ---- | ----------------------------------------------------------------------------------------- |
| config | str | (required) Path to the config. |
| --shape | int | Image size to use when calculating FLOPs, such as `--shape 320 320`. Default is `640 640` |
For example, you can run the following command to get FLOPs and the number of parameters of `dbnet_resnet18_fpnc_100k_synthtext.py`:
```shell
python tools/analysis_tools/get_flops.py configs/textdet/dbnet/dbnet_resnet18_fpnc_100k_synthtext.py --shape 1024 1024
```
The output is as follows:
```shell
input shape is (1, 3, 1024, 1024)
| module | #parameters or shape | #flops |
| :------------------------ | :------------------- | :------ |
| model | 12.341M | 63.955G |
| backbone | 11.177M | 38.159G |
| backbone.conv1 | 9.408K | 2.466G |
| backbone.conv1.weight | (64, 3, 7, 7) | |
| backbone.bn1 | 0.128K | 83.886M |
| backbone.bn1.weight | (64,) | |
| backbone.bn1.bias | (64,) | |
| backbone.layer1 | 0.148M | 9.748G |
| backbone.layer1.0 | 73.984K | 4.874G |
| backbone.layer1.1 | 73.984K | 4.874G |
| backbone.layer2 | 0.526M | 8.642G |
| backbone.layer2.0 | 0.23M | 3.79G |
| backbone.layer2.1 | 0.295M | 4.853G |
| backbone.layer3 | 2.1M | 8.616G |
| backbone.layer3.0 | 0.919M | 3.774G |
| backbone.layer3.1 | 1.181M | 4.842G |
| backbone.layer4 | 8.394M | 8.603G |
| backbone.layer4.0 | 3.673M | 3.766G |
| backbone.layer4.1 | 4.721M | 4.837G |
| neck | 0.836M | 14.887G |
| neck.lateral_convs | 0.246M | 2.013G |
| neck.lateral_convs.0.conv | 16.384K | 1.074G |
| neck.lateral_convs.1.conv | 32.768K | 0.537G |
| neck.lateral_convs.2.conv | 65.536K | 0.268G |
| neck.lateral_convs.3.conv | 0.131M | 0.134G |
| neck.smooth_convs | 0.59M | 12.835G |
| neck.smooth_convs.0.conv | 0.147M | 9.664G |
| neck.smooth_convs.1.conv | 0.147M | 2.416G |
| neck.smooth_convs.2.conv | 0.147M | 0.604G |
| neck.smooth_convs.3.conv | 0.147M | 0.151G |
| det_head | 0.329M | 10.909G |
| det_head.binarize | 0.164M | 10.909G |
| det_head.binarize.0 | 0.147M | 9.664G |
| det_head.binarize.1 | 0.128K | 20.972M |
| det_head.binarize.3 | 16.448K | 1.074G |
| det_head.binarize.4 | 0.128K | 83.886M |
| det_head.binarize.6 | 0.257K | 67.109M |
| det_head.threshold | 0.164M | |
| det_head.threshold.0 | 0.147M | |
| det_head.threshold.1 | 0.128K | |
| det_head.threshold.3 | 16.448K | |
| det_head.threshold.4 | 0.128K | |
| det_head.threshold.6 | 0.257K | |
!!!Please be cautious if you use the results in papers. You may need to check if all ops are supported and verify that the flops computation is correct.
```