sunnychenxiwang's picture
Upload 1600 files
14c9181 verified

A newer version of the Gradio SDK is available: 5.9.1

Upgrade

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Description

This is an implementation of ABCNet based on MMOCR, MMCV, and MMEngine.

ABCNet is a conceptually novel, efficient, and fully convolutional framework for text spotting, which address the problem by proposing the Adaptive Bezier-Curve Network (ABCNet). Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of a text instance with arbitrary shapes, significantly improving the precision compared with previous methods. 3) Compared with standard bounding box detection, our Bezier curve detection introduces negligible computation overhead, resulting in superiority of our method in both efficiency and accuracy. Experiments on arbitrarily-shaped benchmark datasets, namely Total-Text and CTW1500, demonstrate that ABCNet achieves state-of-the-art accuracy, meanwhile significantly improving the speed. In particular, on Total-Text, our realtime version is over 10 times faster than recent state-of-the-art methods with a competitive recognition accuracy.

Usage

Prerequisites

  • Python 3.7
  • PyTorch 1.6 or higher
  • MIM
  • MMOCR

All the commands below rely on the correct configuration of PYTHONPATH, which should point to the project's directory so that Python can locate the module files. In ABCNet/ root directory, run the following line to add the current directory to PYTHONPATH:

# Linux
export PYTHONPATH=`pwd`:$PYTHONPATH
# Windows PowerShell
$env:PYTHONPATH=Get-Location

if the data is not in ABCNet/, you can link the data into ABCNet/:

# Linux
ln -s ${DataPath} $PYTHONPATH
# Windows PowerShell
New-Item -ItemType SymbolicLink -Path $env:PYTHONPATH -Name data  -Target ${DataPath}

Training commands

In the current directory, run the following command to train the model:

mim train mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/

To train on multiple GPUs, e.g. 8 GPUs, run the following command:

mim train mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/ --launcher pytorch --gpus 8

Testing commands

In the current directory, run the following command to test the model:

mim test mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/ --checkpoint ${CHECKPOINT_PATH}

Results

Here we provide the baseline version of ABCNet with ResNet50 backbone.

To find more variants, please visit the official model zoo.

Name Pretrained Model E2E-None-Hmean det-Hmean Download
v1-icdar2015-finetune SynthText 0.6127 0.8753 model | log

Citation

If you find ABCNet useful in your research or applications, please cite ABCNet with the following BibTeX entry.

@inproceedings{liu2020abcnet,
  title     =  {{ABCNet}: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network},
  author    =  {Liu, Yuliang and Chen, Hao and Shen, Chunhua and He, Tong and Jin, Lianwen and Wang, Liangwei},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =  {2020}
}

Checklist

  • Milestone 1: PR-ready, and acceptable to be one of the projects/.

    • Finish the code

    • Basic docstrings & proper citation

    • Test-time correctness

    • A full README

  • Milestone 2: Indicates a successful model implementation.

    • Training-time correctness

  • Milestone 3: Good to be a part of our core package!

    • Type hints and docstrings

    • Unit tests

    • Code polishing

    • Metafile.yml

  • Move your modules into the core package following the codebase's file hierarchy structure.

  • Refactor your modules into the core package following the codebase's file hierarchy structure.