ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Description
This is an implementation of ABCNet based on MMOCR, MMCV, and MMEngine.
ABCNet (Adaptive Bezier-Curve Network) is a conceptually novel, efficient, and fully convolutional framework for scene text spotting. Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text with a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of a text instance with an arbitrary shape, significantly improving precision compared with previous methods. 3) Compared with standard bounding box detection, our Bezier curve detection introduces negligible computation overhead, which makes our method superior in both efficiency and accuracy. Experiments on the arbitrarily-shaped benchmark datasets Total-Text and CTW1500 demonstrate that ABCNet achieves state-of-the-art accuracy while significantly improving speed. In particular, on Total-Text, our real-time version is over 10 times faster than recent state-of-the-art methods with competitive recognition accuracy.
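To make the idea of a parameterized Bezier curve concrete, the sketch below samples points along a single curve using the Bernstein basis. It assumes a cubic curve (four control points) describing one long edge of a text instance; the function names and the control points are hypothetical and chosen for illustration only, and this is not the code used in this project.

```python
def cubic_bezier_point(control_points, t):
    """Return the (x, y) point at parameter t in [0, 1] on a cubic Bezier curve."""
    if len(control_points) != 4:
        raise ValueError("a cubic Bezier curve needs exactly four control points")
    # Cubic Bernstein basis polynomials evaluated at t.
    basis = (
        (1 - t) ** 3,
        3 * t * (1 - t) ** 2,
        3 * t ** 2 * (1 - t),
        t ** 3,
    )
    x = sum(b * p[0] for b, p in zip(basis, control_points))
    y = sum(b * p[1] for b, p in zip(basis, control_points))
    return x, y


def sample_cubic_bezier(control_points, num_samples=20):
    """Sample num_samples points at evenly spaced parameter values."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    step = 1.0 / (num_samples - 1)
    return [cubic_bezier_point(control_points, i * step) for i in range(num_samples)]


# Hypothetical control points for the top edge of a curved text instance.
top_edge = [(0, 0), (10, 8), (20, 8), (30, 0)]
curve = sample_cubic_bezier(top_edge, num_samples=5)
```

The curve starts at the first control point and ends at the last one; the two inner control points pull the curve toward them without lying on it, which is what lets a handful of numbers describe a curved text boundary.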
Usage
Prerequisites
All the commands below rely on the correct configuration of PYTHONPATH, which should point to the project's directory so that Python can locate the module files. In the ABCNet/ root directory, run the following line to add the current directory to PYTHONPATH:
# Linux
export PYTHONPATH=`pwd`:$PYTHONPATH
# Windows PowerShell
$env:PYTHONPATH=Get-Location
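As a quick sanity check after setting the variable, the following sketch tests whether a directory appears among the PYTHONPATH entries. The helper name is_on_pythonpath is hypothetical and not part of this project or MMOCR; it only uses the Python standard library.

```python
import os


def is_on_pythonpath(directory, pythonpath=None):
    """Return True if directory is one of the entries in PYTHONPATH."""
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    target = os.path.abspath(directory)
    # PYTHONPATH entries are separated by os.pathsep (":" on Linux, ";" on Windows).
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return any(os.path.abspath(p) == target for p in entries)


if __name__ == "__main__":
    # Run from the ABCNet/ root directory; prints True if the export worked.
    print(is_on_pythonpath(os.getcwd()))
```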
If the data is not in ABCNet/, you can link the data into ABCNet/:
# Linux (run from the ABCNet/ root directory)
ln -s ${DataPath} data
# Windows PowerShell
New-Item -ItemType SymbolicLink -Path $env:PYTHONPATH -Name data -Target ${DataPath}
Training commands
In the current directory, run the following command to train the model:
mim train mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/
To train on multiple GPUs, e.g. 8 GPUs, run the following command:
mim train mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/ --launcher pytorch --gpus 8
Testing commands
In the current directory, run the following command to test the model:
mim test mmocr config/abcnet/abcnet_resnet50_fpn_500e_icdar2015.py --work-dir work_dirs/ --checkpoint ${CHECKPOINT_PATH}
Results
Here we provide the baseline version of ABCNet with a ResNet50 backbone.
To find more variants, please visit the official model zoo.
Name | Pretrained Model | E2E-None-Hmean | det-Hmean | Download |
---|---|---|---|---|
v1-icdar2015-finetune | SynthText | 0.6127 | 0.8753 | model, log |
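For readers unfamiliar with the metric, Hmean is the harmonic mean of precision and recall. The sketch below shows the computation; the precision and recall values are hypothetical and are not the ones behind the numbers in the table.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, for illustration only.
score = hmean(0.9, 0.8)
print(f"Hmean: {score:.4f}")
```

Because it is a harmonic mean, the score stays low unless both precision and recall are high.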
Citation
If you find ABCNet useful in your research or applications, please cite ABCNet with the following BibTeX entry.
@inproceedings{liu2020abcnet,
title = {{ABCNet}: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network},
author = {Liu, Yuliang and Chen, Hao and Shen, Chunhua and He, Tong and Jin, Lianwen and Wang, Liangwei},
booktitle = {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}
Checklist
Milestone 1: PR-ready, and acceptable to be one of the projects/.
Finish the code
Basic docstrings & proper citation
Test-time correctness
A full README
Milestone 2: Indicates a successful model implementation.
Training-time correctness
Milestone 3: Good to be a part of our core package!
Type hints and docstrings
Unit tests
Code polishing
Metafile.yml
Move your modules into the core package following the codebase's file hierarchy structure.
Refactor your modules into the core package following the codebase's file hierarchy structure.