sakana-yu committed on
Commit
e9aa2e1
1 Parent(s): 8a3b38e

chore: remove README content incompatible with hg

Files changed (1)
  1. README.md +0 -209
README.md DELETED
@@ -1,209 +0,0 @@
- ---
- backbone:
- - OFA
- domain:
- - multi-modal
- frameworks:
- - pytorch
- license: Apache License 2.0
- metrics:
- - accuracy
- tags:
- - Alibaba
- - ICML2022
- - arxiv:2202.03052
- tasks:
- - ocr-recognition
-
- datasets:
-   evaluation:
-   - modelscope/ocr_fudanvi_zh
-   train:
-   - modelscope/ocr_fudanvi_zh
- finetune-support: True
- integrating: False
- widgets:
-   - task: ofa-ocr-recognition
-     inputs:
-       - name: image
-         title: Image
-         type: image
-         validator:
-           max_resolution: 5000*5000
-           max_size: 10M
-     examples:
-       - name: 1
-         title: Example 1
-         inputs:
-           - data: https://xingchen-data.oss-cn-zhangjiakou.aliyuncs.com/maas/ocr/ocr_general_demo.png
-             name: image
-     inferencespec:
-       cpu: 4
-       gpu: 1
-       gpu_memory: 16000
-       memory: 43000
- integrating: True
- ---
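- # OFA Text Recognition
- ## News
- - January 2023:
-   - Optimized the finetune workflow: it now supports parameter updates, custom datasets, and script-based distributed training. See the finetune example.
- - November 2022:
-   - Released ModelScope 1.0; the features below require version 1.0.2 or later.
-   - Added finetune support and a new [OFA Tutorial](https://www.modelscope.cn/docs/OFA%20Tutorial); see section 1.4 for finetuning.
-
-
- ## What is text recognition?
- Text recognition takes an image that contains text, recognizes the text in it, and outputs the corresponding string. Feel free to try it.
-
-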
- ## Quick start
- Running OFA takes only the 6 lines of code below. For an even easier setup, click the `Notebook` button in the upper-right corner: it opens a GPU-equipped environment where you can enter the provided code and run OFA right away.
-
- <p align="center">
-     <img src="resources/ocr_general_demo.png" alt="ocr" width="200" /></p>
-
- ```python
- from modelscope.pipelines import pipeline
- from modelscope.utils.constant import Tasks
- from modelscope.outputs import OutputKeys
-
- # ModelScope Library >= 1.2.0
- ocr_recognize = pipeline(Tasks.ocr_recognition, model='damo/ofa_ocr-recognition_general_base_zh', model_revision='v1.0.2')
- result = ocr_recognize('https://xingchen-data.oss-cn-zhangjiakou.aliyuncs.com/maas/ocr/ocr_general_demo.png')
- print(result[OutputKeys.TEXT])
- ```
- <br>
-
- ## What is OFA?
- OFA (One-For-All) is a general-purpose multimodal pretrained model. It uses a simple sequence-to-sequence learning framework to unify modalities (cross-modal, vision, language) and tasks (such as image generation, visual grounding, image captioning, image classification, and text generation). For details, see our ICML 2022 paper, [OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework](https://arxiv.org/abs/2202.03052), and our official GitHub repository, [https://github.com/OFA-Sys/OFA](https://github.com/OFA-Sys/OFA).
-
- <p align="center">
-     <br>
-     <img src="resources/OFA_logo_tp_path.svg" width="150" />
-     <br>
- </p>
- <br>
-
- <p align="center">
-     <a href="https://github.com/OFA-Sys/OFA">GitHub</a>&nbsp; | &nbsp;<a href="https://arxiv.org/abs/2202.03052">Paper</a>&nbsp; | &nbsp;Blog
- </p>
-
- <p align="center">
-     <br>
-     <video src="https://xingchen-data.oss-cn-zhangjiakou.aliyuncs.com/maas/resources/modelscope_web/demo.mp4" loop="loop" autoplay="autoplay" muted width="100%"></video>
-     <br>
- </p>
-
-
- ## Why is OFA the best choice for text recognition?
- OFA was evaluated on text recognition (OCR) using public datasets (including RCTW, ReCTS, LSVT, ArT, and CTW) and achieves SOTA accuracy. The results, as accuracy in percent, are:
- <p align="left">
- <table border="1" width="100%">
-     <tr align="center">
-         <td>Model</td><td>Scene</td><td>Web</td><td>Document</td><td>Handwriting</td><td>Avg</td>
-     </tr>
-     <tr align="center">
-         <td>SAR</td><td>62.5</td><td>54.3</td><td>93.8</td><td>31.4</td><td>67.3</td>
-     </tr>
-     <tr align="center">
-         <td>TransOCR</td><td>63.3</td><td>62.3</td><td>96.9</td><td>53.4</td><td>72.8</td>
-     </tr>
-     <tr align="center">
-         <td>MaskOCR-base</td><td>73.9</td><td>74.8</td><td>99.3</td><td>63.7</td><td>80.8</td>
-     </tr>
-     <tr align="center">
-         <td>OFA-OCR</td><td>82.9</td><td>81.7</td><td>99.1</td><td>69.0</td><td>86.0</td>
-     </tr>
- </table>
- <br>
- </p>
-
- ## Model training
-
- ### Training data
- The model was trained on the dataset released by the visual intelligence lab at Fudan University (FudanVI). Data link: https://github.com/FudanVI/benchmarking-chinese-text-recognition
- Sample images from the scene dataset:
- <p align="center">
-     <img src="./resources/ocr_general.png" width="500" />
- </p>
-
- ### Training procedure
- For model and finetune details, see section 1.4 of the [OFA Tutorial](https://modelscope.cn/docs/OFA_Tutorial#1.4%20%E5%A6%82%E4%BD%95%E8%AE%AD%E7%BB%83).
-
- ### Finetune example
- ```python
- import tempfile
- from modelscope.msdatasets import MsDataset
- from modelscope.metainfo import Trainers
- from modelscope.trainers import build_trainer
- from modelscope.utils.constant import DownloadMode
-
- train_dataset = MsDataset(MsDataset.load(
-     'ocr_fudanvi_zh',
-     subset_name='scene',
-     namespace='modelscope',
-     split='train[:100]',
-     download_mode=DownloadMode.REUSE_DATASET_IF_EXISTS).remap_columns({
-         'label': 'text'
-     }))
-
- test_dataset = MsDataset(
-     MsDataset.load(
-         'ocr_fudanvi_zh',
-         subset_name='scene',
-         namespace='modelscope',
-         split='test[:20]',
-         download_mode=DownloadMode.REUSE_DATASET_IF_EXISTS).remap_columns({
-             'label': 'text'
-         }))
-
- # The configuration can be modified in code
- def cfg_modify_fn(cfg):
-     cfg.train.hooks = [{
-         'type': 'CheckpointHook',
-         'interval': 2
-     }, {
-         'type': 'TextLoggerHook',
-         'interval': 1
-     }, {
-         'type': 'IterTimerHook'
-     }]
-     cfg.train.max_epochs = 2
-     return cfg
-
- args = dict(
-     model='damo/ofa_ocr-recognition_general_base_zh',
-     model_revision='v1.0.2',
-     train_dataset=train_dataset,
-     eval_dataset=test_dataset,
-     cfg_modify_fn=cfg_modify_fn,
-     work_dir=tempfile.mkdtemp())
- trainer = build_trainer(name=Trainers.ofa, default_args=args)
- trainer.train()
- ```
-
- ## Model limitations and possible bias
- The training dataset has its own limitations and may introduce some bias; please evaluate the model yourself before deciding how to use it.
-
- ## Related papers and citation
- If you find OFA useful and like our work, please cite:
- ```
- @article{wang2022ofa,
-   author    = {Peng Wang and
-                An Yang and
-                Rui Men and
-                Junyang Lin and
-                Shuai Bai and
-                Zhikang Li and
-                Jianxin Ma and
-                Chang Zhou and
-                Jingren Zhou and
-                Hongxia Yang},
-   title     = {OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence
-                Learning Framework},
-   journal   = {CoRR},
-   volume    = {abs/2202.03052},
-   year      = {2022}
- }
- ```
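
Note on the removed benchmark table: the README reports accuracy per dataset but includes no evaluation code. As a rough illustration only, the sketch below computes whole-string exact-match accuracy over prediction and reference pairs in plain Python. Treating the metric as exact match, stripping surrounding whitespace, and the helper name `exact_match_accuracy` are all assumptions made for this example; none of them comes from the README or the OFA codebase, and the actual benchmark may normalize text differently.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str],
                         references: Sequence[str]) -> float:
    """Return the percentage of predictions equal to their reference.

    Hypothetical helper for illustration: a prediction counts as correct
    only if the whole string matches after stripping surrounding whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Made-up sample strings, not taken from any dataset.
preds = ["欢迎使用", "hello", "OFA"]
refs = ["欢迎使用", "hallo", "OFA"]
print(f"{exact_match_accuracy(preds, refs):.1f}")  # prints 66.7
```

In practice, the `preds` list could be filled with the `OutputKeys.TEXT` field returned by the pipeline for each test image, with `refs` holding the matching ground-truth labels.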