RT-DETR-v2 (r50vd) model fine-tuned on about 11k manga, webtoon, manhua, and Western comic-style images for text and speech bubble detection.
Training image size: 640. Training images were resized, not cropped.
Tall webtoons were split vertically.
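
The card does not say how tall webtoons were split or whether resizing kept the aspect ratio, so the following is only a minimal sketch of comparable preprocessing at inference time. The helper names `split_tall_image` and `scale_box_to_original`, the `overlap` parameter, and the bottom-aligned last segment are all assumptions for illustration, not the author's pipeline.

```python
def split_tall_image(height, max_segment_height, overlap=0):
    """Return (top, bottom) pixel ranges that cover an image of the given height.

    Hypothetical helper: the actual split used for training is not documented.
    The last segment is aligned to the bottom edge, so it may overlap the
    previous one by more than `overlap`.
    """
    if max_segment_height <= 0:
        raise ValueError("max_segment_height must be positive")
    if not 0 <= overlap < max_segment_height:
        raise ValueError("overlap must be in [0, max_segment_height)")
    if height <= max_segment_height:
        return [(0, height)]

    step = max_segment_height - overlap
    segments = []
    top = 0
    while top + max_segment_height < height:
        segments.append((top, top + max_segment_height))
        top += step
    segments.append((height - max_segment_height, height))
    return segments


def scale_box_to_original(box, model_size, original_size):
    """Map an (x_min, y_min, x_max, y_max) box from resized to original pixels.

    Assumes a plain resize (no crop, no padding), so each axis scales
    independently. Sizes are (width, height) tuples.
    """
    x_min, y_min, x_max, y_max = box
    scale_x = original_size[0] / model_size[0]
    scale_y = original_size[1] / model_size[1]
    return (x_min * scale_x, y_min * scale_y, x_max * scale_x, y_max * scale_y)
```

For a segment taken from a tall page, adding the segment's `top` offset to the y-coordinates of a rescaled box places it in full-page coordinates.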
Classes are:
0: bubble
1: text_bubble (text inside bubbles)
2: text_free (text outside bubbles)
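
The class indices above can be turned into readable labels when post-processing detections. This is a small sketch that assumes detections arrive as `(class_id, score, box)` tuples; that tuple layout, the `group_detections` helper, and the 0.5 default threshold are illustrative assumptions, not part of the model's API.

```python
# Class index to label mapping, as listed on this card.
ID2LABEL = {0: "bubble", 1: "text_bubble", 2: "text_free"}


def group_detections(detections, score_threshold=0.5):
    """Group boxes by class label, dropping low-confidence detections.

    `detections` is an iterable of (class_id, score, box) tuples
    (a hypothetical layout chosen for this example).
    """
    grouped = {label: [] for label in ID2LABEL.values()}
    for class_id, score, box in detections:
        if score < score_threshold:
            continue
        grouped[ID2LABEL[class_id]].append(box)
    return grouped


example = [
    (0, 0.92, (10, 10, 200, 150)),   # speech bubble
    (1, 0.88, (30, 30, 180, 120)),   # text inside that bubble
    (2, 0.31, (220, 40, 300, 90)),   # low-confidence free text, filtered out
]
grouped = group_detections(example)
```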

Model size: 42.9M params. Tensor type: F32 (Safetensors).