Update README.md

Notes and observations:
11. Using the set_transform function to transform the samples on the fly was a good idea, as it avoided having to save the transformed dataset (a sketch follows this list).
12. A streaming dataset might be another good option if the dataset size were to increase any further (see the second sketch below).
13. The free GPU on Colab does not seem sufficient for this experiment: keeping two models in GPU memory during training forces a small batch size, and the free GPUs (T4) are not fast enough.
14. A very important data cleaning step was to simply check that the sample image and text can be converted to the input format expected by the model. In particular, the text should be non-empty when converted back from the input IDs (some characters are not recognized by the tokenizer and get converted to a special token, and special tokens are usually skipped when converting input IDs back to text), as a non-empty reference is required for the CER calculation (see the last sketch below).
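Below is a minimal sketch of the on-the-fly transform from note 11, assuming a Hugging Face `datasets` dataset with `image` and `text` columns and a TrOCR-style processor; the checkpoint name and data path are illustrative, not this repo's actual code:

```python
from datasets import load_dataset
from transformers import TrOCRProcessor

# Hypothetical checkpoint and column names ("image", "text"); adjust as needed.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")

def transform(batch):
    # Called lazily on each accessed batch; nothing is materialized on disk.
    pixel_values = processor(images=batch["image"], return_tensors="pt").pixel_values
    labels = processor.tokenizer(
        batch["text"], padding=True, return_tensors="pt"
    ).input_ids
    return {"pixel_values": pixel_values, "labels": labels}

dataset = load_dataset("imagefolder", data_dir="data", split="train")  # placeholder source
dataset.set_transform(transform)  # applied on access, so the raw dataset stays untouched
```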
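For note 12, streaming is a one-argument change in `datasets` (the dataset name here is a placeholder):

```python
from datasets import load_dataset

# streaming=True yields an IterableDataset: samples are fetched on demand
# instead of the whole dataset being downloaded and cached up front.
stream = load_dataset("user/ocr-dataset", split="train", streaming=True)  # placeholder name

for sample in stream.take(3):  # peek at a few samples without pulling everything
    print(sample["text"])
```

An IterableDataset has no set_transform, but its map is already applied lazily, so the same on-the-fly transform idea carries over.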
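Finally, a sketch of the cleaning check from note 14 under the same assumptions; the `is_valid_sample` helper is hypothetical:

```python
from datasets import load_dataset
from transformers import TrOCRProcessor

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")  # hypothetical checkpoint
dataset = load_dataset("imagefolder", data_dir="data", split="train")  # placeholder source

def is_valid_sample(sample):
    # Characters the tokenizer does not recognize become special tokens, and
    # special tokens are skipped on decode, so the text can round-trip to empty.
    input_ids = processor.tokenizer(sample["text"]).input_ids
    decoded = processor.tokenizer.decode(input_ids, skip_special_tokens=True)
    if not decoded.strip():
        return False  # an empty reference would break the CER calculation
    try:
        # Also confirm the image converts to the model's expected input format.
        processor(images=sample["image"], return_tensors="pt")
    except Exception:
        return False
    return True

# Filter on the raw columns, before any set_transform is attached.
dataset = dataset.filter(is_valid_sample)
```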