Training model with Custom Data
I want to train this model on custom data for the table structures I have. The accuracy of the default model is not up to par. Is there a repo that could help with this?
Hi,
Refer to this notebook (be sure to replace the model and processor): https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb
Hi @nielsr ,
I am having trouble downloading the data from Microsoft Open Data. Do you have any suggestions on annotating/tagging custom table structure data, or on downloading the PubTables-1M dataset?
Reference: https://msropendata.com/datasets/505fcbe3-1383-42b1-913a-f651b8b712d3
Issue: I am not able to log in to Microsoft Open Data.
Hi @nielsr ,
I have fine-tuned the Microsoft table detector on custom data using your approach and the results are great, but when I tried to fine-tune the Microsoft table structure recognition model with four classes (table, column, row and header), the results were very bad. Any suggestions on how to fine-tune the Microsoft structure recognition model?
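For reference, here is a minimal sketch of how the four classes mentioned above could be written out as COCO-style annotations, which is the format the DETR fine-tuning notebook reads. The class names and their order, the helper name `to_coco_annotation`, and the sample box are assumptions for illustration and do not come from this thread. As far as I know, the released structure recognition checkpoint predicts six classes, so ids chosen like this will not line up with its original label set.

```python
# Hypothetical label set for the four classes named in the post above.
CLASSES = ["table", "table column", "table row", "table column header"]
label2id = {name: idx for idx, name in enumerate(CLASSES)}

# COCO "categories" section built from the same list.
categories = [{"id": idx, "name": name} for name, idx in label2id.items()]


def to_coco_annotation(ann_id, image_id, label, box_xyxy):
    """Convert one (x_min, y_min, x_max, y_max) pixel box to a COCO annotation.

    COCO stores boxes as [x_min, y_min, width, height] in pixels.
    """
    x_min, y_min, x_max, y_max = box_xyxy
    width, height = x_max - x_min, y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError(f"degenerate box: {box_xyxy}")
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": label2id[label],
        "bbox": [x_min, y_min, width, height],
        "area": width * height,
        "iscrowd": 0,
    }


# Made-up example box for a single table row.
example = to_coco_annotation(0, 0, "table row", (10, 20, 110, 50))
```

Rejecting zero-area boxes up front is a deliberate choice in this sketch: degenerate boxes are an easy annotation mistake to make with thin rows and columns.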
@nielsr just to add to the above: I have a high-quality dataset that I used to try to fine-tune the model, but it just weakened the model, making it worse than with the pre-trained weights. I'm having a hard time figuring out whether the pretrained weights should be loaded with DetrForObjectDetection or TableTransformerForObjectDetection, what other params should be used, what the purpose of no_timm is in the original DETR object detection fine-tuning example, etc. I'm also not sure how to account, during training, for the fact that Table Transformer normalizes before the MLP instead of after.
This worked (no errors during training), but gave bad results:
self.model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition")
This gave errors during training and bad results:
self.model = DetrForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition")
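A minimal sketch of the first variant above, extended with a custom label set. `id2label`, `label2id` and `ignore_mismatched_sizes` are existing `from_pretrained` arguments in Transformers; the helper names `build_label_maps` and `load_for_finetuning` and the class list in the usage comment are hypothetical. This only shows how the checkpoint could be loaded with a different number of classes. It is not a confirmed fix for the poor results described here.

```python
CHECKPOINT = "microsoft/table-transformer-structure-recognition"


def build_label_maps(class_names):
    """Return (id2label, label2id) for a list of class names."""
    id2label = dict(enumerate(class_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_for_finetuning(class_names, checkpoint=CHECKPOINT):
    """Load the Table Transformer checkpoint with a custom label set.

    Calling this downloads the checkpoint, so it needs network access
    and the `transformers` package (plus its backbone dependencies).
    """
    # Imported lazily so the label maps can be built without transformers.
    from transformers import TableTransformerForObjectDetection

    id2label, label2id = build_label_maps(class_names)
    return TableTransformerForObjectDetection.from_pretrained(
        checkpoint,
        id2label=id2label,
        label2id=label2id,
        # Allows the classification head to be re-created when the number
        # of classes differs from the one stored in the checkpoint.
        ignore_mismatched_sizes=True,
    )


# Usage (assumed class names, not taken from the thread):
# model = load_for_finetuning(["table", "table column", "table row", "table column header"])
```

When the number of classes differs from the checkpoint's, the classification head is re-initialised and has to be learned from the custom data, while the remaining weights are reused.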
Hey @ankitom,
Have you tried training it on the custom dataset using AutoModelForObjectDetection?
https://huggingface.co/docs/transformers/tasks/object_detection#training-the-detr-model
I haven't tried it yet, but I am going to do the same in the coming weeks, so I'll post an update once done.
Hey everyone. Does anyone have an annotated image dataset for the structure recognition model?
Hey @qooob, can you please share your dataset of tables?
@pathikg have you tried it yet?
@ankitom have you figured out why your fine-tuned model performs worse? I fine-tuned the table structure recognition model on my custom dataset using the following approach, but the model's performance was very bad: it can't detect even a single object. The performance of the base model, however, was average. Could you please suggest how I can improve the performance?
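One thing that may be worth ruling out when nothing at all is detected is the score threshold applied after inference. Below is a minimal sketch, assuming the raw predictions are DETR-style boxes in normalized (center x, center y, width, height) format with one score per box. The function names and the sample values are made up for illustration. This is a debugging aid, not a diagnosis of the problem above.

```python
def cxcywh_to_xyxy(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * image_width,
        (cy - h / 2) * image_height,
        (cx + w / 2) * image_width,
        (cy + h / 2) * image_height,
    )


def filter_detections(scores, labels, boxes, image_size, threshold=0.5):
    """Keep predictions whose score reaches the threshold.

    Returns a list of (label, score, pixel_box) tuples.
    """
    width, height = image_size
    kept = []
    for score, label, box in zip(scores, labels, boxes):
        if score >= threshold:
            kept.append((label, score, cxcywh_to_xyxy(box, width, height)))
    return kept


# Made-up predictions: one confident, one weak.
scores = [0.9, 0.2]
labels = ["table row", "table column"]
boxes = [(0.5, 0.5, 0.5, 0.25), (0.5, 0.5, 0.25, 0.5)]

strict = filter_detections(scores, labels, boxes, (200, 100), threshold=0.5)
loose = filter_detections(scores, labels, boxes, (200, 100), threshold=0.1)
```

If lowering the threshold makes plausible boxes appear, the model is producing low-confidence predictions rather than none. If the boxes are still missing or clearly wrong, the label ids and box format of the training annotations would be the next thing to check.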
If anyone is still looking for a guide on data annotation/preparation and on how to fine-tune the Table Transformer (either detection or structure recognition), I have prepared two articles: one on data annotation/preparation and one on fine-tuning. I hope these help, and feel free to let me know if you have any questions. Big thanks and credit to @nielsr for his notebook on fine-tuning DETR for object detection and his other notebooks on inference using Table Transformer.
@waterabbit114 that is very cool. However, I see they are on Medium; any interest in publishing them at https://huggingface.co/blog/community?
Hi @nielsr, thanks for the suggestion! Medium is just the platform I always default to, and I will always keep my articles free to access.
I would be more than happy to port my articles over to the community blog for wider reach and accessibility, but I realized that publishing an article requires a PRO or Enterprise Hub subscription, which I do not have. Let me know if I have misinterpreted this.
Edit: I realized that there's this Blog-Explorers organization that I can request to join to publish community articles. I have requested access and will port over my articles if I manage to get access!