pierreguillou committed on
Commit · 9595e28
1 Parent(s): a8bcd8f
Update README.md
README.md CHANGED
@@ -57,6 +57,11 @@ It achieves the following results on the evaluation set:
 - F1: 0.8584
 - Accuracy: 0.8584
 
+**References:**
+- Blog Post: [Document AI | Document Understanding model at line level with LiLT, Tesseract and DocLayNet dataset]()
+- Notebook: [Document AI | Fine-tune LiLT on DocLayNet base in any language at line level (chunk of 384 tokens with overlap)](https://github.com/piegu/language-models/blob/master/Fine_tune_LiLT_on_DocLayNet_base_in_any_language_at_linelevel_ml_384.ipynb)
+- Notebook: [Document AI | Inference at line level with a Document Understanding model (LiLT fine-tuned on DocLayNet dataset)](https://github.com/piegu/language-models/blob/master/inference_on_LiLT_model_finetuned_on_DocLayNet_base_in_any_language_at_levellines_ml384.ipynb)
+
 ### DocLayNet dataset
 
 [DocLayNet dataset](https://github.com/DS4SD/DocLayNet) (IBM) provides page-by-page layout segmentation ground-truth using bounding-boxes for 11 distinct class labels on 80863 unique pages from 6 document categories.
@@ -75,11 +80,11 @@ At inference time, a calculation of best probabilities give the label to each li
 
 ## Inference
 
-See notebook: [inference_on_LiLT_model_finetuned_on_DocLayNet_base_in_any_language_at_levellines_ml384.ipynb
+See notebook: [Document AI | Inference at line level with a Document Understanding model (LiLT fine-tuned on DocLayNet dataset)](https://github.com/piegu/language-models/blob/master/inference_on_LiLT_model_finetuned_on_DocLayNet_base_in_any_language_at_levellines_ml384.ipynb)
 
 ## Training and evaluation data
 
-See notebook: [Fine_tune_LiLT_on_DocLayNet_base_in_any_language_at_linelevel_ml_384.ipynb
+See notebook: [Document AI | Fine-tune LiLT on DocLayNet base in any language at line level (chunk of 384 tokens with overlap)](https://github.com/piegu/language-models/blob/master/Fine_tune_LiLT_on_DocLayNet_base_in_any_language_at_linelevel_ml_384.ipynb)
 
 ## Training procedure
 