MarcosDib committed 6c974c9 (1 parent: 1e714b1)

Update README.md

Files changed (1): README.md (+6 -6)
README.md CHANGED
@@ -82,7 +82,7 @@ Other 24 smaller models are released afterward.
 
 The detailed release history can be found [here](https://huggingface.co/unb-lamfo-nlp-mcti) on GitHub.
 
-#### Table 1 :
+#### Table 1:
 | Model                        | #params | Language |
 |------------------------------|:-------:|:--------:|
 | [`mcti-base-uncased`]        | 110M    | English  |
@@ -91,6 +91,7 @@ The detailed release history can be found on the [here](https://huggingface.co/u
 | [`mcti-large-cased`]         | 110M    | Chinese  |
 | [`-base-multilingual-cased`] | 110M    | Multiple |
 
+#### Table 2:
 | Dataset                               | Compatibility to base* |
 |---------------------------------------|:----------------------:|
 | Labeled MCTI                          | 100%                   |
@@ -208,6 +209,7 @@ to implement the [pre-processing code](https://github.com/mcti-sefip/mcti-sefip-
 
 Several Python packages were used to develop the preprocessing code:
 
+#### Table 3: Python packages used
 | Objective                                     | Package                                                |
 |-----------------------------------------------|--------------------------------------------------------|
 | Resolve contractions and slang usage in text  | [contractions](https://pypi.org/project/contractions)  |
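For readers unfamiliar with the first entry in Table 3, the snippet below is a minimal sketch (not taken from the commit or the MCTI notebook) of how the contractions package is typically applied before tokenization; the sample sentence is invented for illustration.

```python
# Minimal sketch: expanding contractions and slang before tokenization with the
# `contractions` package listed in Table 3 (illustrative only; the actual MCTI
# preprocessing code lives in the notebook linked above).
import contractions

raw = "We're looking for researchers who can't travel but they'd still collaborate."
expanded = contractions.fix(raw)  # expands "We're" -> "We are", "can't" -> "cannot", etc.
print(expanded)
```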
@@ -224,7 +226,7 @@ Several Python packages were used to develop the preprocessing code:
 As detailed in the notebook on [GitHub](https://github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/pre-processamento/Pre_Processamento/MCTI_PPF_Pr%C3%A9_processamento), in the pre-processing step, code was created to build and evaluate 8 (eight) different
 bases, derived from the base of goal 4, by applying the methods shown in Figure 2.
 
-Table 4: Preprocessing methods evaluated
+#### Table 4: Preprocessing methods evaluated
 | id   | Experiments    |
 |------|----------------|
 | Base | Original Texts |
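The commit above only retitles Table 4; the underlying idea of building several experimental bases by stacking preprocessing steps on the original texts can be sketched roughly as follows. The variant names and the two steps shown are illustrative placeholders, not the eight experiments actually evaluated in the notebook.

```python
# Rough sketch: deriving multiple text bases by applying different preprocessing
# pipelines to the original corpus. The pipelines below are placeholders; the
# real experiment definitions are in the MCTI pre-processing notebook.
import contractions

def lowercase(texts):
    return [t.lower() for t in texts]

def expand_contractions(texts):
    return [contractions.fix(t) for t in texts]

# Hypothetical variants: each "base" is the corpus run through one pipeline.
PIPELINES = {
    "Base": [],                                              # original texts, untouched
    "xp_lower": [lowercase],                                 # placeholder variant
    "xp_lower_expanded": [lowercase, expand_contractions],   # placeholder variant
}

def build_bases(texts, pipelines=PIPELINES):
    bases = {}
    for name, steps in pipelines.items():
        processed = list(texts)
        for step in steps:
            processed = step(processed)
        bases[name] = processed
    return bases
```

Each derived base would then be trained and scored separately to produce rows like those in Table 5.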
@@ -239,8 +241,7 @@ Table 4: Preprocessing methods evaluated
 
 
 
-
-Table 5: Results obtained in Preprocessing
+#### Table 5: Results obtained in Preprocessing
 | id   | Experiment     | accuracy | f1-score | recall | precision | Mean (s) | N_tokens | max_length |
 |------|----------------|----------|----------|--------|-----------|----------|----------|------------|
 | Base | Original Texts | 89.78%   | 84.20%   | 79.09% | 90.95%    | 417.772  | 23788    | 5636       |
@@ -271,8 +272,7 @@ data in a supervised manner. The new coupled model can be seen in Figure 5 under
 obtained results with related metrics. With this implementation, we achieved new levels of accuracy with 86% for the CNN
 architecture and 88% for the LSTM architecture.
 
-
-Table 6: Results from Pre-trained WE + ML models
+#### Table 6: Results from Pre-trained WE + ML models
 | ML Model | Accuracy | F1 Score | Precision | Recall |
 |:--------:|:--------:|:--------:|:---------:|:------:|
 | NN       | 0.8269   | 0.8545   | 0.8392    | 0.8712 |
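As context for what "Pre-trained WE + ML models" means in Table 6, below is a rough Keras sketch of coupling a frozen, pre-trained word-embedding matrix with an LSTM classification head. All sizes and the binary sigmoid output are assumptions for illustration; the actual MCTI architectures (CNN and LSTM, reaching 86% and 88% accuracy) are documented in the project repository.

```python
# Rough sketch of a "pre-trained word embeddings + LSTM" classifier in Keras.
# Vocabulary size, embedding dimension, LSTM units and the binary output are
# illustrative assumptions, not the values used in the MCTI experiments.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embedding_dim = 20000, 300
# Stand-in for vectors loaded from a pre-trained word-embedding model.
embedding_matrix = np.zeros((vocab_size, embedding_dim), dtype="float32")

model = keras.Sequential([
    layers.Embedding(
        vocab_size, embedding_dim,
        embeddings_initializer=keras.initializers.Constant(embedding_matrix),
        trainable=False),                     # keep the pre-trained vectors frozen
    layers.LSTM(64),                          # recurrent encoder over token sequences
    layers.Dense(1, activation="sigmoid"),    # assumed binary relevance output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```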
 