The detailed release history can be found [here](https://huggingface.co/unb-lamfo-nlp-mcti) on Hugging Face.
#### Table 1:

| Model                        | #params | Language |
|------------------------------|:-------:|:--------:|
| [`mcti-base-uncased`]        |  110M   | English  |
| [`mcti-large-cased`]         |  110M   | Chinese  |
| [`-base-multilingual-cased`] |  110M   | Multiple |
#### Table 2:

| Dataset      | Compatibility to base* |
|--------------|:----------------------:|
| Labeled MCTI |          100%          |
Several Python packages were used to develop the preprocessing code:

#### Table 3: Python packages used

| Objective                                    | Package |
|----------------------------------------------|---------|
| Resolve contractions and slang usage in text | [contractions](https://pypi.org/project/contractions) |
As detailed in the notebook on [GitHub](https://github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/pre-processamento/Pre_Processamento/MCTI_PPF_Pr%C3%A9_processamento), pre-processing code was written to build and evaluate eight different bases, derived from the goal 4 base, by applying the methods shown in Figure 2.

#### Table 4: Preprocessing methods evaluated

| id   | Experiments    |
|------|----------------|
| Base | Original Texts |
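Only the Base row of Table 4 is shown in this excerpt; the remaining experiments apply the methods of Figure 2 on top of the original texts. As a hypothetical sketch of how such derived bases could be produced (the transform names and experiment ids below are illustrative assumptions, not the project's actual methods):

```python
# Hypothetical sketch: each experiment id maps to a chain of text
# transforms applied over the original (Base) texts to produce a
# derived base. The transforms here are illustrative assumptions.
import re

def lowercase(text: str) -> str:
    return text.lower()

def strip_punctuation(text: str) -> str:
    # Remove everything that is not a word character or whitespace.
    return re.sub(r"[^\w\s]", "", text)

EXPERIMENTS = {
    "Base": [],                             # original texts, unchanged
    "xp1": [lowercase],                     # hypothetical derived base
    "xp2": [lowercase, strip_punctuation],  # hypothetical derived base
}

def build_base(texts, transforms):
    for fn in transforms:
        texts = [fn(t) for t in texts]
    return texts

corpus = ["Funding Opportunity: AI research, apply NOW!"]
for exp_id, chain in EXPERIMENTS.items():
    print(exp_id, build_base(corpus, chain))
```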
#### Table 5: Results obtained in Preprocessing

| id   | Experiment     | accuracy | f1-score | recall | precision | mean (s) | n_tokens | max_length |
|------|----------------|----------|----------|--------|-----------|----------|----------|------------|
| Base | Original Texts | 89.78%   | 84.20%   | 79.09% | 90.95%    | 417.772  | 23788    | 5636       |
obtained results with related metrics. With this implementation, we achieved new levels of accuracy: 86% for the CNN architecture and 88% for the LSTM architecture.

#### Table 6: Results from Pre-trained WE + ML models

| ML Model | Accuracy | F1 Score | Precision | Recall |
|:--------:|:--------:|:--------:|:---------:|:------:|
|    NN    |  0.8269  |  0.8545  |  0.8392   | 0.8712 |
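The metrics in Tables 5 and 6 follow the standard definitions: F1 is the harmonic mean of precision and recall (for the NN row, 2 × 0.8392 × 0.8712 / (0.8392 + 0.8712) ≈ 0.855, matching the reported F1 Score up to rounding). A plain-Python sketch on an illustrative binary prediction vector (not the MCTI data):

```python
# Compute accuracy, precision, recall, and F1 from binary labels.
# The example vectors are illustrative, not the project's data.
def binary_metrics(y_true, y_pred):
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = binary_metrics(y_true, y_pred)
print(acc, prec, rec, f1)  # 0.75 0.75 0.75 0.75
```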