unb-lamfo-nlp-mcti/NLP-Classification-MCTI

MarcosDib committed on Dec 15, 2022

Commit 6270a7a • 1 Parent(s): f7c57c4

Update README.md

Files changed (1)

README.md +16 -19

README.md CHANGED

@@ -38,25 +38,22 @@ Transformer-based approach, the Word2Vec-based approach improved the accuracy ra
 ## Model description
-After the embedding, which is just essentially data preprocessing, it is necessary to develop the Project
-further to analyze the input text and classify whether it is a valid research funding opportunity for
-Brazilian or not.
-For the project, the best option would be chosen empirically upon comparing the results of 4 distinct architectures:
-Neural Network (NN), Deep Neural Network (DNN), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN).
-Figure 4 shows the structure of the models.
-A neural network (NN) here is a simple feedforward neural network with only a single hidden layer, usually called
-"shallow". Shallow NNs are often limited in the complexity of the problems they can be trained to solve well.
-Our CNN model uses a dropout layer feeding into a couple of Conv1D layers and then a MaxPooling layer. After that,
-we use a hidden layer composed of a dense layer of size 128, followed by another
-dropout layer, and finally, the Flatten and final dense classification layer.
-The architecture of the CNN network used is composed of a 50% dropout layer followed by two 1D convolution
-layers associated with a MaxPooling layer. After max pooling a dense layer of size 128 was added connected
-to a 50% dropout which finally connects to a flatten layer and the final classification dense layer. Dropout
-layers help to avoid overfitting the network by masking part of the data so that the network learns to create
 redundancies in the analysis of the inputs.
 ![CNN Classification Model](https://raw.githubusercontent.com/chap0lin/WEBIST2022/master/Assets/cnn_model.png)

 ## Model description
+The work consists of a machine learning model that combines word embeddings with a Convolutional Neural Network (CNN).
+The CNN was chosen for the project because it achieved better accuracy in an empirical
+comparison with 3 other architectures: Neural Network (NN), Deep Neural Network (DNN), and Long Short-Term
+Memory (LSTM).
+Because the input data is composed of unstructured and nonuniform texts, it is essential to normalize the data so that
+small insights and valuable relationships can be studied and its best features used. Normalization
+facilitates learning and allows gradient descent to converge more quickly.
+The first layer of the model is an embedding layer, which extracts features from the data and can
+replace one-hot encoding while reducing dimensionality.
+The architecture of the CNN is composed of a 50% dropout layer followed by two 1D convolution layers
+associated with a MaxPooling layer. After max pooling, a dense layer of size 128 is added, connected to
+a 50% dropout layer, which finally connects to a Flatten layer and the final dense classification layer. The dropout layers
+help avoid overfitting by masking part of the data so that the network learns to create
 redundancies in the analysis of the inputs.
 ![CNN Classification Model](https://raw.githubusercontent.com/chap0lin/WEBIST2022/master/Assets/cnn_model.png)
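
To make the layer order described in the new README text easier to follow, the sketch below traces how the tensor shape could change through the stack (dropout, two 1D convolutions, max pooling, dense 128, dropout, flatten, final dense). It is only an illustration in plain Python: the sequence length, embedding size, filter count, kernel width, pool size, and single-unit output are all assumptions, since the diff does not state them, and the helper names (`conv1d`, `max_pool`, `dense`, `flatten`, `trace_shapes`) are hypothetical. It also assumes unpadded, stride-1 convolutions and a dense layer that acts on the last axis only, as Keras's `Dense` does.

```python
# All hyperparameters below are hypothetical; the README does not state them.
SEQ_LEN = 100    # tokens per input text (assumed)
EMBED_DIM = 50   # embedding vector size (assumed)
FILTERS = 64     # filters in each 1D convolution (assumed)
KERNEL = 3       # convolution kernel width (assumed)
POOL = 2         # max-pooling window (assumed)


def conv1d(shape, filters, kernel):
    """Unpadded, stride-1 convolution: the length shrinks by kernel - 1."""
    steps, _channels = shape
    return (steps - kernel + 1, filters)


def max_pool(shape, pool):
    """Non-overlapping max pooling: the length is floor-divided by the window."""
    steps, channels = shape
    return (steps // pool, channels)


def dense(shape, units):
    """A dense layer applied to the last axis only."""
    return shape[:-1] + (units,)


def flatten(shape):
    """Collapse all axes into a single feature axis."""
    size = 1
    for dim in shape:
        size *= dim
    return (size,)


def trace_shapes():
    """Return (layer name, output shape) pairs in the order the README describes."""
    trace = []
    shape = (SEQ_LEN, EMBED_DIM)
    trace.append(("embedding", shape))
    trace.append(("dropout_1", shape))      # dropout never changes the shape
    shape = conv1d(shape, FILTERS, KERNEL)
    trace.append(("conv1d_1", shape))
    shape = conv1d(shape, FILTERS, KERNEL)
    trace.append(("conv1d_2", shape))
    shape = max_pool(shape, POOL)
    trace.append(("max_pooling", shape))
    shape = dense(shape, 128)
    trace.append(("dense_128", shape))
    trace.append(("dropout_2", shape))
    shape = flatten(shape)
    trace.append(("flatten", shape))
    shape = dense(shape, 1)                 # one output unit is an assumption
    trace.append(("dense_output", shape))
    return trace


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:<13} {shape}")
```

Under these assumed values, the dense layer of size 128 is applied at each of the 48 pooled positions, so the Flatten layer hands 48 × 128 = 6144 features to the final dense layer. Different hyperparameters would change the numbers but not the order of the layers.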