dnnsdunca committed on
Commit 736ea32 · verified · 1 Parent(s): a117e96

Create readme.md

Files changed (1)
  1. readme.md +61 -0
readme.md ADDED
@@ -0,0 +1,61 @@
+ Project: Multitask Learning for Agent-Action Identification
+
+ Project Overview
+ This project develops a multitask learning model that identifies agents and actions in text. The model is trained on a custom dataset in which each text example is annotated with the agents and actions it contains.
+ Project Structure
+ The project is organized into the following directories and files:
+ dataset/: contains the custom dataset class and data collator for loading and processing the text data
+   dataset.py: defines the dataset class
+   data_collator.py: defines the data collator class
+ model/: contains the multitask learning model architecture
+   model.py: defines the model architecture
+ training/: contains the training loop and evaluation code
+   main.py: runs training and evaluation
+ data/: contains the dataset files for training, validation, and testing
+   train.csv: training dataset
+   val.csv: validation dataset
+   test.csv: testing dataset
+ requirements.txt: lists the dependencies required to run the project
+ Dataset
+ The dataset consists of text examples, each annotated with the agents and actions present in the text. Its 12,500 examples are split into training, validation, and testing sets.
+ Training Set: 80% of the dataset (10,000 examples)
+ Validation Set: 10% of the dataset (1,250 examples)
+ Testing Set: 10% of the dataset (1,250 examples)
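A minimal sketch of how a split file such as train.csv might be read into labeled examples. The column names `text`, `agent`, and `action`, the helper name `load_examples`, and the sample rows are assumptions for illustration; the README does not specify the CSV schema.

```python
import csv
import io

# Hypothetical stand-in for data/train.csv; the real column names may differ.
SAMPLE_CSV = """text,agent,action
The robot opened the door,robot,open
The user closed the window,user,close
The robot closed the door,robot,close
"""


def load_examples(handle):
    """Read rows and map agent/action strings to integer class ids."""
    rows = list(csv.DictReader(handle))
    # Sort label names so the id assignment is deterministic across runs.
    agent_ids = {name: i for i, name in enumerate(sorted({r["agent"] for r in rows}))}
    action_ids = {name: i for i, name in enumerate(sorted({r["action"] for r in rows}))}
    examples = [
        {"text": r["text"], "agent": agent_ids[r["agent"]], "action": action_ids[r["action"]]}
        for r in rows
    ]
    return examples, agent_ids, action_ids


examples, agent_ids, action_ids = load_examples(io.StringIO(SAMPLE_CSV))
```

Each example carries one label per task, which is the shape a two-head model needs: one integer for the agent head and one for the action head.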
+ Model
+ The model is a multitask learning model built on the BERT architecture. A shared encoder feeds two classification heads, so agents and actions are predicted simultaneously.
+ Model Architecture:
+ BERT encoder
+ Two classification heads, one for agents and one for actions
+ Model Parameters:
+ BERT encoder: 110M parameters
+ Classification heads: 10M parameters
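The two-head layout can be sketched in PyTorch as follows. This is an illustration under assumptions, not the code in model.py: the class name `MultiTaskHeads`, the label counts, the use of single linear layers as heads, and the unweighted sum of the two cross-entropy losses are all hypothetical. A random tensor stands in for the pooled encoder output (hidden size 768, as in BERT-base) so the sketch runs without downloading any weights.

```python
import torch
from torch import nn


class MultiTaskHeads(nn.Module):
    """Two linear classifiers that share one encoder representation."""

    def __init__(self, hidden_size=768, num_agents=5, num_actions=7):
        super().__init__()
        self.agent_head = nn.Linear(hidden_size, num_agents)
        self.action_head = nn.Linear(hidden_size, num_actions)

    def forward(self, pooled, agent_labels=None, action_labels=None):
        agent_logits = self.agent_head(pooled)
        action_logits = self.action_head(pooled)
        loss = None
        if agent_labels is not None and action_labels is not None:
            loss_fn = nn.CrossEntropyLoss()
            # Assumption: both tasks contribute equally to the total loss.
            loss = loss_fn(agent_logits, agent_labels) + loss_fn(action_logits, action_labels)
        return loss, agent_logits, action_logits


torch.manual_seed(0)
pooled = torch.randn(16, 768)  # stand-in for the encoder output of one batch
agent_labels = torch.randint(0, 5, (16,))
action_labels = torch.randint(0, 7, (16,))
loss, agent_logits, action_logits = MultiTaskHeads()(pooled, agent_labels, action_labels)
```

Summing the two losses lets a single backward pass update the shared encoder for both tasks; a weighted sum is a common alternative when one task dominates.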
+ Training
+ The model is trained with the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py.
+ Training Hyperparameters:
+ Batch size: 16
+ Number of epochs: 3
+ Learning rate: 1e-5
+ Training Time: approximately 10 hours on a single NVIDIA V100 GPU
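Together with the 10,000-example training set and the 500-step evaluation interval stated elsewhere in this README, these hyperparameters fix the length of a run. A quick check of the arithmetic, assuming a single device, no gradient accumulation, and that a final partial batch (if any) is kept:

```python
import math

train_examples = 10_000
batch_size = 16
epochs = 3
eval_every = 500

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs
# Steps at which a periodic evaluation would fire before training ends.
eval_steps = list(range(eval_every, total_steps + 1, eval_every))

print(steps_per_epoch, total_steps, eval_steps)
# prints: 625 1875 [500, 1000, 1500]
```

Under these assumptions a run takes 1,875 optimizer steps and evaluates on the validation set three times.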
+ Evaluation
+ The model is evaluated on the validation set during training.
+ Evaluation Metric: accuracy
+ Evaluation Frequency: every 500 steps
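Accuracy for each head is the fraction of examples whose highest-scoring class matches the label. A small pure-Python sketch follows; the function name and the toy scores are made up for illustration, and the README does not say how the two per-task accuracies are combined, if at all.

```python
def accuracy(logits, labels):
    """Fraction of rows whose argmax equals the gold label."""
    correct = 0
    for row, label in zip(logits, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Toy scores for four examples: two agent classes, three action classes.
agent_logits = [[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.4]]
agent_labels = [0, 1, 1, 1]
action_logits = [[0.1, 0.7, 0.2], [1.2, 0.3, 0.5], [0.2, 0.1, 0.9], [0.4, 0.6, 0.0]]
action_labels = [1, 0, 2, 0]

metrics = {
    "agent_accuracy": accuracy(agent_logits, agent_labels),
    "action_accuracy": accuracy(action_logits, action_labels),
}
```

Reporting the two accuracies separately shows whether one task lags behind the other, which a single averaged number would hide.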
+ Requirements
+ The project requires the following dependencies:
+ Python: 3.8+
+ Transformers: 4.20.1+
+ Torch: 1.12.0+
+ Pandas: 1.4.2+
+ Usage
+ To train the model, run the following command from the directory that contains main.py:
+ Bash
+ python main.py
+ To evaluate the model, run the following command:
+ Bash
+ python main.py --mode eval
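The two commands suggest that main.py switches behavior on a `--mode` flag and trains when the flag is omitted. One way such a flag could be parsed with `argparse` is sketched here; the `train` choice name, the default, and the function name `parse_args` are assumptions, since only `--mode eval` appears in the commands.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    parser.add_argument(
        "--mode",
        choices=["train", "eval"],
        default="train",  # assumption: a bare `python main.py` trains
        help="whether to run training or evaluation",
    )
    return parser.parse_args(argv)


# Explicit argument lists make the behavior easy to check without a shell.
train_args = parse_args([])
eval_args = parse_args(["--mode", "eval"])
```

Restricting `--mode` to a fixed set of choices makes a typo such as `--mode evaluate` fail immediately with a usage message instead of silently falling through to training.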
+ License
+ This project is licensed under the MIT License.
+ Acknowledgments
+ This project was inspired by the work of [Dennis Duncan].
+ Contributing
+ Contributions are welcome! Please open an issue or submit a pull request to contribute to the project.