scfengv committed on
Commit
b900f3a
·
1 Parent(s): fd3ed5b

Update README

Files changed (1)
  1. README.md +73 -57
README.md CHANGED
@@ -33,31 +33,86 @@ model-index:
33
  type: F1 score (Macro)
34
  value: 0.993694
35
  ---
36
- # Model Card for Model ID
37
 
38
- <!-- Provide a quick summary of what the model is/does. -->
39
 
40
- ## Model Details
41
 
42
- ### Model Description
43
 
44
- <!-- Provide a longer summary of what this model is. -->
 
 
45
 
 
 
 
46
 
47
- - **Developed by:** [scfengv](https://huggingface.co/scfengv)
48
- - **Model type:** BERT Multi-label Text Classification
49
- - **Language:** Chinese (Zh)
50
- - **Finetuned from model:** [google-bert/bert-base-chinese](https://huggingface.co/google-bert/bert-base-chinese)
51
 
52
- ### Model Sources
53
 
54
- <!-- Provide the basic links for the model. -->
 
55
 
56
- - **Repository:** [scfengv/NLP-Topic-Modeling-for-TVL-livestream-comments](https://github.com/scfengv/NLP-Topic-Modeling-for-TVL-livestream-comments)
57
 
58
- ## How to Get Started with the Model
59
 
60
- Use the code below to get started with the model.
61
 
62
  ```python
63
  import torch
@@ -79,47 +134,8 @@ with torch.no_grad():
79
  print(predictions)
80
  ```
81
 
82
- ## Training Details
83
-
84
- - **Hardware Type:** NVIDIA Quadro RTX8000
85
- - **Library:** PyTorch
86
- - **Hours used:** 2hr 13mins
87
-
88
- ### Training Data
89
-
90
- - [scfengv/TVL-game-layer-dataset](https://huggingface.co/datasets/scfengv/TVL-game-layer-dataset)
91
- - train
92
-
93
-
94
- ### Training Hyperparameters
95
-
96
- The model was trained using the following hyperparameters:
97
-
98
- ```
99
- Learning rate: 1e-05
100
- Batch size: 32
101
- Number of epochs: 10
102
- Optimizer: Adam
103
- Loss function: torch.nn.BCEWithLogitsLoss()
104
- ```
105
-
106
- ## Evaluation
107
-
108
- <!-- This section describes the evaluation protocols and provides the results. -->
109
-
110
- ### Testing Data, Factors & Metrics
111
-
112
- #### Testing Data
113
-
114
- - [scfengv/TVL-game-layer-dataset](https://huggingface.co/datasets/scfengv/TVL-game-layer-dataset)
115
- - validation
116
- - Remove Emoji
117
- - Emoji2Desc
118
- - Remove Punctuation
119
-
120
- ### Results (validation)
121
-
122
- - Accuracy score: 0.985764
123
- - F1 score (Micro): 0.993132
124
- - F1 Score (Macro): 0.993694
125
 
 
 
33
  type: F1 score (Macro)
34
  value: 0.993694
35
  ---
36
+ # Model Details of TVL_GameLayerClassifier
37
+
38
+ ## Base Model
39
+ This model is fine-tuned from [google-bert/bert-base-chinese](https://huggingface.co/google-bert/bert-base-chinese).
40
+
41
+ ## Model Architecture
42
+ - **Type**: BERT-based multi-label text classification model
43
+ - **Hidden Size**: 768
44
+ - **Number of Layers**: 12
45
+ - **Number of Attention Heads**: 12
46
+ - **Intermediate Size**: 3072
47
+ - **Max Sequence Length**: 512
48
+ - **Vocabulary Size**: 21,128
49
+
50
+ ## Key Components
51
+ 1. **Embeddings**
52
+ - Word Embeddings
53
+ - Position Embeddings
54
+ - Token Type Embeddings
55
+ - Layer Normalization
56
+
57
+ 2. **Encoder**
58
+ - 12 layers of:
59
+ - Self-Attention Mechanism
60
+ - Intermediate Dense Layer
61
+ - Output Dense Layer
62
+ - Layer Normalization
63
+
64
+ 3. **Pooler**
65
+ - Dense layer for sentence representation
66
+
67
+ 4. **Classifier**
68
+ - Output layer with 5 logits, one per label (multi-label)
69
+
70
+ ## Training Hyperparameters
71
 
72
+ The model was trained using the following hyperparameters:
73
 
74
+ ```
75
+ Learning rate: 1e-05
76
+ Batch size: 32
77
+ Number of epochs: 10
78
+ Optimizer: Adam
79
+ Loss function: torch.nn.BCEWithLogitsLoss()
80
+ ```
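As a rough illustration of the loss listed above, the following sketch computes what `torch.nn.BCEWithLogitsLoss()` computes with its default mean reduction: a sigmoid followed by binary cross-entropy, averaged over every (sample, label) pair. It is plain Python rather than the training code itself, and the function name `bce_with_logits` and the example values are hypothetical.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all (sample, label) pairs, from raw logits."""
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for x, y in zip(row_logits, row_targets):
            # Numerically stable form of
            # -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
            total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
            count += 1
    return total / count

# Hypothetical batch: 2 comments, 5 labels each (multi-hot targets).
logits = [[2.0, -1.0, 0.5, -3.0, 1.5], [-2.0, 3.0, -0.5, 0.0, -1.0]]
targets = [[1, 0, 1, 0, 1], [0, 1, 0, 0, 0]]
print(bce_with_logits(logits, targets))
```

Because each label gets its own sigmoid, a comment can be assigned several labels at once, which is why this loss fits a multi-label setup better than a softmax cross-entropy would.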
81
+
82
+ ## Training Infrastructure
83
 
84
+ - **Hardware Type:** NVIDIA Quadro RTX 8000
85
+ - **Library:** PyTorch
86
+ - **Hours used:** 2 hr 13 min
87
 
88
+ ## Model Parameters
89
+ - Total parameters: ~102M (estimated)
90
+ - All parameters are in 32-bit floating point (F32) format
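The ~102M estimate can be checked by hand from the architecture numbers in this card. The sketch below adds up weights and biases for each listed component. It assumes the standard BERT layout (two token types, query/key/value/output projections in each attention block) and a single dense classifier on top of the pooler; the function name `bert_classifier_param_count` is hypothetical.

```python
def bert_classifier_param_count(vocab=21128, hidden=768, layers=12,
                                intermediate=3072, max_pos=512,
                                type_vocab=2, num_labels=5):
    """Count weights and biases for the components listed in this card."""
    layer_norm = 2 * hidden  # scale and shift vectors
    # Word, position and token type embeddings, plus one LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, plus one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Intermediate dense layer, output dense layer, plus one LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden) + layer_norm)
    encoder = layers * (attention + feed_forward)
    pooler = hidden * hidden + hidden
    # Assumption: one dense layer mapping the pooled vector to the labels.
    classifier = hidden * num_labels + num_labels
    return embeddings + encoder + pooler + classifier

total = bert_classifier_param_count()
print(f"{total:,} parameters, about {total * 4 / 1e6:.0f} MB in F32")
```

Under these assumptions the count comes to 102,271,493, which agrees with the ~102M figure; the number of attention heads does not change the total.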
91
 
92
+ ## Input Processing
93
+ - Uses BERT tokenization
94
+ - Supports sequences up to 512 tokens
95
 
96
+ ## Output
97
+ - 5-class multi-label classification
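In a multi-label setup, each of the 5 outputs is typically passed through its own sigmoid and compared against a threshold. The sketch below shows that step in plain Python; the 0.5 threshold and the function names are assumptions, not values taken from this model card.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def logits_to_labels(logits, threshold=0.5):
    """Turn one row of raw logits into a multi-hot 0/1 vector."""
    return [1 if sigmoid(x) >= threshold else 0 for x in logits]

# Hypothetical logits for one comment across the 5 labels.
print(logits_to_labels([2.0, -1.0, 0.0, 3.5, -4.0]))
```

Unlike an argmax over a softmax, this can return zero, one, or several active labels for a single comment.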
 
 
98
 
99
+ ## Performance Metrics (validation)
100
+ - Accuracy score: 0.985764
101
+ - F1 score (Micro): 0.993132
102
+ - F1 score (Macro): 0.993694
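The card does not say how these three numbers were computed. A common reading for multi-label outputs, assumed here, is that accuracy is exact-match (subset) accuracy, micro F1 pools true and false positives across all labels, and macro F1 averages the per-label F1 scores. The sketch below implements that reading in plain Python on made-up data; `multilabel_metrics` is a hypothetical name.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def multilabel_metrics(y_true, y_pred):
    n_labels = len(y_true[0])
    # A sample counts as correct only if every label matches.
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    counts = [[0, 0, 0] for _ in range(n_labels)]  # tp, fp, fn per label
    for t, p in zip(y_true, y_pred):
        for j in range(n_labels):
            if t[j] and p[j]:
                counts[j][0] += 1
            elif p[j]:
                counts[j][1] += 1
            elif t[j]:
                counts[j][2] += 1
    micro = f1(*(sum(c[k] for c in counts) for k in range(3)))
    macro = sum(f1(*c) for c in counts) / n_labels
    return {"subset_accuracy": exact / len(y_true),
            "f1_micro": micro, "f1_macro": macro}

# Made-up example with 3 samples and 2 labels.
print(multilabel_metrics([[1, 0], [1, 1], [0, 1]],
                         [[1, 0], [1, 0], [0, 1]]))
```

Under this reading, accuracy is the strictest of the three, which is consistent with the reported accuracy being lower than both F1 scores.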
103
 
104
+ ## Training Dataset
105
+ This model was trained on the `train` split of [scfengv/TVL-game-layer-dataset](https://huggingface.co/datasets/scfengv/TVL-game-layer-dataset).
106
 
107
+ ## Testing Dataset
108
 
109
+ - [scfengv/TVL-game-layer-dataset](https://huggingface.co/datasets/scfengv/TVL-game-layer-dataset)
110
+ - validation
111
+ - Remove Emoji
112
+ - Emoji2Desc
113
+ - Remove Punctuation
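The three preprocessing variants named above are not defined in this README. The sketch below is one hypothetical interpretation using only the Python standard library: the emoji ranges are a rough approximation, and `emoji_to_desc` substitutes Unicode character names as a stand-in for whatever emoji-to-description mapping the dataset actually uses.

```python
import re
import unicodedata

# Rough emoji ranges; an assumption, not an exhaustive definition of "emoji".
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def remove_emoji(text):
    """Drop characters that fall in the emoji ranges above."""
    return EMOJI_RE.sub("", text)

def emoji_to_desc(text):
    """Replace each emoji with its lowercased Unicode name (stand-in mapping)."""
    return EMOJI_RE.sub(
        lambda m: " " + unicodedata.name(m.group(), "emoji").lower() + " ", text)

def remove_punctuation(text):
    """Drop every character in a Unicode punctuation category (P*)."""
    return "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))

comment = "好球!!加油🔥"
print(remove_emoji(comment))
print(emoji_to_desc(comment))
print(remove_punctuation(comment))
```

Whichever variant is used, the same transformation should be applied at inference time as at evaluation time so that inputs match what the reported metrics were measured on.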
114
 
115
+ ## Usage
116
 
117
  ```python
118
  import torch
 
134
  print(predictions)
135
  ```
136
 
137
+ ## Additional Notes
138
+ - This model is specifically designed for TVL Game layer classification tasks.
139
+ - It is fine-tuned from the Chinese BERT model (bert-base-chinese), so it is intended for Chinese text.
140
 
141
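+ For more detail on the model architecture, refer to the BERT documentation; for the fine-tuning process used for this classifier, see the project repository [scfengv/NLP-Topic-Modeling-for-TVL-livestream-comments](https://github.com/scfengv/NLP-Topic-Modeling-for-TVL-livestream-comments).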