dieineb committed on
Commit
47d9d7a
1 Parent(s): 58fabde

Update README.md

Files changed (1)
  1. README.md +39 -6
README.md CHANGED
@@ -3,6 +3,9 @@ library_name: keras
3
  tags:
4
  - translation
5
  license: apache-2.0
 
 
 
6
  ---
7
  # GRU-eng-por
8
 
@@ -154,19 +157,49 @@ Portuguese translation:
154
  [start] não faça isso [end]
155
  --------------------------------------------------
156
  ```
 
157
 
 
 
158
 
159
- # Cite as 🤗
160
- ```
161
  @misc{teenytinycastle,
162
  doi = {10.5281/zenodo.7112065},
163
- url = {https://huggingface.co/AiresPucrs/GRU-eng-por},
164
  author = {Nicholas Kluge Corr{\^e}a},
165
  title = {Teeny-Tiny Castle},
166
- year = {2023},
167
- publisher = {HuggingFace},
168
- journal = {HuggingFace repository},
169
  }
170
  ```
 
171
  ## License
172
  The GRU-eng-por is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.
 
3
  tags:
4
  - translation
5
  license: apache-2.0
6
+ language:
7
+ - en
8
+ - pt
9
  ---
10
  # GRU-eng-por
11
 
 
157
  [start] não faça isso [end]
158
  --------------------------------------------------
159
  ```
160
+ ## Intended Use
161
 
162
+ This model was created for research purposes only. Specifically, it was designed to translate sentences from English to Portuguese.
163
+ We do not recommend any application of this model outside this scope.
164
 
165
+ ## Performance Metrics
166
+
167
+ Accuracy is a crude way to monitor validation-set performance during this task.
168
+ On average, this model correctly predicts the next word of the Portuguese sentence 65% of the time.
169
+ However, next-token accuracy isn't an excellent metric for machine translation models.
170
+ During inference, you generate the target sentence from scratch and cannot rely on previously generated tokens being correct, so even high next-token accuracy does not guarantee a good translator.
171
+ We would likely use "_BLEU scores_" in real-world machine translation applications to evaluate our models.
172
+
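To make the contrast between token accuracy and BLEU concrete, here is a minimal sketch in plain Python. It is not part of this commit or of the model's code: the helper names, the example sentences, and the simplifications (single reference, uniform weights up to bigrams, no smoothing) are all assumptions chosen for illustration. A real evaluation would more likely rely on an established implementation such as sacreBLEU or NLTK's `sentence_bleu`.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU (illustrative, no smoothing)."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_mean)


def positional_accuracy(candidate, reference):
    """Fraction of reference positions where the candidate token matches."""
    matches = sum(c == r for c, r in zip(candidate, reference))
    return matches / len(reference)


# Hypothetical example: the candidate contains the whole reference,
# but one extra leading word shifts every token by one position.
reference = "não faça isso agora".split()
candidate = "você não faça isso agora".split()

print("positional accuracy:", positional_accuracy(candidate, reference))
print("simplified BLEU:", round(simple_bleu(candidate, reference), 3))
```

In this example the two metrics disagree sharply. Position-wise accuracy penalizes every token after the shift, while the n-gram overlap that BLEU measures still credits the phrases the candidate got right.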
173
+ ## Training Data
174
+
175
+ [English-portuguese translation](https://www.kaggle.com/datasets/nageshsingh/englishportuguese-translation).
176
+
177
+ The dataset consists of a set of English and Portuguese sentences.
178
+
179
+
180
+ ## Limitations
181
+
182
+ Translations are far from perfect. To improve this model, we could:
183
+
184
+ 1. Use a deep stack of recurrent layers for both the encoder and the decoder.
185
+ 2. Use an `LSTM` instead of a `GRU`.
186
+
187
+ In conclusion, we do not recommend using this model in real-world applications.
188
+ It was solely developed for academic and educational purposes.
189
+
190
+ ## Cite as 🤗
191
+
192
+ ```latex
193
  @misc{teenytinycastle,
194
  doi = {10.5281/zenodo.7112065},
195
+ url = {https://github.com/Nkluge-correa/teeny-tiny_castle},
196
  author = {Nicholas Kluge Corr{\^e}a},
197
  title = {Teeny-Tiny Castle},
198
+ year = {2024},
199
+ publisher = {GitHub},
200
+ journal = {GitHub repository},
201
  }
202
  ```
203
+
204
  ## License
205
  The GRU-eng-por is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.