model documentation

#2
by nazneen - opened
Files changed (1)
  1. README.md +479 -19
README.md CHANGED
@@ -11,17 +11,417 @@ datasets:
11
  - universal_dependencies
12
  metrics:
13
  - accuracy
14
-
15
  model-index:
16
  - name: xlm-roberta-base-ft-udpos28-tr
17
  results:
18
- - task:
19
  type: token-classification
20
  name: Part-of-Speech Tagging
21
  dataset:
22
- type: universal_dependencies
23
  name: Universal Dependencies v2.8
 
24
  metrics:
25
  - type: accuracy
26
  name: English Test accuracy
27
  value: 74.4
@@ -313,14 +713,12 @@ model-index:
313
  - type: accuracy
314
  name: Belarusian Test accuracy
315
  value: 76.9
316
- - type: accuracy
317
- name: Serbian Test accuracy
318
  value: 72.2
319
- - type: accuracy
320
- name: Moksha Test accuracy
321
  value: 50.0
322
- - type: accuracy
323
- name: Western Armenian Test accuracy
324
  value: 70.5
325
  - type: accuracy
326
  name: Scottish Gaelic Test accuracy
@@ -337,20 +735,82 @@ model-index:
337
  - type: accuracy
338
  name: Chukchi Test accuracy
339
  value: 40.8
340
- ---
341
 
342
- # XLM-RoBERTa base Universal Dependencies v2.8 POS tagging: Turkish
343
-
344
- This model is part of our paper called:
345
-
346
- - Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages
347
-
348
- Check the [Space](https://huggingface.co/spaces/wietsedv/xpos) for more details.
349
-
350
- ## Usage
351
  ```python
352
  from transformers import AutoTokenizer, AutoModelForTokenClassification
353
 
354
  tokenizer = AutoTokenizer.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-tr")
 
355
  model = AutoModelForTokenClassification.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-tr")
 
356
  ```
11
  - universal_dependencies
12
  metrics:
13
  - accuracy
 
14
  model-index:
15
  - name: xlm-roberta-base-ft-udpos28-tr
16
  results:
17
+ - task:
18
  type: token-classification
19
  name: Part-of-Speech Tagging
20
  dataset:
 
21
  name: Universal Dependencies v2.8
22
+ type: universal_dependencies
23
  metrics:
24
+ - type: accuracy
25
+ value: 74.4
26
+ name: English Test accuracy
27
+ - type: accuracy
28
+ value: 73.7
29
+ name: Dutch Test accuracy
30
+ - type: accuracy
31
+ value: 73.5
32
+ name: German Test accuracy
33
+ - type: accuracy
34
+ value: 73.2
35
+ name: Italian Test accuracy
36
+ - type: accuracy
37
+ value: 71.4
38
+ name: French Test accuracy
39
+ - type: accuracy
40
+ value: 71.1
41
+ name: Spanish Test accuracy
42
+ - type: accuracy
43
+ value: 77.9
44
+ name: Russian Test accuracy
45
+ - type: accuracy
46
+ value: 74.5
47
+ name: Swedish Test accuracy
48
+ - type: accuracy
49
+ value: 69.2
50
+ name: Norwegian Test accuracy
51
+ - type: accuracy
52
+ value: 73.8
53
+ name: Danish Test accuracy
54
+ - type: accuracy
55
+ value: 45.8
56
+ name: Low Saxon Test accuracy
57
+ - type: accuracy
58
+ value: 39.8
59
+ name: Akkadian Test accuracy
60
+ - type: accuracy
61
+ value: 80.9
62
+ name: Armenian Test accuracy
63
+ - type: accuracy
64
+ value: 62.9
65
+ name: Welsh Test accuracy
66
+ - type: accuracy
67
+ value: 63.7
68
+ name: Old East Slavic Test accuracy
69
+ - type: accuracy
70
+ value: 71.5
71
+ name: Albanian Test accuracy
72
+ - type: accuracy
73
+ value: 62.3
74
+ name: Slovenian Test accuracy
75
+ - type: accuracy
76
+ value: 41.3
77
+ name: Guajajara Test accuracy
78
+ - type: accuracy
79
+ value: 68.0
80
+ name: Kurmanji Test accuracy
81
+ - type: accuracy
82
+ value: 88.4
83
+ name: Turkish Test accuracy
84
+ - type: accuracy
85
+ value: 81.1
86
+ name: Finnish Test accuracy
87
+ - type: accuracy
88
+ value: 71.5
89
+ name: Indonesian Test accuracy
90
+ - type: accuracy
91
+ value: 76.8
92
+ name: Ukrainian Test accuracy
93
+ - type: accuracy
94
+ value: 74.3
95
+ name: Polish Test accuracy
96
+ - type: accuracy
97
+ value: 76.7
98
+ name: Portuguese Test accuracy
99
+ - type: accuracy
100
+ value: 81.1
101
+ name: Kazakh Test accuracy
102
+ - type: accuracy
103
+ value: 68.2
104
+ name: Latin Test accuracy
105
+ - type: accuracy
106
+ value: 47.5
107
+ name: Old French Test accuracy
108
+ - type: accuracy
109
+ value: 62.6
110
+ name: Buryat Test accuracy
111
+ - type: accuracy
112
+ value: 24.6
113
+ name: Kaapor Test accuracy
114
+ - type: accuracy
115
+ value: 63.7
116
+ name: Korean Test accuracy
117
+ - type: accuracy
118
+ value: 82.0
119
+ name: Estonian Test accuracy
120
+ - type: accuracy
121
+ value: 72.3
122
+ name: Croatian Test accuracy
123
+ - type: accuracy
124
+ value: 24.1
125
+ name: Gothic Test accuracy
126
+ - type: accuracy
127
+ value: 41.1
128
+ name: Swiss German Test accuracy
129
+ - type: accuracy
130
+ value: 23.0
131
+ name: Assyrian Test accuracy
132
+ - type: accuracy
133
+ value: 45.2
134
+ name: North Sami Test accuracy
135
+ - type: accuracy
136
+ value: 36.0
137
+ name: Naija Test accuracy
138
+ - type: accuracy
139
+ value: 80.0
140
+ name: Latvian Test accuracy
141
+ - type: accuracy
142
+ value: 55.9
143
+ name: Chinese Test accuracy
144
+ - type: accuracy
145
+ value: 56.2
146
+ name: Tagalog Test accuracy
147
+ - type: accuracy
148
+ value: 30.0
149
+ name: Bambara Test accuracy
150
+ - type: accuracy
151
+ value: 81.2
152
+ name: Lithuanian Test accuracy
153
+ - type: accuracy
154
+ value: 72.4
155
+ name: Galician Test accuracy
156
+ - type: accuracy
157
+ value: 57.0
158
+ name: Vietnamese Test accuracy
159
+ - type: accuracy
160
+ value: 80.2
161
+ name: Greek Test accuracy
162
+ - type: accuracy
163
+ value: 69.1
164
+ name: Catalan Test accuracy
165
+ - type: accuracy
166
+ value: 75.8
167
+ name: Czech Test accuracy
168
+ - type: accuracy
169
+ value: 52.7
170
+ name: Erzya Test accuracy
171
+ - type: accuracy
172
+ value: 50.8
173
+ name: Bhojpuri Test accuracy
174
+ - type: accuracy
175
+ value: 49.0
176
+ name: Thai Test accuracy
177
+ - type: accuracy
178
+ value: 77.9
179
+ name: Marathi Test accuracy
180
+ - type: accuracy
181
+ value: 66.8
182
+ name: Basque Test accuracy
183
+ - type: accuracy
184
+ value: 75.1
185
+ name: Slovak Test accuracy
186
+ - type: accuracy
187
+ value: 43.1
188
+ name: Kiche Test accuracy
189
+ - type: accuracy
190
+ value: 31.7
191
+ name: Yoruba Test accuracy
192
+ - type: accuracy
193
+ value: 48.6
194
+ name: Warlpiri Test accuracy
195
+ - type: accuracy
196
+ value: 79.5
197
+ name: Tamil Test accuracy
198
+ - type: accuracy
199
+ value: 34.1
200
+ name: Maltese Test accuracy
201
+ - type: accuracy
202
+ value: 58.5
203
+ name: Ancient Greek Test accuracy
204
+ - type: accuracy
205
+ value: 68.9
206
+ name: Icelandic Test accuracy
207
+ - type: accuracy
208
+ value: 33.6
209
+ name: Mbya Guarani Test accuracy
210
+ - type: accuracy
211
+ value: 60.5
212
+ name: Urdu Test accuracy
213
+ - type: accuracy
214
+ value: 69.6
215
+ name: Romanian Test accuracy
216
+ - type: accuracy
217
+ value: 71.3
218
+ name: Persian Test accuracy
219
+ - type: accuracy
220
+ value: 50.2
221
+ name: Apurina Test accuracy
222
+ - type: accuracy
223
+ value: 44.4
224
+ name: Japanese Test accuracy
225
+ - type: accuracy
226
+ value: 86.4
227
+ name: Hungarian Test accuracy
228
+ - type: accuracy
229
+ value: 63.2
230
+ name: Hindi Test accuracy
231
+ - type: accuracy
232
+ value: 36.3
233
+ name: Classical Chinese Test accuracy
234
+ - type: accuracy
235
+ value: 51.0
236
+ name: Komi Permyak Test accuracy
237
+ - type: accuracy
238
+ value: 59.5
239
+ name: Faroese Test accuracy
240
+ - type: accuracy
241
+ value: 38.3
242
+ name: Sanskrit Test accuracy
243
+ - type: accuracy
244
+ value: 65.4
245
+ name: Livvi Test accuracy
246
+ - type: accuracy
247
+ value: 64.4
248
+ name: Arabic Test accuracy
249
+ - type: accuracy
250
+ value: 38.9
251
+ name: Wolof Test accuracy
252
+ - type: accuracy
253
+ value: 72.4
254
+ name: Bulgarian Test accuracy
255
+ - type: accuracy
256
+ value: 49.1
257
+ name: Akuntsu Test accuracy
258
+ - type: accuracy
259
+ value: 23.3
260
+ name: Makurap Test accuracy
261
+ - type: accuracy
262
+ value: 46.5
263
+ name: Kangri Test accuracy
264
+ - type: accuracy
265
+ value: 55.4
266
+ name: Breton Test accuracy
267
+ - type: accuracy
268
+ value: 80.7
269
+ name: Telugu Test accuracy
270
+ - type: accuracy
271
+ value: 54.3
272
+ name: Cantonese Test accuracy
273
+ - type: accuracy
274
+ value: 42.9
275
+ name: Old Church Slavonic Test accuracy
276
+ - type: accuracy
277
+ value: 70.5
278
+ name: Karelian Test accuracy
279
+ - type: accuracy
280
+ value: 67.1
281
+ name: Upper Sorbian Test accuracy
282
+ - type: accuracy
283
+ value: 58.3
284
+ name: South Levantine Arabic Test accuracy
285
+ - type: accuracy
286
+ value: 47.6
287
+ name: Komi Zyrian Test accuracy
288
+ - type: accuracy
289
+ value: 60.3
290
+ name: Irish Test accuracy
291
+ - type: accuracy
292
+ value: 50.0
293
+ name: Nayini Test accuracy
294
+ - type: accuracy
295
+ value: 41.9
296
+ name: Munduruku Test accuracy
297
+ - type: accuracy
298
+ value: 37.5
299
+ name: Manx Test accuracy
300
+ - type: accuracy
301
+ value: 47.4
302
+ name: Skolt Sami Test accuracy
303
+ - type: accuracy
304
+ value: 71.3
305
+ name: Afrikaans Test accuracy
306
+ - type: accuracy
307
+ value: 53.4
308
+ name: Old Turkish Test accuracy
309
+ - type: accuracy
310
+ value: 53.6
311
+ name: Tupinamba Test accuracy
312
+ - type: accuracy
313
+ value: 76.9
314
+ name: Belarusian Test accuracy
315
+ - type: accuracy
316
+ value: 72.2
317
+ name: Serbian Test accuracy
318
+ - type: accuracy
319
+ value: 50.0
320
+ name: Moksha Test accuracy
321
+ - type: accuracy
322
+ value: 70.5
323
+ name: Western Armenian Test accuracy
324
+ - type: accuracy
325
+ value: 54.1
326
+ name: Scottish Gaelic Test accuracy
327
+ - type: accuracy
328
+ value: 50.0
329
+ name: Khunsari Test accuracy
330
+ - type: accuracy
331
+ value: 79.2
332
+ name: Hebrew Test accuracy
333
+ - type: accuracy
334
+ value: 70.8
335
+ name: Uyghur Test accuracy
336
+ - type: accuracy
337
+ value: 40.8
338
+ name: Chukchi Test accuracy
339
+ ---
340
+
341
+ # Model Card for XLM-RoBERTa base Universal Dependencies v2.8 POS tagging: Turkish
342
+
343
+
344
+
345
+ # Model Details
346
+
347
+ ## Model Description
348
+
349
+ - **Developed by:** Wietse de Vries
350
+ - **Shared by [Optional]:** Hugging Face
351
+ - **Model type:** Token Classification
352
+ - **Language(s) (NLP):** tr
353
+ - **License:** apache-2.0
354
+ - **Related Models:** xlm-roberta
355
+ - **Parent Model:**
356
+ - **Resources for more information:**
357
+ - [Associated Paper](https://aclanthology.org/2022.acl-long.529.pdf)
358
+ - [Space](https://huggingface.co/spaces/wietsedv/xpos)
359
+
360
+ # Uses
361
+
362
+
363
+ ## Direct Use
364
+
365
+ Token Classification
366
+
367
+ ## Downstream Use [Optional]
368
+
369
+ More information needed.
370
+
371
+ ## Out-of-Scope Use
372
+
373
+ The model should not be used to intentionally create hostile or alienating environments for people.
374
+ # Bias, Risks, and Limitations
375
+
376
+
377
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
378
+
379
+
380
+ ## Recommendations
381
+
382
+ Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
383
+
384
+ # Training Details
385
+
386
+ ## Training Data
387
+
388
+
389
+ See the associated [Universal Dependencies v2.8 dataset card](https://huggingface.co/datasets/universal_dependencies)
390
+ for further details.
391
+
392
+ ## Training Procedure
393
+
394
+
395
+
396
+ ### Preprocessing
397
+
398
+ More information needed.
399
+
400
+ ### Speeds, Sizes, Times
401
+
402
+ More information needed.
403
+
404
+ # Evaluation
405
+
406
+
407
+ ## Testing Data, Factors & Metrics
408
+
409
+ ### Testing Data
410
+
411
+ See the associated [Universal Dependencies v2.8 dataset card](https://huggingface.co/datasets/universal_dependencies)
412
+ for further details.
413
+
414
+ ### Factors
415
+
416
+
417
+ ### Metrics
418
+
419
+ Accuracy
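The card reports one accuracy value per test language as a percentage (for example, 88.4 for Turkish). The evaluation script itself is not part of this card, so the following is only a minimal sketch that assumes plain token-level accuracy over gold and predicted POS tags; the function name `token_accuracy` is hypothetical.

```python
from typing import Sequence


def token_accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of tags")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # Count positions where the predicted tag matches the gold tag.
    correct = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * correct / len(gold)


# Illustrative tags only, not taken from the test data.
gold_tags = ["NOUN", "VERB", "ADJ", "PUNCT"]
pred_tags = ["NOUN", "VERB", "NOUN", "PUNCT"]
score = token_accuracy(gold_tags, pred_tags)
```

Three of the four illustrative tags match, so `score` is 75.0 here.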
420
+
421
+ ## Results
422
+ <details>
423
+ <summary> Click to expand </summary>
424
+
425
  - type: accuracy
426
  name: English Test accuracy
427
  value: 74.4
 
713
  - type: accuracy
714
  name: Belarusian Test accuracy
715
  value: 76.9
716
+ - name: Serbian Test accuracy
 
717
  value: 72.2
718
+
719
+ - name: Moksha Test accuracy
720
  value: 50.0
721
+ - name: Western Armenian Test accuracy
 
722
  value: 70.5
723
  - type: accuracy
724
  name: Scottish Gaelic Test accuracy
 
735
  - type: accuracy
736
  name: Chukchi Test accuracy
737
  value: 40.8
738
+ </details>
739
+
740
+ # Model Examination
741
+
742
+ More information needed
743
+
744
+ # Environmental Impact
745
+
746
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
747
+
748
+ - **Hardware Type:** More information needed
749
+ - **Hours used:** More information needed
750
+ - **Cloud Provider:** More information needed
751
+ - **Compute Region:** More information needed
752
+ - **Carbon Emitted:** More information needed
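Since the fields above are still unfilled, the general shape of such an estimate can be sketched as energy used multiplied by the carbon intensity of the grid. This is a rough approximation, not the calculator's exact method; the function name and all input numbers below are hypothetical placeholders and do not describe how this model was trained.

```python
def estimate_emissions_kg(power_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Rough estimate: energy (kWh) times grid carbon intensity (kg CO2eq per kWh)."""
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * kg_co2_per_kwh


# Hypothetical inputs for illustration only: a 250 W accelerator running
# for 10 hours on a grid emitting 0.4 kg CO2eq per kWh.
example_kg = estimate_emissions_kg(power_watts=250, hours=10, kg_co2_per_kwh=0.4)
```

Once the hardware type, hours used, and compute region are known, they map onto the three arguments respectively.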
753
+
754
+ # Technical Specifications [optional]
755
+
756
+ ## Model Architecture and Objective
757
+
758
+ More information needed
759
+
760
+ ## Compute Infrastructure
761
+
762
+ More information needed
763
+
764
+ ### Hardware
765
+
766
+ More information needed
767
+
768
+ ### Software
769
+
770
+ More information needed
771
+
772
+ # Citation
773
+
774
+
775
+ **BibTeX:**
776
+
777
+ More information needed
778
+
779
+ **APA:**
780
+
781
+ More information needed
782
+
783
+ # Glossary [optional]
784
+ More information needed
785
+
786
+ # More Information [optional]
787
+
788
+ More information needed
789
+
790
+ # Model Card Authors [optional]
791
+
792
+ Wietse de Vries in collaboration with Ezi Ozoani and the Hugging Face team.
793
+
794
+ # Model Card Contact
795
+
796
+ More information needed
797
+
798
+ # How to Get Started with the Model
799
+
800
+ Use the code below to get started with the model.
801
+
802
+ <details>
803
+ <summary> Click to expand </summary>
804
805
  ```python
806
  from transformers import AutoTokenizer, AutoModelForTokenClassification
807
 
808
  tokenizer = AutoTokenizer.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-tr")
809
+
810
  model = AutoModelForTokenClassification.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-tr")
811
+
812
  ```
813
+
814
+
815
+ </details>
816
+
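The snippet above only loads the tokenizer and model. A possible way to turn them into per-word POS tags is sketched below. It assumes that each word takes the label predicted for its first subword, a common convention that this card does not confirm; the helper names `labels_per_word` and `tag_words` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def labels_per_word(
    word_ids: Sequence[Optional[int]],
    pred_ids: Sequence[int],
    id2label: Dict[int, str],
) -> List[str]:
    """Keep the prediction of the first subword of every word."""
    labels: List[str] = []
    seen = set()
    for word_id, pred_id in zip(word_ids, pred_ids):
        # Special tokens have no word id; later subwords of a word are skipped.
        if word_id is None or word_id in seen:
            continue
        seen.add(word_id)
        labels.append(id2label[pred_id])
    return labels


def tag_words(
    words: List[str],
    model_name: str = "wietsedv/xlm-roberta-base-ft-udpos28-tr",
) -> List[str]:
    """Tag a pre-split sentence; downloads the model on first use."""
    # Imported lazily so labels_per_word can be used without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    pred_ids = logits.argmax(dim=-1)[0].tolist()
    return labels_per_word(enc.word_ids(0), pred_ids, model.config.id2label)
```

Calling `tag_words(["Bugün", "hava", "çok", "güzel", "."])` should return one tag per input word; the example sentence is illustrative and no output is shown because it depends on the downloaded weights.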