KennethEnevoldsen committed on
Commit
e7dba91
1 Parent(s): 56fa557

Updated readme

Files changed (2)
  1. README.md +21 -11
  2. meta.json +423 -412
README.md CHANGED
@@ -3,19 +3,20 @@ tags:
3
  - spacy
4
  - dacy
5
  - danish
6
- - named entity recognition
7
  - pos tagging
 
8
  - lemmatization
9
  - dependency parsing
 
10
  - coreference resolution
11
  - named entity linking
12
  - named entity disambiguation
13
- - token-classification
14
  language:
15
  - da
16
  license: apache-2.0
17
  model-index:
18
- - name: da_dacy_medium_trf
19
  results:
20
  - task:
21
  name: NER
@@ -70,6 +71,18 @@ model-index:
70
  split: test
71
  type: universal_dependencies
72
  config: da_ddt
73
  - task:
74
  name: UNLABELED_DEPENDENCIES
75
  type: token-classification
@@ -137,16 +150,12 @@ model-index:
137
  library_name: spacy
138
  datasets:
139
  - universal_dependencies
140
- - alexandrainst/dacoref
141
  - dane
 
142
  metrics:
143
  - accuracy
144
  ---
145
 
146
-
147
-
148
-
149
-
150
  <a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>
151
 
152
  # DaCy medium
@@ -157,6 +166,7 @@ parsing for Danish on the Danish Dependency treebank as well as competitive perf
157
  To read more check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.
158
  DaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.
159
 
 
160
  | Feature | Description |
161
  | --- | --- |
162
  | **Name** | `da_dacy_medium_trf` |
@@ -166,7 +176,7 @@ DaCy also contains guides on usage of the package as well as behavioural test fo
166
  | **Components** | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
167
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
168
  | **Sources** | [UD Danish DDT v2.11](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://huggingface.co/datasets/dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[DaCoref](https://huggingface.co/datasets/alexandrainst/dacoref) (Buch-Kromann, Matthias)<br />[DaNED](https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned) (Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & Søgaard, A.)<br />[vesteinn/DanskBERT](https://huggingface.co/vesteinn/DanskBERT) (Vésteinn Snæbjarnarson) |
169
- | **License** | `Apache-2.0 License` |
170
  | **Author** | [Kenneth Enevoldsen](https://chcaa.io/#/) |
171
 
172
  ### Label Scheme
@@ -184,8 +194,7 @@ DaCy also contains guides on usage of the package as well as behavioural test fo
184
 
185
  </details>
186
 
187
- ### Performance Metrics
188
-
189
 
190
  | Type | Score |
191
  | --- | --- |
@@ -207,6 +216,7 @@ DaCy also contains guides on usage of the package as well as behavioural test fo
207
  | `ENTS_P` | 87.08 |
208
  | `ENTS_R` | 84.59 |
209
  | `ENTS_F` | 85.82 |
 
210
  | `COREF_LEA_F1` | 41.18 |
211
  | `COREF_LEA_PRECISION` | 48.89 |
212
  | `COREF_LEA_RECALL` | 35.58 |
 
3
  - spacy
4
  - dacy
5
  - danish
6
+ - token-classification
7
  - pos tagging
8
+ - morphological analysis
9
  - lemmatization
10
  - dependency parsing
11
+ - named entity recognition
12
  - coreference resolution
13
  - named entity linking
14
  - named entity disambiguation
 
15
  language:
16
  - da
17
  license: apache-2.0
18
  model-index:
19
+ - name: da_dacy_medium_trf-0.2.0
20
  results:
21
  - task:
22
  name: NER
 
71
  split: test
72
  type: universal_dependencies
73
  config: da_ddt
74
+ - task:
75
+ name: LEMMA
76
+ type: token-classification
77
+ metrics:
78
+ - name: Lemma Accuracy
79
+ type: accuracy
80
+ value: 0.9419805438
81
+ dataset:
82
+ name: UD Danish DDT
83
+ split: test
84
+ type: universal_dependencies
85
+ config: da_ddt
86
  - task:
87
  name: UNLABELED_DEPENDENCIES
88
  type: token-classification
 
150
  library_name: spacy
151
  datasets:
152
  - universal_dependencies
 
153
  - dane
154
+ - alexandrainst/dacoref
155
  metrics:
156
  - accuracy
157
  ---
158
 
 
 
 
 
159
  <a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>
160
 
161
  # DaCy medium
 
166
  To read more check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.
167
  DaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.
168
 
169
+
170
  | Feature | Description |
171
  | --- | --- |
172
  | **Name** | `da_dacy_medium_trf` |
 
176
  | **Components** | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
177
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
178
  | **Sources** | [UD Danish DDT v2.11](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://huggingface.co/datasets/dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[DaCoref](https://huggingface.co/datasets/alexandrainst/dacoref) (Buch-Kromann, Matthias)<br />[DaNED](https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned) (Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & Søgaard, A.)<br />[vesteinn/DanskBERT](https://huggingface.co/vesteinn/DanskBERT) (Vésteinn Snæbjarnarson) |
179
+ | **License** | `Apache-2.0` |
180
  | **Author** | [Kenneth Enevoldsen](https://chcaa.io/#/) |
181
 
182
  ### Label Scheme
 
194
 
195
  </details>
196
 
197
+ ### Accuracy
 
198
 
199
  | Type | Score |
200
  | --- | --- |
 
216
  | `ENTS_P` | 87.08 |
217
  | `ENTS_R` | 84.59 |
218
  | `ENTS_F` | 85.82 |
219
+ | `LEMMA_ACC` | 94.20 |
220
  | `COREF_LEA_F1` | 41.18 |
221
  | `COREF_LEA_PRECISION` | 48.89 |
222
  | `COREF_LEA_RECALL` | 35.58 |
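
Note on the score table: the new `LEMMA_ACC` row (94.20) appears to be the `lemma_acc` value from `meta.json` (0.9419805438) expressed as a percentage with two decimals, and the other rows follow the same pattern. A minimal sketch of that conversion, assuming this is how the rows are derived; `to_readme_rows` is a hypothetical helper written for illustration and is not part of spaCy or DaCy:

```python
# Hypothetical helper (not part of spaCy or DaCy): format a subset of the
# "performance" block from meta.json as rows of the README score table.
def to_readme_rows(performance, keys):
    rows = []
    for key in keys:
        # Scores are stored as fractions in meta.json; the README shows percentages.
        rows.append(f"| `{key.upper()}` | {performance[key] * 100:.2f} |")
    return rows


# Values copied from the meta.json shown in this commit.
performance = {
    "ents_p": 0.8708487085,
    "ents_r": 0.8458781362,
    "ents_f": 0.8581818182,
    "lemma_acc": 0.9419805438,
    "coref_lea_f1": 0.4118366346,
}

for row in to_readme_rows(performance, list(performance)):
    print(row)
```

Regenerating the rows this way is one option for keeping the README table and `meta.json` in sync when a metric such as `lemma_acc` is added.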
meta.json CHANGED
@@ -1,23 +1,25 @@
1
  {
2
- "lang": "da",
3
- "name": "dacy_medium_trf",
4
- "version": "0.2.0",
5
- "description": "\n<a href=\"https://github.com/centre-for-humanities-computing/Dacy\"><img src=\"https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png\" width=\"175\" height=\"175\" align=\"right\" /></a>\n\n# DaCy medium\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved State-of-the-Art performance on parts-of-speech tagging and dependency \nparsing for Danish on the DaNE dataset. To read more check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results. \nDaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.\n",
6
- "author": "Kenneth Enevoldsen",
7
- "email": "[email protected]",
8
- "url": "https://chcaa.io/#/",
9
- "license": "Apache-2.0",
10
- "spacy_version": ">=3.5.2,<3.6.0",
11
- "spacy_git_version": "Unknown",
12
- "vectors": {
13
- "width": 0,
14
- "vectors": 0,
15
- "keys": 0,
16
- "name": null
17
  },
18
- "labels": {
19
- "transformer": [],
20
- "tagger": [
 
 
21
  "ADJ",
22
  "ADP",
23
  "ADV",
@@ -36,7 +38,7 @@
36
  "VERB",
37
  "X"
38
  ],
39
- "morphologizer": [
40
  "AdpType=Prep|POS=ADP",
41
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
42
  "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act",
@@ -196,7 +198,7 @@
196
  "Case=Gen|POS=NOUN",
197
  "POS=AUX|Tense=Pres|VerbForm=Part"
198
  ],
199
- "parser": [
200
  "ROOT",
201
  "acl:relcl",
202
  "advcl",
@@ -230,17 +232,23 @@
230
  "punct",
231
  "xcomp"
232
  ],
233
- "ner": [
234
  "LOC",
235
  "MISC",
236
  "ORG",
237
  "PER"
238
  ],
239
- "coref": [],
240
- "span_resolver": [],
241
- "entity_linker": []
242
  },
243
- "pipeline": [
244
  "transformer",
245
  "tagger",
246
  "morphologizer",
@@ -252,7 +260,7 @@
252
  "span_cleaner",
253
  "entity_linker"
254
  ],
255
- "components": [
256
  "transformer",
257
  "tagger",
258
  "morphologizer",
@@ -264,411 +272,414 @@
264
  "span_cleaner",
265
  "entity_linker"
266
  ],
267
- "disabled": [],
268
- "requirements": [
269
- "spacy-transformers>=1.2.3,<1.3.0",
270
- "spacy-experimental>=0.6.2,<0.7.0"
 
 
271
  ],
272
- "performance": {
273
- "token_acc": 0.9992023928,
274
- "token_p": 0.9970089731,
275
- "token_r": 0.9977052779,
276
- "token_f": 0.9973570039,
277
- "sents_p": 0.9842105263,
278
- "sents_r": 0.992920354,
279
- "sents_f": 0.9885462555,
280
- "tag_acc": 0.9847290149,
281
- "pos_acc": 0.985677928,
282
- "morph_acc": 0.9814371257,
283
- "morph_micro_p": 0.9910058542,
284
- "morph_micro_r": 0.9876942662,
285
- "morph_micro_f": 0.989347289,
286
- "morph_per_feat": {
287
- "NumType": {
288
- "p": 0.987654321,
289
- "r": 0.9302325581,
290
- "f": 0.9580838323
291
- },
292
- "Degree": {
293
- "p": 0.9894736842,
294
- "r": 0.9715762274,
295
- "f": 0.9804432855
296
- },
297
- "Number": {
298
- "p": 0.9884148064,
299
- "r": 0.9859075536,
300
- "f": 0.987159588
301
- },
302
- "Definite": {
303
- "p": 0.9858490566,
304
- "r": 0.9837398374,
305
- "f": 0.9847933176
306
- },
307
- "Gender": {
308
- "p": 0.9869901547,
309
- "r": 0.9838766211,
310
- "f": 0.9854309286
311
- },
312
- "Mood": {
313
- "p": 0.9971126083,
314
- "r": 0.9942418426,
315
- "f": 0.9956751562
316
- },
317
- "Tense": {
318
- "p": 0.9906469213,
319
- "r": 0.9906469213,
320
- "f": 0.9906469213
321
- },
322
- "VerbForm": {
323
- "p": 0.9924670433,
324
- "r": 0.9918444166,
325
- "f": 0.9921556323
326
- },
327
- "Voice": {
328
- "p": 0.997012696,
329
- "r": 0.9955257271,
330
- "f": 0.9962686567
331
- },
332
- "AdpType": {
333
- "p": 0.9990689013,
334
- "r": 0.9972118959,
335
- "f": 0.9981395349
336
- },
337
- "PronType": {
338
- "p": 0.9954914337,
339
- "r": 0.9963898917,
340
- "f": 0.9959404601
341
- },
342
- "Case": {
343
- "p": 0.9968652038,
344
- "r": 0.9860465116,
345
- "f": 0.9914263445
346
- },
347
- "Person": {
348
- "p": 0.9930555556,
349
- "r": 0.9913344887,
350
- "f": 0.9921942758
351
- },
352
- "Number[psor]": {
353
- "p": 0.987804878,
354
- "r": 1.0,
355
- "f": 0.9938650307
356
- },
357
- "Poss": {
358
- "p": 0.987804878,
359
- "r": 1.0,
360
- "f": 0.9938650307
361
- },
362
- "PartType": {
363
- "p": 1.0,
364
- "r": 0.9962406015,
365
- "f": 0.9981167608
366
- },
367
- "Polite": {
368
- "p": 0.6666666667,
369
- "r": 0.6666666667,
370
- "f": 0.6666666667
371
- },
372
- "Reflex": {
373
- "p": 1.0,
374
- "r": 1.0,
375
- "f": 1.0
376
- },
377
- "Foreign": {
378
- "p": 0.5,
379
- "r": 0.2,
380
- "f": 0.2857142857
381
- },
382
- "Style": {
383
- "p": 1.0,
384
- "r": 1.0,
385
- "f": 1.0
386
- },
387
- "Abbr": {
388
- "p": 0.6666666667,
389
- "r": 1.0,
390
- "f": 0.8
391
  }
392
  },
393
- "dep_uas": 0.9083920564,
394
- "dep_las": 0.883349834,
395
- "dep_las_per_type": {
396
- "nummod": {
397
- "p": 0.7948717949,
398
- "r": 0.8230088496,
399
- "f": 0.8086956522
400
- },
401
- "amod": {
402
- "p": 0.897810219,
403
- "r": 0.9027522936,
404
- "f": 0.9002744739
405
- },
406
- "nmod": {
407
- "p": 0.7712418301,
408
- "r": 0.7729257642,
409
- "f": 0.772082879
410
- },
411
- "nsubj": {
412
- "p": 0.9510638298,
413
- "r": 0.946031746,
414
- "f": 0.9485411141
415
- },
416
- "flat": {
417
- "p": 0.9285714286,
418
- "r": 0.9680851064,
419
- "f": 0.9479166667
420
- },
421
- "cc": {
422
- "p": 0.8681672026,
423
- "r": 0.8940397351,
424
- "f": 0.88091354
425
- },
426
- "conj": {
427
- "p": 0.8862275449,
428
- "r": 0.8554913295,
429
- "f": 0.8705882353
430
- },
431
- "root": {
432
- "p": 0.926056338,
433
- "r": 0.9309734513,
434
- "f": 0.9285083848
435
- },
436
- "advmod": {
437
- "p": 0.8871715611,
438
- "r": 0.8605697151,
439
- "f": 0.8736681887
440
- },
441
- "mark": {
442
- "p": 0.9148471616,
443
- "r": 0.9331848552,
444
- "f": 0.9239250276
445
- },
446
- "aux": {
447
- "p": 0.9875389408,
448
- "r": 0.9753846154,
449
- "f": 0.9814241486
450
- },
451
- "ccomp": {
452
- "p": 0.7764705882,
453
- "r": 0.835443038,
454
- "f": 0.8048780488
455
- },
456
- "case": {
457
- "p": 0.9348986126,
458
- "r": 0.9192025184,
459
- "f": 0.926984127
460
- },
461
- "det": {
462
- "p": 0.9409448819,
463
- "r": 0.9637096774,
464
- "f": 0.9521912351
465
- },
466
- "obl": {
467
- "p": 0.8476821192,
468
- "r": 0.8114104596,
469
- "f": 0.8291497976
470
- },
471
- "nmod:poss": {
472
- "p": 0.8181818182,
473
- "r": 0.8256880734,
474
- "f": 0.8219178082
475
- },
476
- "obj": {
477
- "p": 0.8943533698,
478
- "r": 0.9352380952,
479
- "f": 0.9143389199
480
- },
481
- "cop": {
482
- "p": 0.8944099379,
483
- "r": 0.8834355828,
484
- "f": 0.8888888889
485
- },
486
- "acl:relcl": {
487
- "p": 0.8343195266,
488
- "r": 0.7704918033,
489
- "f": 0.8011363636
490
- },
491
- "advcl": {
492
- "p": 0.6742857143,
493
- "r": 0.7564102564,
494
- "f": 0.7129909366
495
- },
496
- "dep": {
497
- "p": 0.1136363636,
498
- "r": 0.3333333333,
499
- "f": 0.1694915254
500
- },
501
- "compound:prt": {
502
- "p": 0.6666666667,
503
- "r": 0.5882352941,
504
- "f": 0.625
505
- },
506
- "fixed": {
507
- "p": 0.9473684211,
508
- "r": 0.8709677419,
509
- "f": 0.9075630252
510
- },
511
- "iobj": {
512
- "p": 0.7692307692,
513
- "r": 0.6666666667,
514
- "f": 0.7142857143
515
- },
516
- "appos": {
517
- "p": 0.8181818182,
518
- "r": 0.7105263158,
519
- "f": 0.7605633803
520
- },
521
- "obl:tmod": {
522
- "p": 0.5,
523
- "r": 0.3125,
524
- "f": 0.3846153846
525
- },
526
- "advmod:lmod": {
527
- "p": 0.7678571429,
528
- "r": 0.8958333333,
529
- "f": 0.8269230769
530
- },
531
- "xcomp": {
532
- "p": 0.8913043478,
533
- "r": 0.640625,
534
- "f": 0.7454545455
535
- },
536
- "expl": {
537
- "p": 0.9230769231,
538
- "r": 0.9230769231,
539
- "f": 0.9230769231
540
- },
541
- "list": {
542
- "p": 0.5714285714,
543
- "r": 0.2352941176,
544
- "f": 0.3333333333
545
- },
546
- "obl:lmod": {
547
- "p": 0.25,
548
- "r": 0.3333333333,
549
- "f": 0.2857142857
550
- },
551
- "parataxis": {
552
- "p": 0.0,
553
- "r": 0.0,
554
- "f": 0.0
555
- },
556
- "orphan": {
557
- "p": 0.0,
558
- "r": 0.0,
559
- "f": 0.0
560
- },
561
- "vocative": {
562
- "p": 0.0,
563
- "r": 0.0,
564
- "f": 0.0
565
- },
566
- "discourse": {
567
- "p": 0.0,
568
- "r": 0.0,
569
- "f": 0.0
570
- },
571
- "dislocated": {
572
- "p": 0.0,
573
- "r": 0.0,
574
- "f": 0.0
575
- },
576
- "compound": {
577
- "p": 0.0,
578
- "r": 0.0,
579
- "f": 0.0
580
  }
581
  },
582
- "ents_p": 0.8708487085,
583
- "ents_r": 0.8458781362,
584
- "ents_f": 0.8581818182,
585
- "ents_per_type": {
586
- "LOC": {
587
- "p": 0.854368932,
588
- "r": 0.9166666667,
589
- "f": 0.8844221106
590
- },
591
- "PER": {
592
- "p": 0.9100529101,
593
- "r": 0.9555555556,
594
- "f": 0.9322493225
595
- },
596
- "MISC": {
597
- "p": 0.8301886792,
598
- "r": 0.7272727273,
599
- "f": 0.7753303965
600
- },
601
- "ORG": {
602
- "p": 0.8611111111,
603
- "r": 0.7701863354,
604
- "f": 0.8131147541
605
  }
606
  },
607
- "coref_lea_f1": 0.4118366346,
608
- "coref_lea_precision": 0.4889169083,
609
- "coref_lea_recall": 0.3557507008,
610
- "nel_score": 0.801242236,
611
- "nel_score_desc": "micro F",
612
- "nel_micro_p": 0.9923076923,
613
- "nel_micro_r": 0.671875,
614
- "nel_micro_f": 0.801242236,
615
- "nel_macro_p": 0.993902439,
616
- "nel_macro_r": 0.6598989464,
617
- "nel_macro_f": 0.7815238616,
618
- "nel_f_per_type": {
619
- "MISC": {
620
- "p": 1.0,
621
- "r": 0.4117647059,
622
- "f": 0.5833333333
623
- },
624
- "PER": {
625
- "p": 1.0,
626
- "r": 0.7540983607,
627
- "f": 0.8598130841
628
- },
629
- "LOC": {
630
- "p": 1.0,
631
- "r": 0.8285714286,
632
- "f": 0.90625
633
- },
634
- "ORG": {
635
- "p": 0.9756097561,
636
- "r": 0.6451612903,
637
- "f": 0.7766990291
 
638
  }
639
  }
640
  },
641
- "sources": [
642
  {
643
- "name": "UD Danish DDT v2.11",
644
- "url": "https://github.com/UniversalDependencies/UD_Danish-DDT",
645
- "license": "CC BY-SA 4.0",
646
- "author": "Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
647
  },
648
  {
649
- "name": "DaNE",
650
- "url": "https://huggingface.co/datasets/dane",
651
- "license": "CC BY-SA 4.0",
652
- "author": "Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
653
  },
654
  {
655
- "name": "DaCoref",
656
- "url": "https://huggingface.co/datasets/alexandrainst/dacoref",
657
- "license": "CC BY-SA 4.0",
658
- "author": "Buch-Kromann, Matthias"
659
  },
660
  {
661
- "name": "DaNED",
662
- "url": "https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned",
663
- "license": "CC BY-SA 4.0",
664
- "author": "Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & S\u00f8gaard, A."
665
  },
666
  {
667
- "name": "vesteinn/DanskBERT",
668
- "author": "V\u00e9steinn Sn\u00e6bjarnarson",
669
- "url": "https://huggingface.co/vesteinn/DanskBERT",
670
- "license": "MIT"
671
  }
672
  ],
673
- "notes": "\n\n### Training\nThis model was trained using [spaCy](https://spacy.io) and logged to [Weights & Biases](https://wandb.ai/kenevoldsen/dacy-v0.2.0). You can find all the training logs [here](https://wandb.ai/kenevoldsen/dacy-v0.2.0)."
674
  }
 
1
  {
2
+ "lang":"da",
3
+ "name":"dacy_medium_trf",
4
+ "version":"0.2.0",
5
+ "description":"\n<a href=\"https://github.com/centre-for-humanities-computing/Dacy\"><img src=\"https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png\" width=\"175\" height=\"175\" align=\"right\" /></a>\n\n# DaCy medium\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved State-of-the-Art performance on parts-of-speech tagging and dependency \nparsing for Danish on the Danish Dependency treebank as well as competitive performance on named entity recognition, named entity disambiguation and coreference resolution. \nTo read more check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results. \nDaCy also contains guides on usage of the package as well as behavioural test for biases and robustness of Danish NLP pipelines.\n",
6
+ "author":"Kenneth Enevoldsen",
7
+ "email":"[email protected]",
8
+ "url":"https://chcaa.io/#/",
9
+ "license":"Apache-2.0",
10
+ "spacy_version":">=3.5.2,<3.6.0",
11
+ "spacy_git_version":"Unknown",
12
+ "vectors":{
13
+ "width":0,
14
+ "vectors":0,
15
+ "keys":0,
16
+ "name":null
17
  },
18
+ "labels":{
19
+ "transformer":[
20
+
21
+ ],
22
+ "tagger":[
23
  "ADJ",
24
  "ADP",
25
  "ADV",
 
38
  "VERB",
39
  "X"
40
  ],
41
+ "morphologizer":[
42
  "AdpType=Prep|POS=ADP",
43
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
44
  "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act",
 
198
  "Case=Gen|POS=NOUN",
199
  "POS=AUX|Tense=Pres|VerbForm=Part"
200
  ],
201
+ "parser":[
202
  "ROOT",
203
  "acl:relcl",
204
  "advcl",
 
232
  "punct",
233
  "xcomp"
234
  ],
235
+ "ner":[
236
  "LOC",
237
  "MISC",
238
  "ORG",
239
  "PER"
240
  ],
241
+ "coref":[
242
+
243
+ ],
244
+ "span_resolver":[
245
+
246
+ ],
247
+ "entity_linker":[
248
+
249
+ ]
250
  },
251
+ "pipeline":[
252
  "transformer",
253
  "tagger",
254
  "morphologizer",
 
260
  "span_cleaner",
261
  "entity_linker"
262
  ],
263
+ "components":[
264
  "transformer",
265
  "tagger",
266
  "morphologizer",
 
272
  "span_cleaner",
273
  "entity_linker"
274
  ],
275
+ "disabled":[
276
+
277
+ ],
278
+ "requirements":[
279
+ "spacy-experimental>=0.6.2,<0.7.0",
280
+ "spacy-transformers>=1.2.3,<1.3.0"
281
  ],
282
+ "performance":{
283
+ "token_acc":0.9992023928,
284
+ "token_p":0.9970089731,
285
+ "token_r":0.9977052779,
286
+ "token_f":0.9973570039,
287
+ "sents_p":0.9842105263,
288
+ "sents_r":0.992920354,
289
+ "sents_f":0.9885462555,
290
+ "tag_acc":0.9847290149,
291
+ "pos_acc":0.985677928,
292
+ "morph_acc":0.9814371257,
293
+ "morph_micro_p":0.9910058542,
294
+ "morph_micro_r":0.9876942662,
295
+ "morph_micro_f":0.989347289,
296
+ "morph_per_feat":{
297
+ "NumType":{
298
+ "p":0.987654321,
299
+ "r":0.9302325581,
300
+ "f":0.9580838323
301
+ },
302
+ "Degree":{
303
+ "p":0.9894736842,
304
+ "r":0.9715762274,
305
+ "f":0.9804432855
306
+ },
307
+ "Number":{
308
+ "p":0.9884148064,
309
+ "r":0.9859075536,
310
+ "f":0.987159588
311
+ },
312
+ "Definite":{
313
+ "p":0.9858490566,
314
+ "r":0.9837398374,
315
+ "f":0.9847933176
316
+ },
317
+ "Gender":{
318
+ "p":0.9869901547,
319
+ "r":0.9838766211,
320
+ "f":0.9854309286
321
+ },
322
+ "Mood":{
323
+ "p":0.9971126083,
324
+ "r":0.9942418426,
325
+ "f":0.9956751562
326
+ },
327
+ "Tense":{
328
+ "p":0.9906469213,
329
+ "r":0.9906469213,
330
+ "f":0.9906469213
331
+ },
332
+ "VerbForm":{
333
+ "p":0.9924670433,
334
+ "r":0.9918444166,
335
+ "f":0.9921556323
336
+ },
337
+ "Voice":{
338
+ "p":0.997012696,
339
+ "r":0.9955257271,
340
+ "f":0.9962686567
341
+ },
342
+ "AdpType":{
343
+ "p":0.9990689013,
344
+ "r":0.9972118959,
345
+ "f":0.9981395349
346
+ },
347
+ "PronType":{
348
+ "p":0.9954914337,
349
+ "r":0.9963898917,
350
+ "f":0.9959404601
351
+ },
352
+ "Case":{
353
+ "p":0.9968652038,
354
+ "r":0.9860465116,
355
+ "f":0.9914263445
356
+ },
357
+ "Person":{
358
+ "p":0.9930555556,
359
+ "r":0.9913344887,
360
+ "f":0.9921942758
361
+ },
362
+ "Number[psor]":{
363
+ "p":0.987804878,
364
+ "r":1.0,
365
+ "f":0.9938650307
366
+ },
367
+ "Poss":{
368
+ "p":0.987804878,
369
+ "r":1.0,
370
+ "f":0.9938650307
371
+ },
372
+ "PartType":{
373
+ "p":1.0,
374
+ "r":0.9962406015,
375
+ "f":0.9981167608
376
+ },
377
+ "Polite":{
378
+ "p":0.6666666667,
379
+ "r":0.6666666667,
380
+ "f":0.6666666667
381
+ },
382
+ "Reflex":{
383
+ "p":1.0,
384
+ "r":1.0,
385
+ "f":1.0
386
+ },
387
+ "Foreign":{
388
+ "p":0.5,
389
+ "r":0.2,
390
+ "f":0.2857142857
391
+ },
392
+ "Style":{
393
+ "p":1.0,
394
+ "r":1.0,
395
+ "f":1.0
396
+ },
397
+ "Abbr":{
398
+ "p":0.6666666667,
399
+ "r":1.0,
400
+ "f":0.8
401
  }
402
  },
403
+ "dep_uas":0.9083920564,
404
+ "dep_las":0.883349834,
405
+ "dep_las_per_type":{
406
+ "nummod":{
407
+ "p":0.7948717949,
408
+ "r":0.8230088496,
409
+ "f":0.8086956522
410
+ },
411
+ "amod":{
412
+ "p":0.897810219,
413
+ "r":0.9027522936,
414
+ "f":0.9002744739
415
+ },
416
+ "nmod":{
417
+ "p":0.7712418301,
418
+ "r":0.7729257642,
419
+ "f":0.772082879
420
+ },
421
+ "nsubj":{
422
+ "p":0.9510638298,
423
+ "r":0.946031746,
424
+ "f":0.9485411141
425
+ },
426
+ "flat":{
427
+ "p":0.9285714286,
428
+ "r":0.9680851064,
429
+ "f":0.9479166667
430
+ },
431
+ "cc":{
432
+ "p":0.8681672026,
433
+ "r":0.8940397351,
434
+ "f":0.88091354
435
+ },
436
+ "conj":{
437
+ "p":0.8862275449,
438
+ "r":0.8554913295,
439
+ "f":0.8705882353
440
+ },
441
+ "root":{
442
+ "p":0.926056338,
443
+ "r":0.9309734513,
444
+ "f":0.9285083848
445
+ },
446
+ "advmod":{
447
+ "p":0.8871715611,
448
+ "r":0.8605697151,
449
+ "f":0.8736681887
450
+ },
451
+ "mark":{
452
+ "p":0.9148471616,
453
+ "r":0.9331848552,
454
+ "f":0.9239250276
455
+ },
456
+ "aux":{
457
+ "p":0.9875389408,
458
+ "r":0.9753846154,
459
+ "f":0.9814241486
460
+ },
461
+ "ccomp":{
462
+ "p":0.7764705882,
463
+ "r":0.835443038,
464
+ "f":0.8048780488
465
+ },
466
+ "case":{
467
+ "p":0.9348986126,
468
+ "r":0.9192025184,
469
+ "f":0.926984127
470
+ },
471
+ "det":{
472
+ "p":0.9409448819,
473
+ "r":0.9637096774,
474
+ "f":0.9521912351
475
+ },
476
+ "obl":{
477
+ "p":0.8476821192,
478
+ "r":0.8114104596,
479
+ "f":0.8291497976
480
+ },
481
+ "nmod:poss":{
482
+ "p":0.8181818182,
483
+ "r":0.8256880734,
484
+ "f":0.8219178082
485
+ },
486
+ "obj":{
487
+ "p":0.8943533698,
488
+ "r":0.9352380952,
489
+ "f":0.9143389199
490
+ },
491
+ "cop":{
492
+ "p":0.8944099379,
493
+ "r":0.8834355828,
494
+ "f":0.8888888889
495
+ },
496
+ "acl:relcl":{
497
+ "p":0.8343195266,
498
+ "r":0.7704918033,
499
+ "f":0.8011363636
500
+ },
501
+ "advcl":{
502
+ "p":0.6742857143,
503
+ "r":0.7564102564,
504
+ "f":0.7129909366
505
+ },
506
+ "dep":{
507
+ "p":0.1136363636,
508
+ "r":0.3333333333,
509
+ "f":0.1694915254
510
+ },
511
+ "compound:prt":{
512
+ "p":0.6666666667,
513
+ "r":0.5882352941,
514
+ "f":0.625
515
+ },
516
+ "fixed":{
517
+ "p":0.9473684211,
518
+ "r":0.8709677419,
519
+ "f":0.9075630252
520
+ },
521
+ "iobj":{
522
+ "p":0.7692307692,
523
+ "r":0.6666666667,
524
+ "f":0.7142857143
525
+ },
526
+ "appos":{
527
+ "p":0.8181818182,
528
+ "r":0.7105263158,
529
+ "f":0.7605633803
530
+ },
531
+ "obl:tmod":{
532
+ "p":0.5,
533
+ "r":0.3125,
534
+ "f":0.3846153846
535
+ },
536
+ "advmod:lmod":{
537
+ "p":0.7678571429,
538
+ "r":0.8958333333,
539
+ "f":0.8269230769
540
+ },
541
+ "xcomp":{
542
+ "p":0.8913043478,
543
+ "r":0.640625,
544
+ "f":0.7454545455
545
+ },
546
+ "expl":{
547
+ "p":0.9230769231,
548
+ "r":0.9230769231,
549
+ "f":0.9230769231
550
+ },
551
+ "list":{
552
+ "p":0.5714285714,
553
+ "r":0.2352941176,
554
+ "f":0.3333333333
555
+ },
556
+ "obl:lmod":{
557
+ "p":0.25,
558
+ "r":0.3333333333,
559
+ "f":0.2857142857
560
+ },
561
+ "parataxis":{
562
+ "p":0.0,
563
+ "r":0.0,
564
+ "f":0.0
565
+ },
566
+ "orphan":{
567
+ "p":0.0,
568
+ "r":0.0,
569
+ "f":0.0
570
+ },
571
+ "vocative":{
572
+ "p":0.0,
573
+ "r":0.0,
574
+ "f":0.0
575
+ },
576
+ "discourse":{
577
+ "p":0.0,
578
+ "r":0.0,
579
+ "f":0.0
580
+ },
581
+ "dislocated":{
582
+ "p":0.0,
583
+ "r":0.0,
584
+ "f":0.0
585
+ },
586
+ "compound":{
587
+ "p":0.0,
588
+ "r":0.0,
589
+ "f":0.0
590
  }
591
  },
592
+ "ents_p":0.8708487085,
593
+ "ents_r":0.8458781362,
594
+ "ents_f":0.8581818182,
595
+ "ents_per_type":{
596
+ "MISC":{
597
+ "p":0.8301886792,
598
+ "r":0.7272727273,
599
+ "f":0.7753303965
600
+ },
601
+ "ORG":{
602
+ "p":0.8611111111,
603
+ "r":0.7701863354,
604
+ "f":0.8131147541
605
+ },
606
+ "LOC":{
607
+ "p":0.854368932,
608
+ "r":0.9166666667,
609
+ "f":0.8844221106
610
+ },
611
+ "PER":{
612
+ "p":0.9100529101,
613
+ "r":0.9555555556,
614
+ "f":0.9322493225
615
  }
616
  },
617
+ "lemma_acc":0.9419805438,
618
+ "coref_lea_f1":0.4118366346,
619
+ "coref_lea_precision":0.4889169083,
620
+ "coref_lea_recall":0.3557507008,
621
+ "nel_score":0.801242236,
622
+ "nel_score_desc":"micro F",
623
+ "nel_micro_p":0.9923076923,
624
+ "nel_micro_r":0.671875,
625
+ "nel_micro_f":0.801242236,
626
+ "nel_macro_p":0.993902439,
627
+ "nel_macro_r":0.6598989464,
628
+ "nel_macro_f":0.7815238616,
629
+ "nel_f_per_type":{
630
+ "MISC":{
631
+ "p":1.0,
632
+ "r":0.4117647059,
633
+ "f":0.5833333333
634
+ },
635
+ "PER":{
636
+ "p":1.0,
637
+ "r":0.7540983607,
638
+ "f":0.8598130841
639
+ },
640
+ "LOC":{
641
+ "p":1.0,
642
+ "r":0.8285714286,
643
+ "f":0.90625
644
+ },
645
+ "ORG":{
646
+ "p":0.9756097561,
647
+ "r":0.6451612903,
648
+ "f":0.7766990291
649
  }
650
  }
651
  },
652
+ "sources":[
653
  {
654
+ "name":"UD Danish DDT v2.11",
655
+ "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
656
+ "license":"CC BY-SA 4.0",
657
+ "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
658
  },
659
  {
660
+ "name":"DaNE",
661
+ "url":"https://huggingface.co/datasets/dane",
662
+ "license":"CC BY-SA 4.0",
663
+ "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
664
  },
665
  {
666
+ "name":"DaCoref",
667
+ "url":"https://huggingface.co/datasets/alexandrainst/dacoref",
668
+ "license":"CC BY-SA 4.0",
669
+ "author":"Buch-Kromann, Matthias"
670
  },
671
  {
672
+ "name":"DaNED",
673
+ "url":"https://danlp-alexandra.readthedocs.io/en/stable/docs/datasets.html#daned",
674
+ "license":"CC BY-SA 4.0",
675
+ "author":"Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & S\u00f8gaard, A."
676
  },
677
  {
678
+ "name":"vesteinn/DanskBERT",
679
+ "author":"V\u00e9steinn Sn\u00e6bjarnarson",
680
+ "url":"https://huggingface.co/vesteinn/DanskBERT",
681
+ "license":"MIT"
682
  }
683
  ],
684
+ "notes":"\n\n### Training\nThis model was trained using [spaCy](https://spacy.io) and logged to [Weights & Biases](https://wandb.ai/kenevoldsen/dacy-v0.2.0). You can find all the training logs [here](https://wandb.ai/kenevoldsen/dacy-v0.2.0)."
685
  }
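
Note on the `meta.json` diff: most of the +423/-412 lines differ only in formatting (`"lang": "da"` versus `"lang":"da"`, empty lists split over several lines). The substantive changes visible above are the added `lemma_acc` score, the extended `description`, and the reordered `requirements` list and `ents_per_type` keys. A minimal sketch of a comparison that ignores whitespace and key order, using only the standard `json` module; `flatten` and `semantic_diff` are hypothetical helpers, and the two strings are shortened excerpts rather than the full files:

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into {dotted.path: value}; lists stay as leaf values."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path))
    else:
        flat[prefix] = obj
    return flat


def semantic_diff(old_text, new_text):
    """Return (added, removed, changed) key paths, ignoring whitespace and key order."""
    old = flatten(json.loads(old_text))
    new = flatten(json.loads(new_text))
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


# Shortened excerpts of the old and new meta.json from this commit.
old_meta = """{
  "version": "0.2.0",
  "requirements": ["spacy-transformers>=1.2.3,<1.3.0", "spacy-experimental>=0.6.2,<0.7.0"],
  "performance": {"ents_f": 0.8581818182}
}"""
new_meta = (
    '{"version":"0.2.0",'
    '"requirements":["spacy-experimental>=0.6.2,<0.7.0","spacy-transformers>=1.2.3,<1.3.0"],'
    '"performance":{"ents_f":0.8581818182,"lemma_acc":0.9419805438}}'
)

added, removed, changed = semantic_diff(old_meta, new_meta)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Lists are compared in order here, so the reordered `requirements` entry is reported as changed; sorting list values before comparison would hide that if ordering is not meaningful.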