---
tags:
  - spacy
language:
  - la
license: mit
---

Code required to train md floret embeddings for Latin on LatinCy Assets data. Based on the spaCy project "Train floret vectors from Wikipedia and OSCAR".

| Feature | Description |
| --- | --- |
| Name | la_vectors_floret_md |
| Version | 3.8.0 |
| spaCy | >=3.8.3,<3.9.0 |
| Default Pipeline | |
| Components | |
| Vectors | -1 keys, 50000 unique vectors (300 dimensions) |
| Sources | UD_Latin-Perseus, UD_Latin-PROIEL, UD_Latin-ITTB, UD_Latin-LLCT, UD_Latin-UDante, Wikipedia, OSCAR, Corpus Thomisticum, The Latin Library, CLTK-Tesserae Latin, Patrologia Latina |
| License | MIT |
| Author | Patrick J. Burns |
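
The "-1 keys, 50000 unique vectors" entry reflects how floret vectors are stored: there is no per-word key table (spaCy reports the key count as -1 in that case), and a word's vector is instead built from character n-grams hashed into a fixed number of rows. The following is a minimal pure-Python sketch of that idea, not floret's or spaCy's actual implementation. The function names, the n-gram range (3 to 5), the use of two SHA-256-based hashes, the averaging step, and the toy 50 x 4 table are all assumptions chosen for illustration; the real table has 50000 rows of 300 dimensions and uses its own hash function and settings.

```python
import hashlib


def char_ngrams(word, min_n=3, max_n=5):
    """Return the bracketed word plus its character n-grams.

    '<' and '>' mark word boundaries. The n-gram range is an assumed setting.
    """
    marked = f"<{word}>"
    grams = [marked]
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def bucket_ids(gram, n_rows, n_hashes=2):
    """Map one n-gram to n_hashes row indices.

    Seeded SHA-256 is a stand-in here; it is NOT the hash floret uses.
    """
    ids = []
    for seed in range(n_hashes):
        digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
        ids.append(int.from_bytes(digest[:8], "big") % n_rows)
    return ids


def make_table(n_rows, dim):
    """Build a deterministic toy table; a trained table holds learned values."""
    return [[((r * 31 + d * 17) % 7 - 3) / 10 for d in range(dim)]
            for r in range(n_rows)]


def word_vector(word, table):
    """Average the rows selected by every n-gram hash (assumed reduction).

    No per-word key is looked up, so any string gets a vector.
    """
    dim = len(table[0])
    total = [0.0] * dim
    count = 0
    for gram in char_ngrams(word):
        for row in bucket_ids(gram, len(table)):
            count += 1
            for d in range(dim):
                total[d] += table[row][d]
    return [value / count for value in total]


# Toy table: 50 rows x 4 dimensions instead of 50000 x 300.
table = make_table(n_rows=50, dim=4)
v_common = word_vector("amor", table)
v_rare = word_vector("amoribusque", table)  # no stored key needed for this form
```

Because every vector is assembled from subword rows, an inflected or misspelled Latin form that never appeared in the training data still receives a vector, which is the practical reason the table can stay at a fixed number of rows.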