Code required to train large (lg) floret embeddings for Latin on LatinCy Assets data. Based on the spaCy project "Train floret vectors from Wikipedia and OSCAR".

| Feature | Description |
| --- | --- |
| Name | la_vectors_floret_lg |
| Version | 3.8.0 |
| spaCy | >=3.8.3,<3.9.0 |
| Default Pipeline | |
| Components | |
| Vectors | -1 keys, 200000 unique vectors (300 dimensions) |
| Sources | UD_Latin-Perseus, UD_Latin-PROIEL, UD_Latin-ITTB, UD_Latin-LLCT, UD_Latin-UDante, Wikipedia, OSCAR, Corpus Thomisticum, The Latin Library, CLTK-Tesserae Latin, Patrologia Latina |
| License | MIT |
| Author | Patrick J. Burns |
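
Because the package is pinned to the spaCy version range listed above, it can help to check the installed spaCy version before loading the vectors. The following is a minimal sketch, not part of the original project: the helper names (`parse_release`, `is_compatible`, `installed_spacy_is_compatible`) are hypothetical, and the comparison is a simplified numeric check rather than a full PEP 440 implementation.

```python
from importlib.metadata import PackageNotFoundError, version

# Bounds copied from the spaCy requirement listed above: >=3.8.3,<3.9.0
MIN_VERSION = (3, 8, 3)  # inclusive lower bound
MAX_VERSION = (3, 9, 0)  # exclusive upper bound


def parse_release(version_string):
    """Turn a string like '3.8.3' into (3, 8, 3).

    Simplification (an assumption of this sketch): only the leading digits
    of the first three dot-separated parts are used, so suffixes such as
    'rc1' or '.dev0' are ignored.
    """
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '3.8' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_compatible(version_string):
    """Return True if the version falls inside the required range."""
    return MIN_VERSION <= parse_release(version_string) < MAX_VERSION


def installed_spacy_is_compatible():
    """Check the locally installed spaCy; None means spaCy is not installed."""
    try:
        return is_compatible(version("spacy"))
    except PackageNotFoundError:
        return None
```

If the check passes and the package has been installed under the name given in the table, it should be loadable with `spacy.load("la_vectors_floret_lg")`.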