---
tags:
- spacy
language:
- la
license: mit
---
Code required to train large (`lg`) floret embeddings for Latin on LatinCy Assets data. It is based on the spaCy project [Train floret vectors from Wikipedia and OSCAR](https://github.com/explosion/projects/tree/v3/pipelines/floret_wiki_oscar_vectors).

| Feature | Description |
| --- | --- |
| **Name** | `la_vectors_floret_lg` |
| **Version** | `3.8.0` |
| **spaCy** | `>=3.8.3,<3.9.0` |
| **Default Pipeline** | |
| **Components** | |
| **Vectors** | -1 keys, 200000 unique vectors (300 dimensions) |
| **Sources** | UD_Latin-Perseus<br>UD_Latin-PROIEL<br>UD_Latin-ITTB<br>UD_Latin-LLCT<br>UD_Latin-UDante<br>Wikipedia<br>OSCAR<br>Corpus Thomisticum<br>The Latin Library<br>CLTK-Tesserae Latin<br>Patrologia Latina |
| **License** | `MIT` |
| **Author** | [Patrick J. Burns](https://diyclassics.github.io/) |
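
The figures in the table can be checked against an installed copy of the package. The following is a minimal sketch, not part of the original project: it assumes the package is installed under the name `la_vectors_floret_lg` alongside a compatible spaCy version, and the helper `describe_vectors` is a hypothetical name introduced here for illustration. It returns `None` when spaCy or the package is missing instead of raising an error.

```python
import importlib.util

# Package name taken from the table above; assumed to be installed locally.
PACKAGE_NAME = "la_vectors_floret_lg"


def describe_vectors(name=PACKAGE_NAME):
    """Return basic facts about the vectors table, or None if unavailable.

    Hypothetical helper for illustration only.
    """
    # Skip gracefully when spaCy itself is not installed.
    if importlib.util.find_spec("spacy") is None:
        return None

    import spacy

    try:
        nlp = spacy.load(name)
    except OSError:
        # spaCy raises OSError when the named package cannot be found.
        return None

    vectors = nlp.vocab.vectors
    rows, dims = vectors.shape
    return {
        "mode": vectors.mode,       # vectors mode reported by spaCy
        "rows": rows,               # number of rows in the vectors table
        "dimensions": dims,         # width of each vector
        "n_keys": vectors.n_keys,   # spaCy reports -1 for floret vectors
    }


info = describe_vectors()
print(info)
```

The key count of -1 in the table is expected for floret vectors: spaCy computes each token's vector from hashed subword n-grams rather than looking it up under a per-word key, so out-of-vocabulary Latin forms still receive a vector.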