---
license: cc0-1.0
language:
- fi
pipeline_tag: text-classification
thumbnail: https://raw.githubusercontent.com/NatLibFi/FintoAI/main/ai.finto.fi/static/img/finto-ai-social.png
tags:
- glam
- lam
- subject indexing
- annif
---
# FintoAI-data-KAUNO
This repository holds the Annif projects that use the KAUNO vocabulary and are served by the [Finto AI service](https://ai.finto.fi/).
This repository is mirrored from GitHub to the 🤗 Hugging Face Hub.
The GitHub repository contains only the configurations for maintaining the projects, not the model files (see below).
## Models
The data files for projects and vocabularies are stored in the
[`/data`](https://huggingface.co/juhoinkinen/FintoAI-data-KAUNO/tree/main/data)
directory of this repository in the 🤗 Hugging Face Hub.
## DVC pipeline
The projects are trained and evaluated using a [DVC (Data Version Control) pipeline](https://dvc.org/doc/start/data-management/data-pipelines) defined in [dvc.yaml](./dvc.yaml).
The pipeline takes care of
1. installing Annif in a venv,
2. loading the vocabulary,
3. training the projects,
4. evaluating the projects.
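
As a rough illustration of how these four steps could map onto DVC stages, here is a minimal sketch of a `dvc.yaml`. It is not a copy of this repository's pipeline: the stage names, the project ID `kauno-example-fi`, the vocabulary ID `kauno`, and every file path are hypothetical placeholders. Only the general `dvc.yaml` layout (`stages`, `cmd`, `deps`, `outs`, `metrics`) and the Annif subcommands `load-vocab`, `train`, and `eval` are assumed to match the real tools.

```yaml
# Hypothetical sketch only: names and paths are placeholders,
# not taken from this repository's actual dvc.yaml.
stages:
  install:
    # 1. Install Annif into a local virtual environment.
    cmd: python3 -m venv venv && venv/bin/pip install annif
    outs:
      - venv:
          cache: false   # keep the venv out of the DVC cache
  load-vocab:
    # 2. Load the vocabulary (file name is a placeholder).
    cmd: venv/bin/annif load-vocab kauno corpora/kauno-skos.ttl
    deps:
      - venv
      - corpora/kauno-skos.ttl
    outs:
      - data/vocabs/kauno
  train:
    # 3. Train one project (project ID and corpus are placeholders).
    cmd: venv/bin/annif train kauno-example-fi corpora/train.tsv.gz
    deps:
      - projects.cfg
      - corpora/train.tsv.gz
      - data/vocabs/kauno
    outs:
      - data/projects/kauno-example-fi
  eval:
    # 4. Evaluate the trained project and record the metrics as JSON.
    cmd: venv/bin/annif eval kauno-example-fi corpora/test.tsv.gz --metrics-file metrics.json
    deps:
      - corpora/test.tsv.gz
      - data/projects/kauno-example-fi
    metrics:
      - metrics.json:
          cache: false
```

Because each stage lists the previous stage's outputs among its `deps`, DVC can infer the execution order and skip stages whose inputs have not changed.
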
When the necessary vocabulary and training corpora are in place, the pipeline can be run using the command `dvc repro`.
For more information about using DVC with Annif projects see the [DVC exercise of Annif tutorial](https://github.com/NatLibFi/Annif-tutorial/blob/master/exercises/OPT_dvc.md).