GPT2 Basque small model Version 2 (Uncased)

Prerequisites

transformers==4.19.2

Model architecture

This model has approximately half as many parameters as the GPT2 base model.
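To put the size comparison in context, the following sketch counts the parameters of a GPT-2 style decoder from its configuration values. The function name `gpt2_param_count` is hypothetical, and the sketch assumes the standard GPT-2 layout (tied output head, MLP inner size of 4 × the hidden size). The numbers plugged in are the published GPT2 base settings; this model's own configuration is not stated here, so its values would need to be read from the model's config before making the comparison.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count parameters of a GPT-2 style decoder (output head tied to token embeddings)."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a weight and a bias vector.
    layer_norm = 2 * n_embd
    # Attention: fused QKV projection, then the output projection (weights + biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_layer = 2 * layer_norm + attention + mlp
    # Final LayerNorm after the last block.
    return embeddings + n_layer * per_layer + layer_norm


# GPT2 base: 50,257 tokens, 1,024 positions, hidden size 768, 12 layers.
gpt2_base = gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12)
print(f"{gpt2_base:,}")  # 124,439,808
```

Passing this model's configuration values to the same function gives a figure that can be compared directly against the GPT2 base count.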

Tokenizer

The model uses a BPE tokenizer with a vocabulary size of 50,000.

Training Data

  • Subset of CC-100/eu: Monolingual Datasets from Web Crawl Data
  • Subset of OSCAR

Usage

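from transformers import pipeline

# Load a text-generation pipeline with this model from the Hugging Face Hub.
generator = pipeline('text-generation', model='ClassCat/gpt2-small-basque-v2')
# Generate 5 continuations of the Basque prompt; max_length counts the prompt tokens too.
generator("Zein da zure ", max_length=50, num_return_sequences=5)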
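The pipeline call above returns a list of dicts, each with a `generated_text` key. As a minimal sketch, the snippet below wraps that call so the results come back as plain strings and are repeatable across runs. The helper names `extract_texts` and `generate_samples` are hypothetical, and passing `do_sample=True` is an assumption made explicit here rather than something the model card specifies.

```python
def extract_texts(outputs):
    """Pull the generated strings out of the pipeline's list of dicts."""
    return [item["generated_text"] for item in outputs]


def generate_samples(prompt="Zein da zure ", n=5, seed=42):
    """Generate n continuations of prompt and return them as a list of strings."""
    # Imported lazily so extract_texts can be used without transformers installed.
    from transformers import pipeline, set_seed

    set_seed(seed)  # fix the random seed so sampled outputs are repeatable
    generator = pipeline("text-generation", model="ClassCat/gpt2-small-basque-v2")
    # Assumption: sampling is enabled explicitly, since returning several
    # distinct sequences requires sampling (or beam search).
    outputs = generator(prompt, max_length=50, num_return_sequences=n, do_sample=True)
    return extract_texts(outputs)
```

Calling `generate_samples()` downloads the model on first use and returns five strings, each beginning with the prompt.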