language: it
license: mit

ChefBERTo 👨‍🍳

chefberto-italian-cased is a BERT model obtained by adapting bert-base-italian-xxl-cased to the cooking domain: the base model was further trained with the masked language modeling (MLM) objective on Italian cooking recipes, approximately 50k sentences (2.6M words).

Author: Cristiano De Nobili ([@denocris on Twitter](https://twitter.com/denocris), LinkedIn) for VINHOOD.

Perplexity

Perplexity was measured on a test set of 9k sentences about food (lower is better).

| Model | Perplexity |
|---|---|
| chefberto-italian-cased | 1.84 |
| bert-base-italian-xxl-cased | 2.85 |
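
The card does not say how these perplexities were computed. A common convention for masked language models, assumed here, is the exponential of the mean cross-entropy loss over the masked tokens. Under that assumption, a perplexity of 1.84 corresponds to a mean loss of about 0.61 and 2.85 to about 1.05. The sketch below shows only that final step; the function name `perplexity` and the example losses are hypothetical, not taken from the author's evaluation code.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log).

    Assumption: perplexity = exp(mean loss over masked tokens).
    """
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, for illustration only.
example_nlls = [0.5, 0.7, 0.6]
print(round(perplexity(example_nlls), 3))
```

In practice the per-token losses would come from running the model on the test sentences with a fraction of tokens masked, so the result depends on the masking rate and random seed.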

Usage

from transformers import AutoModel, AutoTokenizer
model_name = "vinhood/chefberto-italian-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
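
Since the model was tuned with the MLM objective, a natural way to try it is to predict a masked word. The following is a minimal sketch using the `fill-mask` pipeline from transformers, assuming the published checkpoint includes the masked-language-modeling head. The helper names `format_predictions` and `suggest_words` and the `___` placeholder convention are hypothetical, introduced only for this example.

```python
def format_predictions(predictions):
    """Turn fill-mask pipeline output into (token, score) pairs, best first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 4)) for p in ranked]


def suggest_words(text_with_blank, top_k=5,
                  model_name="vinhood/chefberto-italian-cased"):
    """Fill the single '___' placeholder in the text with the model's top guesses."""
    # Imported here so format_predictions stays usable without transformers.
    from transformers import pipeline

    # Downloads the model on first use.
    fill_mask = pipeline("fill-mask", model=model_name)
    # Replace our placeholder with the tokenizer's own mask token.
    text = text_with_blank.replace("___", fill_mask.tokenizer.mask_token)
    return format_predictions(fill_mask(text, top_k=top_k))
```

For instance, `suggest_words("Cuocere la pasta in abbondante ___ salata.")` would return up to five candidate words with their scores; the sentence is an illustrative input, not one from the author's test set.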