Model Card for KartonBERT_base_uncased_v1

This is a classic Polish BERT model trained with the masked language modeling (MLM) objective. It comes with a custom BWPT tokenizer with a vocabulary of about 23k tokens. While not ideal, it performs well on certain downstream tasks and serves as a checkpoint in my work.

Model Description

How to use the model for the fill-mask task

Use the code below to get started with the model.

from transformers import pipeline

tokenizer_kwargs={'truncation': True, 'max_length': 512}
model = pipeline('fill-mask', model='OrlikB/KartonBERT_base_uncased_v1', tokenizer_kwargs=tokenizer_kwargs)

model("Kartony to inaczej [MASK], które produkowane są z tektury.")

# Output
[{'score': 0.12927177548408508,
  'token': 5324,
  'token_str': 'materiały',
  'sequence': 'kartony to inaczej materiały, które produkowane są z tektury.'},
 {'score': 0.0821441262960434,
  'token': 2403,
  'token_str': 'produkty',
  'sequence': 'kartony to inaczej produkty, które produkowane są z tektury.'},
 {'score': 0.06760794669389725,
  'token': 392,
  'token_str': 'te',
  'sequence': 'kartony to inaczej te, które produkowane są z tektury.'},
 {'score': 0.06753358244895935,
  'token': 20289,
  'token_str': 'pudełka',
  'sequence': 'kartony to inaczej pudełka, które produkowane są z tektury.'},
 {'score': 0.04844100773334503,
  'token': 16715,
  'token_str': 'wyroby',
  'sequence': 'kartony to inaczej wyroby, które produkowane są z tektury.'}]
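
The pipeline returns a list of dictionaries, so the predictions can be post-processed with plain Python. The sketch below is only an illustration: it copies the scores and tokens from the output above and keeps the candidates above a score threshold. The helper name `filter_predictions` and the `min_score` default of 0.05 are assumptions made for this example, not part of the model or of `transformers`.

```python
# Predictions copied from the output above (trimmed to the fields used here).
predictions = [
    {"score": 0.12927177548408508, "token_str": "materiały"},
    {"score": 0.0821441262960434, "token_str": "produkty"},
    {"score": 0.06760794669389725, "token_str": "te"},
    {"score": 0.06753358244895935, "token_str": "pudełka"},
    {"score": 0.04844100773334503, "token_str": "wyroby"},
]


def filter_predictions(preds, min_score=0.05):
    """Return (token, score) pairs with score >= min_score, best first.

    Hypothetical helper for illustration; the threshold is arbitrary.
    """
    kept = [(p["token_str"], p["score"]) for p in preds if p["score"] >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Print the surviving candidates with scores rounded for readability.
for token, score in filter_predictions(predictions):
    print(f"{token}: {score:.3f}")
```

The list returned by the fill-mask call can be passed to the helper directly in place of the copied sample, since it only reads the `score` and `token_str` keys.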

Model size: 104M params (Safetensors)
Tensor type: F32