This is a very small BERT model, intended as a base for later fine-tuning on URL analysis tasks.

It is an updated version of the earlier base model for URL analysis.

Old version: https://huggingface.co/CrabInHoney/urlbert-tiny-base-v2
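Since the model is meant as a base for fine-tuning, here is a minimal sketch of how it might be loaded with a classification head. The two labels (benign, malicious) and the helper name load_for_classification are assumptions made for illustration; the card does not define a downstream task or label set. The classification head created this way is randomly initialized and needs training before its predictions mean anything.

```python
# Hypothetical label set for illustration; the model card does not define one.
ID2LABEL = {0: "benign", 1: "malicious"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def load_for_classification(model_name="CrabInHoney/urlbert-tiny-base-v3"):
    """Load the tokenizer and the base weights with a new, untrained classification head."""
    # Imported lazily so the label mapping above is usable without transformers installed.
    from transformers import BertTokenizerFast, BertForSequenceClassification

    tokenizer = BertTokenizerFast.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(ID2LABEL),  # size of the new output layer
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Calling load_for_classification() downloads the weights and returns a tokenizer and model that can be passed to a training loop of your choice.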

Model size: 3.69M params

Tensor type: F32
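As a rough back-of-the-envelope check, the two figures above give the approximate size of the weights. This sketch assumes the rounded parameter count from the card (the exact count is not stated) and ignores file-format overhead.

```python
# Figures taken from the model card: 3.69M parameters stored as F32.
NUM_PARAMS = 3_690_000  # rounded value from the card, not an exact count
BYTES_PER_F32 = 4       # a 32-bit float occupies 4 bytes


def weights_size_mb(num_params, bytes_per_param=BYTES_PER_F32):
    """Return the approximate weight size in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bytes_per_param / 1_000_000


size_mb = weights_size_mb(NUM_PARAMS)
print(f"Approximate weight size: {size_mb:.2f} MB")
```

Under these assumptions the weights come to roughly 15 MB.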

Test example:

from transformers import BertTokenizerFast, BertForMaskedLM, pipeline
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

model_name = "CrabInHoney/urlbert-tiny-base-v3"

tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForMaskedLM.from_pretrained(model_name)
model.to(device)

# The pipeline takes a device index: 0 for the first GPU, -1 for the CPU.
fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1
)

# URLs with one [MASK] token for the model to fill in.
sentences = [
    "http://example.[MASK]/"
]

for sentence in sentences:
    print(f"\nInput sentence: {sentence}")
    results = fill_mask(sentence)
    # Each result holds a candidate token and its probability.
    for result in results:
        token_str = result['token_str']
        score = result['score']
        print(f"Predicted token: {token_str}, probability: {score:.4f}")

Output:

Input sentence: http://example.[MASK]/
Predicted token: com, probability: 0.7018
Predicted token: org, probability: 0.1191
Predicted token: nl, probability: 0.0406
Predicted token: net, probability: 0.0294
Predicted token: ca, probability: 0.0190
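The predictions above can be turned back into a completed URL. The sketch below uses a hard-coded list with the scores from the sample output in place of a live fill_mask call, so it runs without downloading the model; the helper name fill_url is an assumption for illustration, not part of any library.

```python
# Scores copied from the sample output above; in practice this list
# would be the return value of fill_mask(masked_url).
sample_results = [
    {"token_str": "com", "score": 0.7018},
    {"token_str": "org", "score": 0.1191},
    {"token_str": "nl", "score": 0.0406},
    {"token_str": "net", "score": 0.0294},
    {"token_str": "ca", "score": 0.0190},
]


def fill_url(masked_url, results, mask_token="[MASK]"):
    """Return (completed URL, score) for the highest-scoring prediction."""
    best = max(results, key=lambda r: r["score"])
    # Replace only the first mask token with the predicted string.
    completed = masked_url.replace(mask_token, best["token_str"], 1)
    return completed, best["score"]


url, score = fill_url("http://example.[MASK]/", sample_results)
print(f"{url} ({score:.4f})")
```

Taking the maximum explicitly, rather than relying on the first list element, keeps the helper correct even if the results are passed in a different order.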
