---
language: hu
license: apache-2.0
datasets:
- wikipedia
tags:
- generated_from_keras_callback
- hubert
model-index:
- name: hubert-tiny-wiki-seq128
  results: []
---

# hubert-tiny-wiki-seq128

The fully trained model, which includes the second phase of training, is available here: [SzegedAI/hubert-tiny-wiki](https://huggingface.co/SzegedAI/hubert-tiny-wiki)

This model was trained from scratch on the Wikipedia subset of the Hungarian Webcorpus 2.0, using masked language modeling (MLM) and sentence order prediction (SOP) objectives.
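
A minimal usage sketch for masked-token prediction is shown below. The repository id `SzegedAI/hubert-tiny-wiki-seq128` is an assumption inferred from the model name; only the fully trained `SzegedAI/hubert-tiny-wiki` link above is confirmed by this card.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

model_id = "SzegedAI/hubert-tiny-wiki-seq128"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForMaskedLM.from_pretrained(model_id)

# Hungarian: "Budapest is the [MASK] of Hungary."
inputs = tokenizer("Budapest Magyarország [MASK].", return_tensors="tf")
logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Locate the [MASK] position and decode the highest-scoring token.
mask_pos = int(tf.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0, 0])
predicted_id = int(tf.argmax(logits[0, mask_pos]))
print(tokenizer.decode([predicted_id]))
```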

### Pre-training parameters

- Training steps: 500,000
- Sequence length: 128 (the architecture supports sequences up to 512; see the note below)
- Batch size: 1024
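
Since this checkpoint only completed the first, 128-token phase of training (the second, longer-sequence phase belongs to the fully trained model linked above), position embeddings beyond index 128 are presumably undertrained here. A minimal sketch of truncating inputs accordingly; the repository id and sample text are illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SzegedAI/hubert-tiny-wiki-seq128")  # assumed id

# Keep inputs within the 128 tokens this checkpoint saw during pre-training;
# the 512-token capacity is only fully exercised by the second training phase.
inputs = tokenizer("hosszú magyar szöveg ...", truncation=True,
                   max_length=128, return_tensors="tf")
print(inputs["input_ids"].shape)
```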

### Framework versions

- Transformers 4.21.3
- TensorFlow 2.10.0
- Datasets 2.4.0
- Tokenizers 0.12.1

# Acknowledgement
[![Artificial Intelligence - National Laboratory - Hungary](https://milab.tk.hu/uploads/images/milab_logo_en.png)](https://mi.nemzetilabor.hu/)