---
datasets:
- AILaborant/crazy_tg
- AILaborant/crazy_tg_tiny
language:
- ru
pipeline_tag: text2text-generation
---
A small language model (Russian only).
Created to emulate a very simple one-way dialogue.
WARNING: the model can swear!
It was trained from scratch on two T4 GPUs. Final training time: 1 hour 2 minutes.
The model consists of 3 stacked transformer blocks, forming 6 layers in total.
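
One possible reading of the "3 blocks, 6 layers" description is that each block counts as two layers: a self-attention sublayer and a feed-forward sublayer. The card does not confirm this breakdown, so the sketch below is only an illustration under that assumption. The names `Layer`, `SUBLAYERS_PER_BLOCK`, and `build_layer_stack` are hypothetical and not taken from the model's code.

```python
from dataclasses import dataclass

# ASSUMPTION: each transformer block is counted as two layers
# (one self-attention sublayer and one feed-forward sublayer).
# The model card does not state how the 6 layers are counted.
SUBLAYERS_PER_BLOCK = ("self_attention", "feed_forward")


@dataclass(frozen=True)
class Layer:
    block_index: int  # which stacked block this layer belongs to
    kind: str         # sublayer type within the block


def build_layer_stack(num_blocks: int) -> list[Layer]:
    """Return the flat, ordered list of layers for `num_blocks` stacked blocks."""
    return [
        Layer(block_index=i, kind=kind)
        for i in range(num_blocks)
        for kind in SUBLAYERS_PER_BLOCK
    ]


stack = build_layer_stack(num_blocks=3)
print(len(stack))  # number of layers produced by 3 blocks under the assumption
for layer in stack:
    print(layer.block_index, layer.kind)
```

Under this assumption, 3 blocks with 2 sublayers each give the 6 layers mentioned above. Other counting conventions would give the same total for different reasons.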