mt5-small
This model was trained from scratch on the TEC-JL Japanese learner error corpus. It achieves the following results on the evaluation set (a short usage sketch follows the results):
- Loss: 0.0758
- Bleu: 67.2605
- Gen Len: 13.051
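A minimal inference sketch, assuming the checkpoint is loaded through the standard `transformers` seq2seq API; the repository id and the example sentence are placeholders and not part of this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/mt5-small-jgec"  # placeholder repo id, replace with the actual one
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Correct a learner sentence (the input below is illustrative only).
source = "私は昨日映画を見ます。"
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```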
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 12
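A sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; this is an assumption about the training setup, not the actual training script, and the `output_dir` is a placeholder. Dataset loading, tokenization, and the metric function are omitted here.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-jgec",       # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=12,
    predict_with_generate=True,        # needed so BLEU / Gen Len can be computed
    evaluation_strategy="epoch",       # assumption: matches the per-epoch results below
)
```

The Adam betas and epsilon listed above are the optimizer defaults in `Trainer`, so they need no explicit arguments.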
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
1.0485 | 1.0 | 3125 | 0.7139 | 0.0061 | 13.051 |
0.2413 | 2.0 | 6250 | 0.1114 | 53.3974 | 13.056 |
0.1153 | 3.0 | 9375 | 0.0937 | 61.71 | 13.056 |
0.0918 | 4.0 | 12500 | 0.0867 | 63.8407 | 13.056 |
0.0819 | 5.0 | 15625 | 0.0833 | 65.2015 | 13.056 |
0.08 | 6.0 | 18750 | 0.0806 | 65.6513 | 13.056 |
0.078 | 7.0 | 21875 | 0.0793 | 66.3861 | 13.051 |
0.0704 | 8.0 | 25000 | 0.0779 | 66.6447 | 13.051 |
0.0724 | 9.0 | 28125 | 0.0759 | 67.2105 | 13.051 |
0.0707 | 10.0 | 31250 | 0.0765 | 67.3232 | 13.051 |
0.0682 | 11.0 | 34375 | 0.0761 | 67.3443 | 13.051 |
0.07 | 12.0 | 37500 | 0.0758 | 67.2605 | 13.051 |
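A sketch of how the Bleu and Gen Len columns are typically computed with the `evaluate` library; this mirrors the standard Transformers translation example scripts and is an assumption, not code taken from this training run. The base mT5 tokenizer id is a placeholder.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")  # placeholder tokenizer
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace -100 (ignored label positions) with the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```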
Framework versions
- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2