tsmatz committed on
Commit
0bcddb6
·
1 Parent(s): ee323a0

Update README.md

Files changed (1)
  1. README.md +17 -10
README.md CHANGED
@@ -21,7 +21,13 @@ should probably proofread and complete it, then remove this comment. -->
21
 
22
  # mt5_summarize_japanese
23
 
24
- This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the None dataset.
 
 
 
 
 
 
25
  It achieves the following results on the evaluation set:
26
  - Loss: 1.8952
27
  - Rouge1: 0.4625
@@ -29,20 +35,21 @@ It achieves the following results on the evaluation set:
29
  - Rougel: 0.3656
30
  - Rougelsum: 0.3868
31
 
32
- ## Model description
33
-
34
- More information needed
35
 
36
- ## Intended uses & limitations
 
37
 
38
- More information needed
39
-
40
- ## Training and evaluation data
41
-
42
- More information needed
43
 
44
  ## Training procedure
45
 
 
 
46
  ### Training hyperparameters
47
 
48
  The following hyperparameters were used during training:
 
21
 
22
  # mt5_summarize_japanese
23
 
24
+ (Japanese caption: 日本語の要約のモデル, "a model for Japanese summarization")
25
+
26
+ This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) trained for Japanese summarization.
27
+
28
+ This model is trained on BBC news articles ([XL-Sum Japanese dataset](https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/japanese)), in which the first sentence (the headline sentence) is used as the summary and the remaining sentences are used as the article.<br>
29
+ So **please enter a news story (including, for example, the event, background, result, and comment) as the source text in the inference widget**. (Other corpora, such as business documents, book readings, or short tales, are not seen in the training set.)
30
+
31
  It achieves the following results on the evaluation set:
32
  - Loss: 1.8952
33
  - Rouge1: 0.4625
 
35
  - Rougel: 0.3656
36
  - Rougelsum: 0.3868
37
 
38
+ ## Intended uses
 
 
39
 
40
+ ```python
41
+ from transformers import pipeline
42
 
43
+ seq2seq = pipeline("summarization", model="tsmatz/mt5-summarize-jp")
44
+ sample_text = "サッカーのワールドカップカタール大会、世界ランキング24位でグループEに属する日本は、23日の1次リーグ初戦において、世界11位で過去4回の優勝を誇るドイツと対戦しました。試合は前半、ドイツの一方的なペースではじまりましたが、後半、日本の森保監督は攻撃的な選手を積極的に動員して流れを変えました。結局、日本は前半に1点を奪われましたが、途中出場の堂安律選手と浅野拓磨選手が後半にゴールを決め、2対1で逆転勝ちしました。ゲームの流れをつかんだ森保采配が功を奏しました。"
45
+ result = seq2seq(sample_text)
46
+ print(result)
47
+ ```
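The `print(result)` call above shows the raw pipeline return value. As a minimal sketch, assuming the usual `transformers` summarization output shape (a list of dictionaries, each with a `summary_text` key), the plain summary strings could be pulled out as follows. The helper name `extract_summaries` and the `mock_result` value are hypothetical and exist only for illustration; the mock string is a placeholder, not real model output.

```python
from typing import Dict, List


def extract_summaries(result: List[Dict[str, str]]) -> List[str]:
    """Return the plain summary strings from a summarization pipeline result."""
    # Each item is expected to look like {"summary_text": "..."}.
    return [item["summary_text"].strip() for item in result]


# Hypothetical stand-in for the value returned by seq2seq(sample_text).
# The text below is a made-up placeholder, not output from the model.
mock_result = [{"summary_text": " サンプルの要約文 "}]

summaries = extract_summaries(mock_result)
print(summaries[0])
```

In real use, `mock_result` would be replaced by the `result` variable from the snippet above; passing a list of texts to the pipeline should yield one dictionary per input.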
48
 
49
  ## Training procedure
50
 
51
+ You can download the source code for fine-tuning from [here](https://github.com/tsmatz/huggingface-finetune-japanese/blob/master/02-summarize.ipynb).
52
+
53
  ### Training hyperparameters
54
 
55
  The following hyperparameters were used during training: