UNIST-Eunchan committed
Commit 77470a8
1 Parent(s): bf01a04

Update README.md

Files changed (1)
  1. README.md +64 -27
README.md CHANGED
@@ -268,40 +268,56 @@ should probably proofread and complete it, then remove this comment. -->
 
 # FLAN-T5-NLP-Paper-to-Question-Generation
 
- This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4504
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- The following hyperparameters were used during training:
- - learning_rate: 0.0001
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 184
- - num_epochs: 10
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
 | No log | 0.99 | 46 | 34.6109 |
@@ -315,10 +331,31 @@ The following hyperparameters were used during training:
 | 0.4811 | 8.94 | 414 | 0.4505 |
 | 0.4721 | 9.93 | 460 | 0.4504 |
 
- ### Framework versions
 
- - Transformers 4.35.2
- - Pytorch 2.1.0+cu118
- - Datasets 2.15.0
- - Tokenizers 0.15.0
 
 # FLAN-T5-NLP-Paper-to-Question-Generation
 
+ This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the [NLP-Paper-to-QA-Generation](https://huggingface.co/datasets/UNIST-Eunchan/NLP-Paper-to-QA-Generation) dataset, which is built on [allenai/QASPER](https://huggingface.co/datasets/allenai/qasper), a dataset for question answering on scientific research papers.
 
+ ## How to Use (Code Snippets)
 
+ ### Load the model directly
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+ 
+ tokenizer = AutoTokenizer.from_pretrained("UNIST-Eunchan/FLAN-T5-NLP-Paper-to-Question-Generation")
+ model = AutoModelForSeq2SeqLM.from_pretrained("UNIST-Eunchan/FLAN-T5-NLP-Paper-to-Question-Generation")
+ ```
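+ 
+ Optionally, the model can be moved to a GPU when one is available; this is a generic sketch (not specific to this model), and any tokenized inputs must be placed on the same device before calling `generate`:
+ 
+ ```python
+ import torch
+ 
+ # Use a GPU if one is available; keep later inputs on the same device.
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ model = model.to(device)
+ ```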
 
+ ### Prompting Input
+ ```python
+ # `text` is assumed here to be a dict-like paper record with "abstract" and "introduction" fields.
+ txt = f"""
+ Generate Question, Answer pair correspond to the following research paper.
+ [Abstract] + {text['abstract']} + [Introduction] + {text['introduction']}
+ Question, Answer:
+ """.replace("\n", "")
+ 
+ inputs = tokenizer(txt, max_length=1024, truncation=True, padding="max_length", return_tensors="pt")
+ ```
 
+ ### For Multiple Question Generation (👍)
+ ```python
+ summaries = model.generate(input_ids=inputs["input_ids"], max_new_tokens=100, do_sample=True, top_p=0.95, num_return_sequences=4)
+ ```
 
+ ### For Single Question Generation
+ ```python
+ summaries = model.generate(input_ids=inputs["input_ids"], max_new_tokens=100, do_sample=True, top_p=0.95)
+ ```
 
+ ### Decode the Outputs
+ ```python
+ decoded_summaries = [tokenizer.decode(s, skip_special_tokens=False, clean_up_tokenization_spaces=True) for s in summaries]
+ decoded_summaries = [d.replace("<n>", " ").replace(tokenizer.pad_token, "").replace(tokenizer.eos_token, "") for d in decoded_summaries]
+ ```
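+ 
+ Each decoded string is expected to follow the `{Question} [SEP] {Answer}` format described under "Intended uses & limitations" below, so a minimal post-processing sketch (assuming the `[SEP]` marker actually appears in the generated text) could look like this:
+ 
+ ```python
+ # Split each generated sequence into a (question, answer) pair on the "[SEP]" marker.
+ # Sequences without the marker keep the full text as the question and None as the answer.
+ qa_pairs = []
+ for d in decoded_summaries:
+     question, sep, answer = d.partition("[SEP]")
+     qa_pairs.append((question.strip(), answer.strip() if sep else None))
+ 
+ for q, a in qa_pairs:
+     print("Q:", q)
+     print("A:", a)
+ ```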
 
 ### Training results
 
+ The model achieves the following results on the evaluation set:
+ - Loss: 0.4504
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
 | No log | 0.99 | 46 | 34.6109 |
 | ... | ... | ... | ... |
 | 0.4811 | 8.94 | 414 | 0.4505 |
 | 0.4721 | 9.93 | 460 | 0.4504 |
 
+ ## Model description
+ 
+ - FLAN-T5-Large (770M parameters)
+ 
+ ## Intended uses & limitations
+ 
+ - Input: an NLP paper's Abstract + Introduction --> Output: {Question} [SEP] {Answer}
+ 
+ ## Training and evaluation data
+ 
+ - Dataset: [UNIST-Eunchan/NLP-Paper-to-QA-Generation](https://huggingface.co/datasets/UNIST-Eunchan/NLP-Paper-to-QA-Generation)
+ - Training data: dataset['train'] + dataset['test'] (see the loading sketch below)
+ - Evaluation data: dataset['validation']
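+ 
+ A minimal loading sketch for that split setup, assuming the Hugging Face `datasets` library (the train/test concatenation mirrors the bullets above):
+ 
+ ```python
+ from datasets import load_dataset, concatenate_datasets
+ 
+ dataset = load_dataset("UNIST-Eunchan/NLP-Paper-to-QA-Generation")
+ 
+ # Train on train + test, evaluate on validation, as described above.
+ train_data = concatenate_datasets([dataset["train"], dataset["test"]])
+ eval_data = dataset["validation"]
+ ```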
 
+ ### Training hyperparameters
+ 
+ The following hyperparameters were used during training (a configuration sketch follows the list):
+ - learning_rate: 0.0001
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 184
+ - num_epochs: 10
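+ 
+ As a rough illustration only, these settings map onto Hugging Face `Seq2SeqTrainingArguments` roughly as sketched below; the mapping and the output directory are assumptions, not the original training script:
+ 
+ ```python
+ from transformers import Seq2SeqTrainingArguments
+ 
+ # Assumed mapping of the hyperparameters listed above; total_train_batch_size (16)
+ # = train_batch_size (1) x gradient_accumulation_steps (16). The Adam betas and
+ # epsilon given above match the library defaults.
+ training_args = Seq2SeqTrainingArguments(
+     output_dir="flan-t5-nlp-paper-to-question-generation",  # hypothetical
+     learning_rate=1e-4,
+     per_device_train_batch_size=1,
+     per_device_eval_batch_size=1,
+     gradient_accumulation_steps=16,
+     lr_scheduler_type="linear",
+     warmup_steps=184,
+     num_train_epochs=10,
+     seed=42,
+ )
+ ```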