---
license: apache-2.0
base_model: facebook/bart-base
datasets:
- squad_v2
- drop
language:
- en
library_name: peft
tags:
- General purpose
- Text2text Generation
metrics:
- bertscore
- accuracy
- rouge
---
# Model Card
- Base model: facebook/bart-base
- Fine-tuning: PEFT with LoRA
- Datasets: squad_v2, drop
- Task: generating questions from a context and an answer
- Language: English
# Loading the model
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
HUGGING_FACE_USER_NAME = "mou3az"
model_name = "Question-Generation"
peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
QG_model = PeftModel.from_pretrained(model, peft_model_id)
```
# At inference time
```python
def get_question(context, answer):
    # Run on whichever device the model was loaded onto
    device = next(QG_model.parameters()).device
    # Build the prompt from the context and the target answer
    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
    encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)
    # Beam search, returning a single sequence
    output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5, num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)
    # Decode, then remove any "question:" marker and surrounding whitespace
    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()
    return out
```
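
The prompt template and the output cleanup used in `get_question` can be checked without loading the model. The following is a minimal sketch; `build_prompt` and `clean_output` are hypothetical helper names that are not part of the original code, and the example strings are made up for illustration.

```python
def build_prompt(context: str, answer: str) -> str:
    """Format a context/answer pair the same way get_question does."""
    return f"Given the context '{context}' and the answer '{answer}', what question can be asked?"


def clean_output(decoded: str) -> str:
    """Remove the "question:" marker and surrounding whitespace."""
    return decoded.replace("question:", "").strip()


if __name__ == "__main__":
    # Illustrative inputs only; not taken from the training data
    print(build_prompt("Paris is the capital of France.", "Paris"))
    print(clean_output("question: What is the capital of France?"))
```

Keeping the template in one place makes it easier to ensure that inference prompts match the format the adapter was fine-tuned on.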
# Training parameters and hyperparameters
The following settings were used during training.

For LoRA:
- r=18
- alpha=8

For training arguments:
- gradient_accumulation_steps=16
- per_device_train_batch_size=8
- per_device_eval_batch_size=8
- max_steps=3000
- warmup_steps=75
- weight_decay=0.05
- learning_rate=1e-3
- lr_scheduler_type="linear"
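
To make these settings concrete, the sketch below works out the effective batch size per device, the LoRA scaling factor, and the learning rate at a given optimization step. It rests on assumptions: the standard LoRA scaling `alpha / r`, and a linear schedule that warms up from 0 to the peak rate over `warmup_steps` and then decays linearly to 0 at `max_steps`. The card does not state how many devices were used, so the batch figure is per device. `linear_lr` is a hypothetical helper name.

```python
# Values taken from the list above
R, ALPHA = 18, 8
GRAD_ACCUM, BATCH = 16, 8
MAX_STEPS, WARMUP, PEAK_LR = 3000, 75, 1e-3

# Samples contributing to each optimizer update, per device
effective_batch = GRAD_ACCUM * BATCH  # 128

# Assumed standard LoRA scaling applied to the low-rank update
lora_scale = ALPHA / R  # about 0.444


def linear_lr(step, peak=PEAK_LR, warmup=WARMUP, total=MAX_STEPS):
    """Learning rate at `step` under linear warmup then linear decay (assumed shape)."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch per device:", effective_batch)
    print("LoRA scale:", round(lora_scale, 3))
    for s in (0, 75, 1500, 3000):
        print(s, linear_lr(s))
```

With `alpha` smaller than `r`, the scaling factor is below 1, which damps the adapter update relative to the common `alpha = 2 * r` choice.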
# Performance metrics on the evaluation set
After 3000 optimization steps:
- Training loss: 1.292400
- Evaluation loss: 1.244928
- BERTScore: 0.8123
- ROUGE: 0.532144
- FuzzyWuzzy similarity: 0.74209
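
The card does not say how the similarity score was computed. As an illustration only, the sketch below averages a character-level similarity ratio over prediction/reference pairs, using the standard library's `difflib.SequenceMatcher` as a stand-in for FuzzyWuzzy. `mean_similarity` is a hypothetical name and the example strings are made up, so the result is not expected to reproduce the number reported above.

```python
from difflib import SequenceMatcher


def mean_similarity(predictions, references):
    """Average case-insensitive similarity ratio (0 to 1) over paired strings."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    scores = [
        SequenceMatcher(None, p.lower(), r.lower()).ratio()
        for p, r in zip(predictions, references)
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Illustrative strings only
    preds = ["What is the capital of France?"]
    refs = ["Which city is the capital of France?"]
    print(mean_similarity(preds, refs))
```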