	---
	license: apache-2.0
	datasets:
	- race
	language:
	- en
	library_name: transformers
	pipeline_tag: text2text-generation
	---
	# t5-large fine-tuned on RACE for Generating Question+Answer
	- Input: `context` (e.g. news article)
	- Output: `question <sep> answer`

	## Model Details

	This is a t5-large model fine-tuned on the RACE dataset, where the input is the context/passage and the output is the question followed by the answer. It is the first component of the question generation pipeline (i.e. `g1`) in our [MQAG paper](https://arxiv.org/abs/2301.12307).
	For more details, please refer to the GitHub repo of this project: https://github.com/potsawee/mqag0.

	## How to Use the Model

	Use the code below to get started with the model.

	```python
	>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	>>> tokenizer = AutoTokenizer.from_pretrained("potsawee/t5-large-generation-race-QuestionAnswer")
	>>> model = AutoModelForSeq2SeqLM.from_pretrained("potsawee/t5-large-generation-race-QuestionAnswer")

	>>> context = r"""
	... World number one Novak Djokovic says he is hoping for a "positive decision" to allow him
	... to play at Indian Wells and the Miami Open next month. The United States has extended
	... its requirement for international visitors to be vaccinated against Covid-19. Proof of vaccination
	... will be required to enter the country until at least 10 April, but the Serbian has previously
	... said he is unvaccinated. The 35-year-old has applied for special permission to enter the country.
	... Indian Wells and the Miami Open - two of the most prestigious tournaments on the tennis calendar
	... outside the Grand Slams - start on 6 and 20 March respectively. Djokovic says he will return to
	... the ATP tour in Dubai next week after claiming a record-extending 10th Australian Open title
	... and a record-equalling 22nd Grand Slam men's title last month.""".replace("\n", "")

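	>>> # Tokenize the context and generate one "question <sep> answer" sequence
	>>> inputs = tokenizer(context, return_tensors="pt")
	>>> outputs = model.generate(**inputs, max_length=100)
	>>> # Keep special tokens when decoding so the separator token survives
	>>> question_answer = tokenizer.decode(outputs[0], skip_special_tokens=False)
	>>> # Strip padding and end-of-sequence tokens, then split on the separator
	>>> question_answer = question_answer.replace(tokenizer.pad_token, "").replace(tokenizer.eos_token, "")
	>>> question, answer = question_answer.split(tokenizer.sep_token)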

	>>> print("question:", question)
	question: What is the best title for the passage?
	>>> print("answer:", answer)
	answer: Djokovic's application for special permission to enter the United States

	```
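
The final `split` in the snippet above unpacks into exactly two values, so it raises an error if a generated sequence contains no separator or more than one. Below is a minimal sketch of a more defensive parser. `parse_question_answer` is a hypothetical helper, not part of this repo or of `transformers`, and its default token strings (`<sep>`, `<pad>`, `</s>`) are assumptions; in practice, pass `tokenizer.sep_token`, `tokenizer.pad_token` and `tokenizer.eos_token` instead.

```python
from typing import Optional, Tuple


def parse_question_answer(
    decoded: str,
    sep_token: str = "<sep>",  # assumed default; prefer tokenizer.sep_token
    special_tokens: Tuple[str, ...] = ("<pad>", "</s>"),  # assumed pad/eos strings
) -> Optional[Tuple[str, str]]:
    """Split a decoded 'question <sep> answer' string into (question, answer).

    Returns None when the separator is missing or either side is empty,
    instead of raising on unpacking.
    """
    # Remove padding / end-of-sequence markers left by skip_special_tokens=False
    for token in special_tokens:
        decoded = decoded.replace(token, "")
    if sep_token not in decoded:
        return None
    # Split only on the first separator so a stray second one stays in the answer
    question, answer = decoded.split(sep_token, 1)
    question, answer = question.strip(), answer.strip()
    if not question or not answer:
        return None
    return question, answer
```

With the tokenizer loaded as above, this could be called as `parse_question_answer(question_answer, tokenizer.sep_token, (tokenizer.pad_token, tokenizer.eos_token))`, skipping any generation for which it returns `None`.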

	## Generating Distractors (other options in a multiple-choice setup)

	```Context ---> Question + (A) Answer (B) Distractor1 (C) Distractor2 (D) Distractor3```

	Please refer to our distractor generation model: https://huggingface.co/potsawee/t5-large-generation-race-Distractor
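
As an illustration of the layout above, here is a minimal sketch that assembles a multiple-choice item from a question, its answer, and a list of distractors. `build_multiple_choice` is a hypothetical helper, not part of either model repo, and it assumes the distractors have already been produced as plain strings (for example by the distractor model linked above). By default the answer stays at position (A), as in the layout; passing a `seed` shuffles the options.

```python
import random
import string
from typing import List, Optional, Tuple


def build_multiple_choice(
    question: str,
    answer: str,
    distractors: List[str],
    seed: Optional[int] = None,
) -> Tuple[str, str]:
    """Return (formatted item, label of the correct option)."""
    options = [answer] + list(distractors)
    if seed is not None:
        # Shuffle reproducibly so the answer is not always option (A)
        random.Random(seed).shuffle(options)
    labels = string.ascii_uppercase[: len(options)]
    lines = [question] + [f"({label}) {option}" for label, option in zip(labels, options)]
    # Find where the correct answer ended up after the optional shuffle
    correct_label = labels[options.index(answer)]
    return "\n".join(lines), correct_label
```

The returned label makes it possible to check a downstream answering model's prediction against the correct option.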

	## Citation

	```bibtex
	@article{manakul2023mqag,
	title={MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization},
	author={Manakul, Potsawee and Liusie, Adian and Gales, Mark JF},
	journal={arXiv preprint arXiv:2301.12307},
	year={2023}
	}
	```