metadata
license: mit
datasets:
- squad
language:
- en
pipeline_tag: text-classification
widget:
- text: 'question: What number comes after five? answer: four'
- text: 'question: Which person is associated with Kanye West? answer: a tree'
- text: 'question: When is US independence day from aliens? answer: 7/4/1996'
kgourgou/bert-base-uncased-QA-classification
An experiment in classifying whether a (question, answer) pair is valid. The model is not very good at this point, but eventually such a model could help with RAG. For a stronger model, check this one by vectara.
Input must be formatted as
question: {your query}? answer: {your possible answer}
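As a minimal sketch of the template above, the helper below builds the input string from a question and a candidate answer. The function name `format_qa_input` is hypothetical (it is not part of this repository), and normalising to a single trailing question mark is an assumption based on the template and the widget examples.

```python
def format_qa_input(question: str, answer: str) -> str:
    """Build the 'question: ...? answer: ...' string the model expects."""
    # Assumption: the template wants exactly one trailing question mark,
    # so drop any the caller supplied and add a single one back.
    q = question.strip().rstrip("?").rstrip()
    return f"question: {q}? answer: {answer.strip()}"


print(format_qa_input("What number comes after five?", "four"))
# prints: question: What number comes after five? answer: four
```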
The model outputs probabilities for two classes:
- class 0 = the answer string couldn't be an answer to the question, and
- class 1 = the answer string could be an answer to the question.
"Could be" should be interpreted as a type match, e.g., whether the answer is a person, a number, or a date when the question requires one.
Examples:
- "question: What number comes after five? answer: four" → this should be class 1 as the answer is a number (even if it's not the right number).
- "question: Which person is associated with Kanye West? answer: a tree" → this should be class 0 as a tree is not a person.
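One way to obtain and read the two class probabilities is sketched below. This is an illustration rather than the author's own inference code: it assumes the checkpoint loads with `AutoModelForSequenceClassification` and that logit index 0 corresponds to class 0 and index 1 to class 1. The helper names `softmax`, `interpret`, and `classify` are hypothetical.

```python
import math

MODEL_ID = "kgourgou/bert-base-uncased-QA-classification"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits):
    """Map two logits to the class meanings described in this card.

    Assumption: index 0 is class 0 ("couldn't be an answer") and
    index 1 is class 1 ("could be an answer").
    """
    p_class_0, p_class_1 = softmax(logits)
    label = 1 if p_class_1 > p_class_0 else 0
    return {"label": label, "p_class_0": p_class_0, "p_class_1": p_class_1}


def classify(text):
    """Score one formatted string; downloads the weights on first call."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return interpret(logits)
```

Calling `classify("question: What number comes after five? answer: four")` returns a dictionary with the predicted label and both probabilities. Given the caveat above about model quality, the prediction may not match the intended class.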
Base model details
The base model is bert-base-uncased. For this experiment, I used only the "squad" dataset, preprocessed into the required input format.