SubjECTiveQA-CAUTIOUS Model

Model Name: SubjECTiveQA-CAUTIOUS

Model Type: Text Classification

Language: English

Base Model: google-bert/bert-base-uncased

Dataset Used for Training: gtfintechlab/SubjECTive-QA

Model Overview

SubjECTiveQA-CAUTIOUS is a fine-tuned BERT-based model that classifies text according to the 'CAUTIOUS' attribute, assigning each input one of three labels. 'CAUTIOUS' is one of the six subjective attributes annotated in the SubjECTive-QA dataset, which covers question-answer pairs from earnings call transcripts.

Intended Use

This model is intended for researchers and practitioners working on subjective text classification, particularly within financial domains. It is specifically designed to assess the 'CAUTIOUS' attribute in question-answer pairs, aiding in the analysis of subjective content in financial communications.

How to Use

To use this model, load it with the Hugging Face transformers library:

from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# Load the tokenizer, model, and configuration
tokenizer = AutoTokenizer.from_pretrained("gtfintechlab/SubjECTiveQA-CAUTIOUS", do_lower_case=True, do_basic_tokenize=True)
model = AutoModelForSequenceClassification.from_pretrained("gtfintechlab/SubjECTiveQA-CAUTIOUS", num_labels=3)
config = AutoConfig.from_pretrained("gtfintechlab/SubjECTiveQA-CAUTIOUS")

# Initialize the text classification pipeline
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, config=config, framework="pt")

# Classify the 'CAUTIOUS' attribute in your question-answer pairs
qa_pairs = [
    "Question: What are your company's projections for the next quarter? Answer: We anticipate a 10% increase in revenue due to the launch of our new product line.",
    "Question: Can you explain the recent decline in stock prices? Answer: Market fluctuations are normal, and we are confident in our long-term strategy."
]
results = classifier(qa_pairs, batch_size=128, truncation="only_first")

print(results)

In this script:

Tokenizer and Model Loading: The AutoTokenizer and AutoModelForSequenceClassification classes load the pre-trained tokenizer and model, respectively, from the gtfintechlab/SubjECTiveQA-CAUTIOUS repository.
Configuration: The AutoConfig class loads the model configuration, which includes parameters such as the number of labels.
Pipeline Initialization: The pipeline function initializes a text classification pipeline with the loaded model, tokenizer, and configuration.
Classification: The classifier processes a list of question-answer pairs to assess the 'CAUTIOUS' attribute. The batch_size parameter sets the maximum number of samples processed per batch. Because each question-answer pair is passed as a single string, truncation="only_first" truncates that string as a whole when it exceeds the model's maximum input length, so the end of a long answer may be cut off.

Ensure that the transformers library and PyTorch are installed; the pipeline above is created with framework="pt".
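
The example inputs above follow a "Question: ... Answer: ..." pattern. When questions and answers are stored separately, a small helper can build such strings. The sketch below is a minimal illustration: the name format_qa_pair is hypothetical and not part of the model repository, and this model card does not state whether the same template was used during training.

```python
def format_qa_pair(question: str, answer: str) -> str:
    """Join a question and its answer into a single input string.

    Hypothetical helper: it mirrors the "Question: ... Answer: ..." pattern
    of the example inputs and is not part of the model repository.
    """
    # Collapse runs of whitespace (including newlines) into single spaces.
    clean_question = " ".join(question.split())
    clean_answer = " ".join(answer.split())
    return f"Question: {clean_question} Answer: {clean_answer}"


qa_pairs = [
    format_qa_pair(
        "What are your company's projections for the next quarter?",
        "We anticipate a 10% increase in revenue due to the launch of our new product line.",
    ),
]
```

The resulting qa_pairs list can be passed to the classifier in the same way as the hand-written list in the script above.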

Label Interpretation

LABEL_0: Negatively Demonstrative of 'CAUTIOUS' (0)
Indicates that the response lacks caution.
LABEL_1: Neutral Demonstration of 'CAUTIOUS' (1)
Indicates that the response has an average level of caution.
LABEL_2: Positively Demonstrative of 'CAUTIOUS' (2)
Indicates that the response is cautious and prudent.
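
The text-classification pipeline returns one dictionary per input with a "label" and a "score" key. A minimal sketch of translating those labels into the ratings listed above follows; the names LABEL_DESCRIPTIONS and interpret are hypothetical, and the example result uses a made-up score rather than real model output.

```python
# Ratings and descriptions copied from the label list above.
LABEL_DESCRIPTIONS = {
    "LABEL_0": (0, "Negatively Demonstrative of 'CAUTIOUS'"),
    "LABEL_1": (1, "Neutral Demonstration of 'CAUTIOUS'"),
    "LABEL_2": (2, "Positively Demonstrative of 'CAUTIOUS'"),
}


def interpret(result: dict) -> dict:
    """Attach the numeric rating and description to one pipeline result."""
    rating, description = LABEL_DESCRIPTIONS[result["label"]]
    return {
        "rating": rating,
        "description": description,
        "score": result["score"],
    }


# Illustrative input only: the score is invented, not produced by the model.
example_result = {"label": "LABEL_2", "score": 0.5}
print(interpret(example_result))
```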

Training Data

The model was trained on the SubjECTive-QA dataset, which comprises question-answer pairs from financial contexts, annotated with several subjective attributes, including 'CAUTIOUS'. The dataset is divided into training, validation, and test sets.

Citation

If you use this model in your research, please cite the SubjECTive-QA dataset:

@article{SubjECTiveQA,
  title={SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts’ QA Through Six-Dimensional Feature Analysis},
  author={Huzaifa Pardawala and Siddhant Sukhani and Agam Shah and Veer Kejriwal and Abhishek Pillai and Rohan Bhasin and Andrew DiBiasio and Tarun Mandapati and Dhruv Adha and Sudheer Chava},
  journal={arXiv preprint arXiv:2410.20651},
  year={2024}
}

For more details, refer to the SubjECTive-QA dataset documentation.

Contact

For any SubjECTive-QA related issues and questions, please contact:

Huzaifa Pardawala: huzaifahp7[at]gatech[dot]edu
Siddhant Sukhani: ssukhani3[at]gatech[dot]edu
Agam Shah: ashah482[at]gatech[dot]edu
