metadata

license: mit
pipeline_tag: fill-mask
tags:
  - code

Zero-shot text classification (base-sized model) trained with self-supervised tuning

This zero-shot text classification model is trained with self-supervised tuning (SSTuning). It was introduced in the paper Zero-Shot Text Classification via Self-Supervised Tuning.

The model backbone is RoBERTa-base.

Model description

The model is tuned with unlabeled data using a learning objective called first sentence prediction (FSP). The FSP task is designed by considering both the nature of the unlabeled corpus and the input/output format of classification tasks. The training and validation sets are constructed from the unlabeled corpus using FSP. During tuning, BERT-like pre-trained masked language models such as RoBERTa and ALBERT are employed as the backbone, and an output layer for classification is added. The learning objective for FSP is to predict the index of the positive option. A cross-entropy loss is used for tuning the model.
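As a rough illustration of how an FSP example could be built from unlabeled text, the sketch below treats the first sentence of one paragraph as the positive option, draws first sentences of other paragraphs as negative options, and uses the rest of the paragraph as the input text. The helper name `build_fsp_example`, the naive split on ". ", the number of options, and the sampling scheme are assumptions made for illustration, not the authors' exact procedure.

```python
import random

def build_fsp_example(paragraphs, target_idx, num_options=4, seed=0):
    """Build one (options, text, label) triple for first sentence prediction."""
    rng = random.Random(seed)
    # Naive split into first sentence and remainder (assumption for this sketch).
    first, rest = paragraphs[target_idx].split(". ", 1)
    # Negative options: first sentences taken from the other paragraphs.
    others = [p.split(". ", 1)[0] for i, p in enumerate(paragraphs) if i != target_idx]
    negatives = rng.sample(others, num_options - 1)
    options = negatives + [first]
    rng.shuffle(options)
    # The training target is the index of the positive option.
    label = options.index(first)
    return options, rest, label

paragraphs = [
    "Cats sleep a lot. They can nap for most of the day.",
    "The stock market fell. Investors reacted to new data.",
    "Rain is expected. Bring an umbrella tomorrow.",
    "The team won. Fans celebrated late into the night.",
]
options, text, label = build_fsp_example(paragraphs, target_idx=0)
```

A classifier trained on such triples sees the options and the text together and is scored with cross-entropy against `label`.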

Intended uses & limitations

The model can be used for zero-shot text classification tasks such as sentiment analysis and topic classification. No further fine-tuning is needed.

The number of labels should be between 2 and 20.
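The label range above can be checked before running inference. The guard below is a hypothetical helper (`check_labels` is not part of the model's code) that drops empty labels and raises an error when the count falls outside the stated range.

```python
def check_labels(list_label, min_labels=2, max_labels=20):
    """Return cleaned labels, or raise ValueError if the count is out of range."""
    # Drop empty or whitespace-only entries before counting.
    labels = [x.strip() for x in list_label if x and x.strip()]
    if not min_labels <= len(labels) <= max_labels:
        raise ValueError(
            f"expected {min_labels} to {max_labels} labels, got {len(labels)}"
        )
    return labels

labels = check_labels(["negative", "positive"])
```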

How to use

You can try the model with the Colab notebook.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch, string, random

tokenizer = AutoTokenizer.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")
model = AutoModelForSequenceClassification.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")

text = "I love this place! The food is always so fresh and delicious."

list_label = ["negative", "positive"]

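# Option letters (A), (B), ... used to index the candidate labels.
list_ABC = list(string.ascii_uppercase)

def add_prefix(text, list_label, label_num=20, shuffle=False):
    # Make sure every label ends with a period.
    list_label = [x + '.' if x[-1] != '.' else x for x in list_label]
    # Pad the label list with pad tokens up to label_num options.
    list_label_new = list_label + [tokenizer.pad_token] * (label_num - len(list_label))
    if shuffle:
        random.shuffle(list_label_new)
    s_option = ' '.join(['(' + list_ABC[i] + ') ' + list_label_new[i] for i in range(len(list_label_new))])
    # Input format: "(A) label1. (B) label2. ... <sep> text"
    return f'{s_option} {tokenizer.sep_token} {text}', list_label_new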

text_new, list_label_new = add_prefix(text,list_label,shuffle=False)

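# Optional: inspect the tokens of the prefixed input.
ids = tokenizer.encode(text_new)
tokens = tokenizer.convert_ids_to_tokens(ids)
# Encode the prefixed text (not the raw text) so the options are part of the input.
encoding = tokenizer([text_new], truncation=True, padding='max_length', max_length=512)
item = {key: torch.tensor(val) for key, val in encoding.items()}
with torch.no_grad():
    logits = model(**item).logits
probs = torch.nn.functional.softmax(logits, dim=-1).tolist()
# Index of the predicted option; it maps back to a label through list_label_new.
predictions = torch.argmax(logits, dim=-1).item()
print(f'prediction: {list_label_new[predictions]}')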
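
The logits cover every option slot, including the padded ones. One possible way to read off the result is to restrict the softmax to the slots that hold real labels, sketched here in plain Python with made-up logit values. The function name `decode_prediction` and the choice to ignore padded slots are assumptions for illustration; the snippet above takes the softmax and argmax over all slots. The sketch only applies when `shuffle=False`, so that the real labels occupy the first slots.

```python
import math

def decode_prediction(logits, labels):
    """Softmax over the first len(labels) logits; return (label, probability)."""
    real = logits[:len(labels)]
    # Subtract the max for numerical stability before exponentiating.
    m = max(real)
    exps = [math.exp(x - m) for x in real]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up logits for 20 option slots: 2 real labels followed by 18 padded slots.
fake_logits = [-1.2, 3.4] + [-5.0] * 18
label, prob = decode_prediction(fake_logits, ["negative", "positive"])
```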

BibTeX entry and citation info

@inproceedings{acl23/SSTuning,
  author    = {Chaoqun Liu and
               Wenxuan Zhang and
               Guizhen Chen and
               Xiaobao Wu and
               Anh Tuan Luu and
               Chip Hong Chang and 
               Lidong Bing},
  title     = {Zero-Shot Text Classification via Self-Supervised Tuning},
  booktitle = {Findings of the 2023 ACL},
  year      = {2023},
  url       = {},
}