Input shape and multi-task learning
I am trying to recreate the results from the Kennedy paper as follows:
import datasets, transformers
from huggingface_hub import from_pretrained_keras
from tensorflow.keras.preprocessing.sequence import pad_sequences
dataset = datasets.load_dataset('ucberkeley-dlab/measuring-hate-speech', 'default', split='train')
df = dataset.to_pandas()
# unique texts for brevity
x = df['text'].unique()
model = from_pretrained_keras("ucberkeley-dlab/hate-measure-roberta-base")
tokenizer = transformers.RobertaTokenizer.from_pretrained("roberta-base")
tokens = tokenizer(x.tolist(), return_tensors='np', padding=True)
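# The tokenizer output has shape (None, 205) but the model input shape is (None, 247), so pad to 247.
# Pad input_ids with the tokenizer's pad token id; pad_sequences defaults to 0, which is <s> for RoBERTa.
padded_ids = pad_sequences(tokens['input_ids'], maxlen=247, padding='post', truncating='post', value=tokenizer.pad_token_id)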
padded_mask = pad_sequences(tokens['attention_mask'], maxlen=247, padding='post', truncating='post')
padded_inputs = [padded_ids, padded_mask]
y_pred = model.predict(padded_inputs)
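To make the padding step easier to reason about in isolation, here is a minimal NumPy sketch of it. It assumes the pad token id is 1 (the value for roberta-base) and a target length of 247. The helper name `pad_to_length` and the toy batch are hypothetical, standing in for the real tokenizer output.

```python
import numpy as np

PAD_ID = 1        # assumption: pad token id of roberta-base
TARGET_LEN = 247  # model input length reported above


def pad_to_length(ids, mask, target_len=TARGET_LEN, pad_id=PAD_ID):
    """Right-pad (or truncate) token ids and attention mask to target_len."""
    ids = np.asarray(ids)[:, :target_len]
    mask = np.asarray(mask)[:, :target_len]
    extra = target_len - ids.shape[1]
    # Padded positions get the pad token id and an attention mask of 0.
    ids = np.pad(ids, ((0, 0), (0, extra)), constant_values=pad_id)
    mask = np.pad(mask, ((0, 0), (0, extra)), constant_values=0)
    return ids, mask


# Toy batch standing in for tokenizer output of shape (2, 205).
toy_ids = np.full((2, 205), 5)
toy_mask = np.ones((2, 205), dtype=int)
toy_padded_ids, toy_padded_mask = pad_to_length(toy_ids, toy_mask)
print(toy_padded_ids.shape, toy_padded_mask.sum(axis=1))
```

If the extra 42 positions are ordinary sequence positions, padding like this should leave the real tokens untouched and mask out the rest. Whether that matches how the model was trained is exactly what I am unsure about.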
I have a couple of questions:
- Why is the model's input shape, (None, 247), different from the tokenized shape, (None, 205)? Are the extra 42 input positions related to the annotator severity, target-annotator demographics, or other metadata? If not, is it appropriate to pad the inputs as above?
- I take it this model predicts the hate speech (HS) score directly (Figure 4), but the paper describes a multi-task network followed by an IRT model (Figure 3). Is the multi-task version available?
Many thanks, and great work!
Charlie Lonergan