---
license: mit
base_model:
- FacebookAI/roberta-large
pipeline_tag: token-classification
library_name: transformers
tags:
- LoRA
- Adapter
---

# Training 
This adapter is designed for token classification: it extracts aspect terms from the input text and predicts the sentiment polarity expressed toward each extracted term. 
The extracted aspect terms are the span(s) of the input text on which a sentiment is being expressed. 
It was created with the [PEFT](https://huggingface.co/docs/peft/index) framework using [LoRA: Low-Rank Adaptation](https://arxiv.org/abs/2106.09685).
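LoRA freezes the pretrained weights and trains only a low-rank update on top of them. A minimal pure-Python sketch of the idea follows; the sizes are toy values for illustration, not the adapter's actual dimensions:

```python
# Toy numeric sketch of the LoRA idea (illustrative only): instead of updating a
# full weight matrix W (d_out x d_in), LoRA trains two small matrices
# B (d_out x r) and A (r x d_in) with rank r << min(d_out, d_in), and the
# effective weight becomes W + (alpha / r) * (B @ A).
d_out, d_in, r, alpha = 4, 4, 1, 8

W = [[float(i == j) for j in range(d_in)] for i in range(d_out)]  # frozen weight (identity here)
A = [[0.5] * d_in for _ in range(r)]   # trainable down-projection
B = [[0.0] * r for _ in range(d_out)]  # trainable up-projection, initialised to zero

# rank-r update: all zeros at initialisation, so the adapted model
# starts out identical to the base model
delta = [[(alpha / r) * sum(B[i][k] * A[k][j] for k in range(r))
          for j in range(d_in)] for i in range(d_out)]
W_eff = [[W[i][j] + delta[i][j] for j in range(d_in)] for i in range(d_out)]

# trainable parameters: r * (d_out + d_in) = 8, versus d_out * d_in = 16
# for full fine-tuning of this matrix
trainable = r * (d_out + d_in)
```

For `roberta-large`, this shrinks the trainable parameter count from hundreds of millions to a few million, which is why the adapter checkpoint is small and cheap to train.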

## Datasets
This model has been trained on the following datasets:

1. Aspect Based Sentiment Analysis SemEval Shared Tasks ([2014](https://aclanthology.org/S14-2004/), [2015](https://aclanthology.org/S15-2082/), [2016](https://aclanthology.org/S16-1002/))
2. Multi-Aspect Multi-Sentiment [MAMS](https://aclanthology.org/D19-1654/)

# Use

* Loading the base model and attaching the LoRA adapter parameters

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from peft import PeftModel

# preparing the label maps ("X" with id -100 is the ignore index used for sub-word tokens)
labels2ids = {"B-neu": 1, "I-neu": 2, "O": 0, "B-neg": 3, "B-con": 4, "I-pos": 5, "B-pos": 6, "I-con": 7, "I-neg": 8, "X": -100}
id2labels = {idx: lab for lab, idx in labels2ids.items()}

# loading the tokenizer and the base model
base_id = 'FacebookAI/roberta-large'
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForTokenClassification.from_pretrained(
    base_id,
    num_labels=len(labels2ids),
    id2label=id2labels,
    label2id=labels2ids,
)

# attaching this adapter to the base model (inference only, so not trainable)
model = PeftModel.from_pretrained(base_model, 'gauneg/roberta-large-absa-ate-sentiment-lora-adapter', is_trainable=False)
```

This model can be used in either of the following two ways:
1. Making token-level inference
2. Using a pipeline for end-to-end inference

## Making token level inference

```python
# after loading the base model and the adapter as shown in the previous snippet
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

text_input = "Been here a few times and food has always been good but service really suffers when it gets crowded."
tok_inputs = tokenizer(text_input, return_tensors="pt").to(device)

with torch.no_grad():
    y_pred = model(**tok_inputs)  # forward pass producing the logits

y_pred_fin = y_pred.logits.argmax(dim=-1)[0]  # most likely label id for each token

decoded_pred = [id2labels[logx.item()] for logx in y_pred_fin]

# pair each token with its predicted tag, dropping the <s> and </s> special tokens
tok_levl_pred = list(zip(tokenizer.convert_ids_to_tokens(tok_inputs['input_ids'][0]), decoded_pred))[1:-1]
```

Resulting contents of the `tok_levl_pred` variable:
```bash
[('Be', 'O'),
 ('en', 'O'),
 ('Ġhere', 'O'),
 ('Ġa', 'O'),
 ('Ġfew', 'O'),
 ('Ġtimes', 'O'),
 ('Ġand', 'O'),
 ('Ġfood', 'B-pos'),
 ('Ġhas', 'O'),
 ('Ġalways', 'O'),
 ('Ġbeen', 'O'),
 ('Ġgood', 'O'),
 ('Ġbut', 'O'),
 ('Ġservice', 'B-neg'),
 ('Ġreally', 'O'),
 ('Ġsuffers', 'O'),
 ('Ġwhen', 'O'),
 ('Ġit', 'O'),
 ('Ġgets', 'O'),
 ('Ġcrowded', 'O'),
 ('.', 'O')]
```
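The token-level tags above can be collapsed into aspect spans by merging consecutive `B-`/`I-` tokens (RoBERTa's `Ġ` prefix marks a word boundary). A minimal sketch of such a decoder; the function name is illustrative, not part of this repository:

```python
def decode_spans(token_tag_pairs):
    """Merge B-/I- tagged RoBERTa tokens into (aspect_term, polarity) pairs."""
    spans, current_words, current_pol = [], [], None
    for tok, tag in token_tag_pairs:
        word = tok.replace("\u0120", " ")  # Ġ (U+0120) marks a word boundary
        if tag.startswith("B-"):
            if current_words:  # close any span already in progress
                spans.append(("".join(current_words).strip(), current_pol))
            current_words, current_pol = [word], tag[2:]
        elif tag.startswith("I-") and current_words:
            current_words.append(word)  # continue the current span
        else:
            if current_words:  # an O tag ends the current span
                spans.append(("".join(current_words).strip(), current_pol))
            current_words, current_pol = [], None
    if current_words:
        spans.append(("".join(current_words).strip(), current_pol))
    return spans

pairs = [("\u0120food", "B-pos"), ("\u0120has", "O"),
         ("\u0120service", "B-neg"), ("\u0120really", "O")]
print(decode_spans(pairs))  # [('food', 'pos'), ('service', 'neg')]
```

This reproduces by hand what the pipeline in the next section does automatically via its aggregation strategy.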

## Using end-to-end token classification pipeline

```python
# after loading the base model and the adapter as shown in the previous snippet
from transformers import pipeline

# 'ner' is an alias for the token-classification pipeline;
# aggregation_strategy='simple' groups consecutive tokens with the same entity
ate_senti_pipeline = pipeline(
    task='ner',
    aggregation_strategy='simple',
    model=model,
    tokenizer=tokenizer,
)

text_input = "Been here a few times and food has always been good but service really suffers when it gets crowded."
ate_senti_pipeline(text_input)
```
OUTPUT
```bash
[{'entity_group': 'pos',
  'score': 0.92310727,
  'word': ' food',
  'start': 26,
  'end': 30},
 {'entity_group': 'neg',
  'score': 0.90695626,
  'word': ' service',
  'start': 56,
  'end': 63}]
```