---
license: mit
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: roberta-large-condaqa-neg-tag-token-classifier
  results: []
---

# roberta-large-condaqa-neg-tag-token-classifier

This model is a fine-tuned version of [roberta-large](https://huggingface.co/roberta-large) on negation cue annotations from the CondaQA dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0267
- Precision: 0.0
- Recall: 0.0
- F1: 0.0
- Accuracy: 0.9886

## Model description

This model identifies negation words in sentences. It is a token classifier: each negation word is labeled "Y" in the token classification output.

Common negation words:

['halt', 'inhospitable', 'unhappy', 'unserviceable', 'dislike', 'unaware', 'unfavorable', 'barely', 'unseen', 'unoccupied', 'unreliability', 'insulator', 'stop', 'indistinguishable', 'unrestricted', 'unfairly', 'unsupervised', 'unicameral', 'forbid', 'unforgettable', 'reject', 'uneducated', 'unlimited', 'illegal', 'uncertainty', 'nonhuman', 'unborn', 'unshaven', 'uncanny', 'incomplete', 'unsure', 'unconscious', 'atypical', 'indirectly', 'unloaded', 'disadvantage', 'contrary', 'infrequent', 'unofficial', 'few', 'untouched', 'refuse', 'inequitable', 'disproportionate', 'unexpected', 'displeased', 'unpaved', 'unwieldy', 'not at all', 'absent', 'unnoticed', 'unpleasant', 'unsafe', 'unsigned', 'not', 'inaccurate', 'cannot', 'involuntary', 'unequipped', 'illiterate', 'cease', 'disagreeable', 'prohibit', 'unable', 'unstable', 'uninhabited', 'unclean', 'useless', 'disapprove', 'insensitive', 'in the absence of', 'impractical', 'unorthodox', 'untreated', 'unsuccessful', 'unwitting', 'unfashionable', 'disagreement', 'unmyelinated', 'unfortunate', 'unknown', 'ineffective', 'a lack of', 'instead of', 'refused', 'illegitimate', 'little', 'unpaid', 'fail', 'unintentionally', 'unglazed', "didn't", 'unprocessed', 'inability', 'undeveloped', 'exclude', 'neither', 'except', 'unequivocal', 'unconventional', 'incorrectly', 'unconditional', 'prevent',
'dissimilar', 'uncommon', 'inorganic', 'unquestionable', 'uncoated', 'unassisted', 'unprecedented', 'nonviolent', 'unarmed', 'unpopular', 'inadequate', 'uncomfortable', 'unwilling', 'unaffected', 'unfaithful', 'nobody', 'loss', 'without', 'undamaged', 'nothing', 'could not', 'impossible to', 'unaccompanied', 'unlike', 'oppose', 'compromising', 'unmarried', 'rarely', 'unlighted', 'inexperienced', 'rather than', 'unrelated', 'untied', 'dishonest', 'insecure', 'uneven', 'harmless', 'avoid', 'with the exception of', 'no', 'undefeated', 'no longer', 'inadvertently', 'absence', 'lack', 'unconnected', 'unfinished', 'invalid', 'unnecessary', 'invisibility', 'unusual', 'none', 'incredulous', 'impossible', 'never', 'untrained', 'incorrect', 'immobility', 'unclear', 'impartial', 'unlucky', 'deny', 'uncertain', 'hardly', 'unsaturated', 'informal', 'irregular', 'dissatisfaction']

## Intended uses & limitations

More information needed

## Training and evaluation data

The model is trained on the original sentences and negation cue annotations from the CondaQA dataset, which is available on both GitHub and Hugging Face.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 256
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1  | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:---:|:--------:|
| No log        | 1.0   | 6    | 0.0868          | 0.0       | 0.0    | 0.0 | 0.9762   |
| No log        | 2.0   | 12   | 0.0533          | 0.0       | 0.0    | 0.0 | 0.9762   |
| No log        | 3.0   | 18   | 0.0303          | 0.0       | 0.0    | 0.0 | 0.9878   |
| No log        | 4.0   | 24   | 0.0267          | 0.0       | 0.0    | 0.0 | 0.9886   |

### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.10.1
- Datasets 2.6.1
- Tokenizers 0.13.1
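
## Example usage

The model description says negation words are labeled "Y" by the token classifier. The sketch below shows one possible way to turn per-token predictions into negation cue strings. It rests on several assumptions that this card does not confirm: `MODEL_ID` is a placeholder because the full Hub repository path is not given here, the label exposed by the model config is assumed to be the literal string `"Y"`, and the predictions are assumed to have the shape returned by the Transformers `token-classification` pipeline without aggregation (dicts with `entity`, `start`, and `end`, which requires a fast tokenizer). `load_tagger` and `extract_negation_cues` are illustrative helper names, not part of any library.

```python
from typing import Dict, List

# Placeholder: the card does not state the full Hub repository path.
MODEL_ID = "<namespace>/roberta-large-condaqa-neg-tag-token-classifier"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def extract_negation_cues(
    text: str, predictions: List[Dict], cue_label: str = "Y"
) -> List[str]:
    """Return the substrings of `text` covered by tokens tagged `cue_label`.

    Sub-word pieces whose character spans touch are merged into one cue;
    tokens separated by whitespace are returned as separate cues.
    """
    spans: List[List[int]] = []
    for pred in predictions:
        if pred["entity"] != cue_label:  # assumed label name, see lead-in
            continue
        start, end = pred["start"], pred["end"]
        if spans and start <= spans[-1][1]:
            # Contiguous with the previous tagged piece: extend that span.
            spans[-1][1] = max(spans[-1][1], end)
        else:
            spans.append([start, end])
    return [text[s:e] for s, e in spans]
```

With a real repository path, a call would look like `tagger = load_tagger("...")` followed by `extract_negation_cues(text, tagger(text))`. Multi-word cues such as 'not at all' would come back as separate words under this merging rule; joining them is left as a design choice for the caller.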