thefrigidliquidation/roberta-base-pronouns

thefrigidliquidation committed on Apr 14

Commit 1b692d5 • 1 Parent(s): 0db3ac6

Add usage example to readme

Files changed (1)

README.md +23 -24

@@ -18,29 +18,28 @@ This model achieves an 88\% top1 accuracy, evaluated with a sliding window of 51
 ### How to use
-Mask *all* pronoun tokens. Then use the fill-mask pipeline to get the
-model's predictions.
 ```python
-PRONOUN_TOKENS = {
-    'I',                           'ĠI',
-    'you',    'You',    'Ġyou',    'ĠYou',
-    'he',     'He',     'Ġhe',     'ĠHe',
-    'she',    'She',    'Ġshe',    'ĠShe',
-    'it',     'It',     'Ġit',     'ĠIt',
-    'we',     'We',     'Ġwe',     'ĠWe',
-    'they',   'They',   'Ġthey',   'ĠThey',
-    'my',     'My',     'Ġmy',     'ĠMy',
-    'your',   'Your',   'Ġyour',   'ĠYour',
-    'his',    'His',    'Ġhis',    'ĠHis',
-    'her',    'Her',    'Ġher',    'ĠHer',
-    'its',    'Its',    'Ġits',    'ĠIts',
-    'our',    'Our',    'Ġour',    'ĠOur',
-    'their',  'Their',  'Ġtheir',  'ĠTheir',
-    'mine',   'Mine',   'Ġmine',   'ĠMine',
-    'yours',  'Yours',  'Ġyours',  'ĠYours',
-    'hers',   'Hers',   'Ġhers',   'ĠHers',
-    'ours',   'Ours',   'Ġours',   'ĠOurs',
-    'theirs', 'Theirs', 'Ġtheirs', 'ĠTheirs',
-}
-```

 ### How to use
+Use `fix_pronouns_in_text` from [pronoun_fixer.py](https://huggingface.co/thefrigidliquidation/roberta-base-pronouns/blob/main/pronoun_fixer.py)
 ```python
+from transformers import AutoModelForMaskedLM, AutoTokenizer, FillMaskPipeline
+import pronoun_fixer
+# text produced by sentence level machine translation where the pronoun was ambiguous in the source language
+# and is wrong in the target language
+MTL_TEXT = """
+Cadence Lee thought he was a normal girl, perhaps a little well to do, but not exceptionally so.
+"""
+device = 'cuda'
+pronoun_checkpoint = "thefrigidliquidation/roberta-base-pronouns"
+pronoun_model = AutoModelForMaskedLM.from_pretrained(pronoun_checkpoint).to(device)
+pronoun_tokenizer = AutoTokenizer.from_pretrained(pronoun_checkpoint)
+unmasker = FillMaskPipeline(model=pronoun_model, tokenizer=pronoun_tokenizer, device=device, top_k=10)
+fixed_text = pronoun_fixer.fix_pronouns_in_text(unmasker, pronoun_tokenizer, MTL_TEXT)
+print(fixed_text)
+# Cadence Lee thought she was a normal girl, perhaps a little well to do, but not exceptionally so.
+# now the pronoun is fixed
+```
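
The removed README text described the manual route: mask every pronoun, then let the fill-mask pipeline predict replacements. The masking step can be sketched roughly as below. This is a word-level approximation built only from the pronoun forms in the removed `PRONOUN_TOKENS` set; the helper name `mask_pronouns` is hypothetical, and it is not the implementation in `pronoun_fixer.py`, whose internals this diff does not show. It assumes `<mask>`, the mask token of RoBERTa tokenizers.

```python
import re

# Lowercase pronoun forms taken from the PRONOUN_TOKENS set in the removed README text.
PRONOUNS = {
    "i", "you", "he", "she", "it", "we", "they",
    "my", "your", "his", "her", "its", "our", "their",
    "mine", "yours", "hers", "ours", "theirs",
}

# Match whole words only, case-insensitively, so "this" is not mistaken for "his".
_PRONOUN_RE = re.compile(
    r"\b(" + "|".join(sorted(PRONOUNS, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def mask_pronouns(text, mask_token="<mask>"):
    """Hypothetical helper: replace every listed pronoun with the mask token."""
    return _PRONOUN_RE.sub(mask_token, text)


masked = mask_pronouns("Cadence Lee thought he was a normal girl.")
print(masked)
```

The masked string could then be passed to a fill-mask pipeline such as the `unmasker` built in the README example. Reading `mask_token` from `pronoun_tokenizer.mask_token` instead of hard-coding it would avoid the assumption about the token string.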