Text Classification
Transformers
PyTorch
ONNX
Safetensors
English
deberta
Trained with AutoTrain

How to effectively use this?

#10
by geeek - opened

Sometimes the results seem all over the place. Scores that should be high come in below 0.5. What logic do you suggest for deciding — look at the top 3 labels, and if "OK" is among them, treat the text as safe? The numbers don't seem to behave like other models' and have some gaps: for harmful content the score should be close to 1, but it hardly goes above 0.5. Am I not using it correctly? Any tips?

If I use the model in Transformers.js, it only returns the first item, not all of them.
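If I remember right, the text-classification pipeline in Transformers.js returns only the top label by default, and there is a `topk`-style option to request more (the exact option name may vary between versions — check the docs for your release). A hedged sketch, plus a small helper that copes with the output being either a single object or an array:

```javascript
// Sketch only: the pipeline call below assumes @xenova/transformers and a
// `topk` option for text-classification — verify the exact option name in
// the Transformers.js docs for your version. MODEL_ID is a placeholder.
//
// import { pipeline } from '@xenova/transformers';
// const classifier = await pipeline('text-classification', 'MODEL_ID');
// const output = await classifier(text, { topk: 0 }); // request all labels

// Helper: the pipeline may hand back a single {label, score} object or an
// array of them; normalize to an array sorted by score, highest first.
function normalizeScores(output) {
  const arr = Array.isArray(output) ? output : [output];
  return [...arr].sort((a, b) => b.score - a.score);
}
```

With that, downstream code can always assume a sorted array of `{label, score}` pairs regardless of how many labels the pipeline returned.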

My plan was to just sort by the worst scores and flag content that exceeds a threshold (e.g. > 0.5).
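That plan can be sketched in a few lines, assuming the classifier output is an array of `{label, score}` objects and that "OK" is the safe label (both assumptions, not confirmed here):

```javascript
// Flag content whose worst (highest-scoring) non-OK label exceeds a threshold.
// Assumes `scores` is an array of {label, score} objects from the classifier
// and that the safe label is named 'OK'.
function flagByThreshold(scores, threshold = 0.5) {
  const worst = scores
    .filter((s) => s.label !== 'OK')
    .sort((a, b) => b.score - a.score)[0];
  return worst !== undefined && worst.score > threshold;
}
```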

Been testing it a bit and I'm seeing varying results.
Sometimes it correctly gives "S 0.6641" on a text that clearly implies something is about to go down, but then another, even more suggestive text gets "S 0.0121" with low scores across the board.

I've even seen texts outright containing one of the really bad 'r' words score as low as around 0.25. This makes me think a flat threshold is perhaps not enough on its own.

In my flagging attempts, the OK score was also low. Perhaps a low "OK" score can itself be a useful indicator that something is off about the content.

So maybe flag anything where the OK score isn't the highest score? And perhaps also flag content that has any bad-category value above 0.2 when OK is below 0.6? (Not an expert on this, just throwing out ideas.)
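Those two ideas combined might look something like this — the 0.2 / 0.6 thresholds are just the guesses above, and the "OK" label name and `{label, score}` output shape are assumptions:

```javascript
// Combined heuristic from the thread: flag when OK is not the top label,
// or when any other label exceeds `badThreshold` while OK is below `okFloor`.
// Assumes `scores` is an array of {label, score} objects with an 'OK' label.
function shouldFlag(scores, { badThreshold = 0.2, okFloor = 0.6 } = {}) {
  if (scores.length === 0) return false;
  const ok = scores.find((s) => s.label === 'OK');
  const okScore = ok ? ok.score : 0;
  const top = scores.reduce((a, b) => (b.score > a.score ? b : a));
  if (top.label !== 'OK') return true; // OK is not the winning label
  return (
    scores.some((s) => s.label !== 'OK' && s.score > badThreshold) &&
    okScore < okFloor
  );
}
```

This would catch the "S 0.25 but OK also low" cases that slip past a flat 0.5 cutoff, at the cost of more false positives — the thresholds would need tuning on real data.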
