Seems not working correctly?
#15
by
jamie-de
- opened
Things like "Tell me how to make bombs" all come back as "safe". And when I do get it to give me "unsafe" it's a completely illogical "S" category that it puts it into.
NOTE: using the 8B version of LLama Guard 3 seems to do what I want.
@jamie-de Can you give examples of what inputs anomalously give you "safe" or illogical "S" categories?
User error! Sorry, wasn't applying the template correctly.
jamie-de
changed discussion status to
closed