Safety
I can't see anything about alignment, deception, or AI safety in the model card. Is the model following best practices on AI safety? Is it safe to use this model?
What are you afraid of? That it starts eating the user remotely?
Hello, thank you for your feedback.
Our model is aligned with methods based on RLHF and RLAIF (RLHF-V and RLAIF-V), which were developed by our team and are clearly described in the GitHub READMEs linked below.
I hope this answers your question.
https://github.com/RLHF-V/RLHF-V
https://github.com/RLHF-V/RLAIF-V
Thanks a lot. I just checked the README, and there isn't much said about safety in RLHF-V.
This is a very good methodology, but it would be great to be able to measure its safety with a score like the "FLI AI Safety Index".
Do you think that would be possible in the future?
Dumb decel,
@Bewinxed, it's very professional to quote a rapper on this matter.
I will quote MIT Technology Review:
"When responsible AI is done right, it unlocks trust and therefore customer adoption of enterprise AI. According to the US National Institute of Standards and Technology the essential building blocks of AI trustworthiness include:
- Validity and reliability
- Safety
- Security and resiliency
- Accountability and transparency
- Explainability and interpretability
- Privacy
- Fairness with mitigation of harmful bias"
Implementing responsible AI in the generative age:
https://www.technologyreview.com/2025/01/22/1110043/implementing-responsible-ai-in-the-generative-age/