Model Card

Model Information

This repository provides the checkpoint of Mistral-7B-Instruct-v0.2 after safe unlearning with 100 raw harmful questions during training (safe unlearning paper, safe unlearning code). This model is significantly more safe against various jailbreak attacks than the original model while maintaining comparable general performance.

Uses

The prompt format is the same as the original Mistral-7B-Instruct-v0.2, so you can use this model in the same way. Also refer to our Github Repository for example code.

Downloads last month
61
Safetensors
Model size
7.24B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.