byroneverson
/

gemma-2-27b-it-abliterated

gemma-2-27b-it-abliterated / README.md

Update README.md

b936508 verified 4 months ago

	---
	base_model: google/gemma-2-27b-it
	pipeline_tag: text-generation
	license: apache-2.0
	language:
	- en
	tags:
	- gemma
	- gemma-2
	- chat
	- it
	- abliterated
	library_name: transformers
	---


	NOTE: This model is currently a work in progress (WIP).

	Abliteration method:

	1. Obtain the refusal direction with llama-cpp-python.
	2. Orthogonalize the weights with torch, writing directly to the .safetensors files (one file at a time).
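Step 2 can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: it assumes the refusal direction is a single vector of size `d_model` and that each edited weight matrix writes into the residual stream with shape `(d_model, d_in)`. The function name `orthogonalize` and the random tensors are hypothetical stand-ins.

```python
import torch


def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component along `direction` from the output space of `weight`.

    weight:    (d_model, d_in) matrix whose output is added to the residual stream.
    direction: (d_model,) refusal direction; it is normalized here.
    """
    r = direction / direction.norm()
    # W' = W - r (r^T W): project every output column off the refusal direction.
    return weight - torch.outer(r, r @ weight)


# Toy example with random data (stand-ins for a real weight and direction).
torch.manual_seed(0)
W = torch.randn(8, 4)
r = torch.randn(8)
W_abl = orthogonalize(W, r)

# After the edit, the matrix can no longer write along the refusal direction.
residual = (r / r.norm()) @ W_abl
```

In a real run, each tensor would presumably be loaded from and saved back to its .safetensors file (for example with `safetensors.torch.load_file` and `save_file`), processing one file at a time to keep memory use low.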

	It is a rather large model, so it may take me another day or two to figure out which layer I should be using for the direction vector.

	First attempt: Layer 20 was used to obtain the refusal direction vector. Refusal mitigation partially worked but was not perfect.

	Second attempt (current): Layer 23 was used (the mid-point of the model). Using the half-way layer has worked with other models, so this should be fine.
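For illustration, here is one common way a refusal direction is computed from a single layer: the normalized difference between the mean hidden state on harmful prompts and the mean on harmless prompts. This is a sketch under assumptions, not necessarily what the notebook does. The function name `refusal_direction`, the random activations, and the layer count of 46 are assumptions to check against the model config.

```python
import torch


def refusal_direction(harmful: torch.Tensor, harmless: torch.Tensor) -> torch.Tensor:
    """Difference-of-means direction from one layer's hidden states.

    harmful, harmless: (n_prompts, d_model) activations collected at the same layer.
    Returns a unit vector of shape (d_model,).
    """
    diff = harmful.mean(dim=0) - harmless.mean(dim=0)
    return diff / diff.norm()


# Assumption: the model has 46 transformer layers, so the mid-point is layer 23.
num_layers = 46
layer = num_layers // 2

# Random stand-ins for activations captured at `layer`.
torch.manual_seed(0)
harmful_acts = torch.randn(32, 16) + 0.5
harmless_acts = torch.randn(32, 16)
direction = refusal_direction(harmful_acts, harmless_acts)
```

Trying a different layer, as in the first and second attempts above, only changes where the activations are collected; the rest of the computation stays the same.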

	# gemma-2-27b-it-abliterated
	Check out the <a href="https://huggingface.co/byroneverson/gemma-2-27b-it-abliterated/blob/main/abliterate-gemma-2-27b-it.ipynb">Jupyter notebook</a> for details of how this model was abliterated from gemma-2-27b-it.

	![Logo](https://huggingface.co/byroneverson/gemma-2-27b-it-abliterated/resolve/main/logo.png "Logo")