
My Toxicity Debiaser Pipeline

This custom pipeline debiases toxic text using a toxicity classifier and GPT-2.

Usage

To use this pipeline, you first need to download the required models and tokenizers, and then import the MyToxicityDebiaserPipeline class:

!git lfs install
!git clone https://huggingface.co/shainaraza/toxicity_debias_pipeline

%cd toxicity_debias_pipeline



from transformers import AutoTokenizer, AutoModelForSequenceClassification, GPT2LMHeadModel, GPT2Tokenizer
from my_toxicity_debiaser import MyToxicityDebiaserPipeline

toxicity_model_name = "shainaraza/toxity_classify_debiaser"
gpt_model_name = "gpt2"

toxicity_tokenizer = AutoTokenizer.from_pretrained(toxicity_model_name)
toxicity_model = AutoModelForSequenceClassification.from_pretrained(toxicity_model_name)

gpt_tokenizer = GPT2Tokenizer.from_pretrained(gpt_model_name)
gpt_model = GPT2LMHeadModel.from_pretrained(gpt_model_name)

pipeline = MyToxicityDebiaserPipeline(
    model=toxicity_model,
    tokenizer=toxicity_tokenizer,
    gpt_model=gpt_model,
    gpt_tokenizer=gpt_tokenizer,
)

text = "Your example text here"
result = pipeline(text)
print(result)

Tips

Here are some tips for tuning the GPT-2 generation settings to improve the quality of the generated text:

- max_length: Controls the maximum length of the generated text, in tokens. In the transformers generate API this count includes the prompt tokens, so leave room for the input. A larger value allows longer output with more context, but long outputs tend to become less coherent.

- top_p: Controls diversity through nucleus sampling: at each step, sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches top_p. Lower values give more conservative and predictable text, while values close to 1.0 give more diverse and creative text.

- temperature: Controls randomness by rescaling the token probabilities before sampling. Values below 1.0 sharpen the distribution toward the most likely tokens, while values above 1.0 flatten it and produce more varied, less predictable text. Experiment with top_p and temperature together to find the right balance.
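
To make the difference between top_p and temperature concrete, here is a minimal sketch in plain Python that applies both to a toy next-token distribution. It is an illustration only, not the code used inside the pipeline or inside transformers: the four logits are made-up values, and apply_temperature and top_p_filter are hypothetical helper names.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Toy scores for a 4-token vocabulary (made-up values for illustration).
logits = [2.0, 1.0, 0.5, 0.0]

# Lower temperature concentrates probability on the top token.
sharp = apply_temperature(logits, 0.5)
flat = apply_temperature(logits, 1.5)
print("top-token probability:", round(sharp[0], 3), "vs", round(flat[0], 3))

# Lower top_p keeps fewer candidate tokens.
probs = apply_temperature(logits, 1.0)
print(sorted(top_p_filter(probs, 0.5)))  # [0]
print(sorted(top_p_filter(probs, 0.9)))  # [0, 1, 2]
```

In this toy case, temperature changes how peaked the distribution is across all tokens, while top_p cuts off the unlikely tail entirely. That is why the two settings are usually tuned together.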

As for the prompt itself, try several wordings to see which one works best for your use case. You can also pre-process the input text to remove biased or offensive language before passing it to GPT-2. Finally, consider fine-tuning GPT-2 on your specific task to improve its performance.
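
The pre-processing idea could be sketched as a simple word-masking step. This is an assumption about one possible approach, not part of the pipeline: BLOCKLIST holds placeholder terms and mask_blocklisted is a hypothetical helper. A real deployment would need a vetted lexicon or a classifier rather than a short word list.

```python
import re

# Hypothetical placeholder blocklist; replace with a vetted lexicon.
BLOCKLIST = {"idiot", "stupid"}

def mask_blocklisted(text, blocklist=BLOCKLIST, mask="[removed]"):
    """Replace whole-word, case-insensitive matches of blocklisted terms."""
    if not blocklist:
        return text
    alternatives = "|".join(re.escape(word) for word in sorted(blocklist))
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(mask, text)

cleaned = mask_blocklisted("That was a Stupid idea, you idiot.")
print(cleaned)  # That was a [removed] idea, you [removed].
```

The cleaned string can then be passed to the pipeline in place of the raw text. Word-level masking misses context-dependent toxicity, so treat it as a coarse first pass.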