# My Toxicity Debiaser Pipeline
This custom pipeline debiases toxic text using a toxicity classifier and GPT-2.
## Usage
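To use this pipeline, clone this repository, load the required models and tokenizers, and then import the `MyToxicityDebiaserPipeline` class from the cloned code. The lines starting with `!` and `%` are notebook commands, so run the snippet in Jupyter or Colab: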
```python
# Notebook commands (Jupyter/Colab): clone the repository and move into it
!git lfs install
!git clone https://huggingface.co/shainaraza/toxicity_debias_pipeline
%cd toxicity_debias_pipeline

from transformers import AutoTokenizer, AutoModelForSequenceClassification, GPT2LMHeadModel, GPT2Tokenizer
from my_toxicity_debiaser import MyToxicityDebiaserPipeline

toxicity_model_name = "shainaraza/toxity_classify_debiaser"
gpt_model_name = "gpt2"

# Toxicity classifier and its tokenizer
toxicity_tokenizer = AutoTokenizer.from_pretrained(toxicity_model_name)
toxicity_model = AutoModelForSequenceClassification.from_pretrained(toxicity_model_name)

# GPT-2 and its tokenizer
gpt_tokenizer = GPT2Tokenizer.from_pretrained(gpt_model_name)
gpt_model = GPT2LMHeadModel.from_pretrained(gpt_model_name)

# Combine both models in the custom pipeline
pipeline = MyToxicityDebiaserPipeline(
    model=toxicity_model,
    tokenizer=toxicity_tokenizer,
    gpt_model=gpt_model,
    gpt_tokenizer=gpt_tokenizer,
)

text = "Your example text here"
result = pipeline(text)
print(result)
```
## Tips
Here are some tips for tuning the GPT-2 generation parameters to improve the quality of the generated text:
- `max_length`: The maximum length of the generated sequence, in tokens. In `transformers`, this count includes the prompt tokens. A longer length leaves room for more context, but the output may become less coherent.
- `top_p`: Controls diversity through nucleus sampling: only the most likely tokens whose cumulative probability reaches `top_p` are candidates at each step. A lower value gives more conservative and predictable text, while a higher value gives more diverse and creative text.
- `temperature`: Controls randomness by rescaling the next-token scores before sampling. A lower value gives more conservative and predictable text, while a higher value gives more diverse and creative text.

Experiment with different values of each parameter to find the right balance for your use case.
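
To make the effect of `temperature` and `top_p` concrete, here is a minimal sketch in plain Python that applies both to a toy next-token distribution. It is not the pipeline's own code: the helper names `apply_temperature` and `top_p_filter` and the four-token scores are hypothetical, chosen only for illustration. In `transformers`, the corresponding arguments to `generate` take effect only when sampling is enabled (`do_sample=True`).

```python
import math

def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize over the kept tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy scores for a hypothetical four-token vocabulary
logits = [2.0, 1.0, 0.5, -1.0]

# Lower temperature sharpens the distribution; higher temperature flattens it
sharp = apply_temperature(logits, 0.5)
flat = apply_temperature(logits, 2.0)
print("temperature=0.5:", [round(p, 3) for p in sharp])
print("temperature=2.0:", [round(p, 3) for p in flat])

# Lower top_p keeps fewer candidate tokens; higher top_p keeps more
base = apply_temperature(logits, 1.0)
narrow = top_p_filter(base, 0.5)
wide = top_p_filter(base, 0.95)
print("top_p=0.5 keeps tokens:", sorted(narrow))
print("top_p=0.95 keeps tokens:", sorted(wide))
```

With a low temperature the top token takes most of the probability mass, and with a low `top_p` only the top token remains a candidate, which is why both settings make the output more predictable.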
As for the prompt, try several variants to see which one works best for your specific use case. You can also pre-process the input text to remove biased or offensive language before passing it to the GPT-2 model. Additionally, consider fine-tuning GPT-2 on your specific task to improve its performance.
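
One simple form of the pre-processing mentioned above is masking known offensive terms before the text reaches GPT-2. The sketch below assumes a small hand-written blocklist; the `BLOCKLIST` contents, the `mask_blocklisted` helper, and the `[MASKED]` token are all hypothetical and are not part of this repository.

```python
import re

# Hypothetical blocklist, for illustration only; a real list would be task-specific
BLOCKLIST = ["stupid", "idiot"]

def mask_blocklisted(text, blocklist=BLOCKLIST, mask="[MASKED]"):
    """Replace whole-word, case-insensitive matches of blocklisted terms."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in blocklist) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(mask, text)

cleaned = mask_blocklisted("That was a Stupid idea, you idiot.")
print(cleaned)
```

A word list like this only catches exact terms, so it is best treated as a first filter rather than a replacement for the toxicity classifier.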