DANG VAN KIEU

kirudang

AI & ML interests

Machine Learning Engineer

Recent Activity

commented on an article about 1 month ago
Introducing SynthID Text
liked a model 10 months ago
AbeHou/SemStamp-c4-sbert
updated a Space over 1 year ago
kirudang/Chatbot

Organizations

None yet

kirudang's activity

commented on Introducing SynthID Text about 1 month ago
Hello,

I applied the watermark to Llama 2 and used the publicly available pretrained detector "joaogante/dummy_synthid_detector".
The output is a probability, not a hard label of 1 (watermarked) or 0 (unwatermarked).
Could you help me choose a threshold and explain how to train the detector?

from transformers import (
    AutoTokenizer, BayesianDetectorModel, SynthIDTextWatermarkLogitsProcessor, SynthIDTextWatermarkDetector
)

# Load the detector. See examples/research_projects/synthid_text for training a detector.
detector_model = BayesianDetectorModel.from_pretrained("joaogante/dummy_synthid_detector")
logits_processor = SynthIDTextWatermarkLogitsProcessor(
    **detector_model.config.watermarking_config, device="cpu"
)
tokenizer = AutoTokenizer.from_pretrained(detector_model.config.model_name)
detector = SynthIDTextWatermarkDetector(detector_model, logits_processor, tokenizer)

# Test whether a certain string is watermarked
test_input = tokenizer(["This is a test input"], return_tensors="pt")
is_watermarked = detector(test_input.input_ids)
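
To make the threshold question concrete: one common way to turn a probability into a 0/1 decision is to pick a cutoff on held-out unwatermarked text so that the false-positive rate stays at or below a target. The sketch below only illustrates that idea in plain Python. It is not the SynthID or transformers procedure, and `threshold_for_fpr`, `to_label`, and the toy scores are hypothetical names and values invented for the example.

```python
def threshold_for_fpr(unwatermarked_scores, target_fpr):
    """Pick a cutoff so that at most `target_fpr` of the given
    unwatermarked scores lie strictly above it (hypothetical helper)."""
    ranked = sorted(unwatermarked_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # false positives we tolerate
    if allowed >= len(ranked):
        return float("-inf")  # every score may be flagged
    # Scores strictly above the cutoff get flagged, so use the
    # (allowed + 1)-th highest unwatermarked score as the cutoff.
    return ranked[allowed]


def to_label(score, threshold):
    """Return 1 (watermarked) if score is strictly above threshold, else 0."""
    return 1 if score > threshold else 0


# Toy detector scores for text known to be unwatermarked (made up for illustration).
toy_unwatermarked = [0.05, 0.10, 0.20, 0.30, 0.45, 0.15, 0.25, 0.35, 0.40, 0.60]

cutoff = threshold_for_fpr(toy_unwatermarked, target_fpr=0.1)
labels = [to_label(s, cutoff) for s in toy_unwatermarked]
```

In practice the scores would come from running the detector on a held-out set of text known to be unwatermarked, and the target false-positive rate depends on how costly a false accusation is for the application.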