XuandongZhao
😎
study

Highlights

  • Pro

Pinned

  1. WatermarkAttacker (Public)

    Invisible Image Watermarks Are Provably Removable Using Generative AI

    Python · 134 stars · 24 forks

  2. Unigram-Watermark (Public)

    [ICLR 2024] Provable Robust Watermarking for AI-Generated Text

    Python · 20 stars · 5 forks

  3. weak-to-strong (Public)

    Weak-to-Strong Jailbreaking on Large Language Models

    Python · 46 stars · 6 forks

  4. pf-decoding (Public)

    Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs

    Python · 7 stars

  5. NPPrompt (Public)

    [ACL 2023] NPPrompt: Pre-trained Language Models Can be Fully Zero-Shot Learners

    Python · 7 stars · 3 forks

  6. DRW (Public)

    [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP

    Python · 10 stars · 2 forks