WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
Abstract
The rapid advancement of generative models, facilitating the creation of hyper-realistic images from textual descriptions, has concurrently escalated critical societal concerns such as misinformation. Traditional fake detection mechanisms, although providing some mitigation, fall short in attributing responsibility for the malicious use of synthetic images. This paper introduces a novel approach to model fingerprinting that assigns responsibility for the generated images, thereby serving as a potential countermeasure to model misuse. Our method modifies generative models based on each user's unique digital fingerprint, imprinting a unique identifier onto the resultant content that can be traced back to the user. This approach, incorporating fine-tuning into Text-to-Image (T2I) tasks using the Stable Diffusion Model, demonstrates near-perfect attribution accuracy with a minimal impact on output quality. We rigorously scrutinize our method's secrecy under two distinct scenarios: one where a malicious user attempts to detect the fingerprint, and another where a user possesses a comprehensive understanding of our method. We also evaluate the robustness of our approach against various image post-processing manipulations typically executed by end-users. Through extensive evaluation of Stable Diffusion models, our method presents a promising and novel avenue for accountable model distribution and responsible use.
Community
Proposes WOUAF (Weight Modulation for User Attribution and Fingerprinting): imprint a unique identifier (ID) onto the content generated by a text-to-image diffusion model, without impacting output quality, so that images can be traced back to the user. Also tests bad-actor cases: a user who knows how the method works, or who attempts to find the identifier. Key points (code sketches follow the list):
- A mapping network produces an intermediate representation of the user fingerprint, which is initially sampled from a Bernoulli distribution (p = 0.5); see the first sketch below.
- Weight modulation: a per-layer affine transformation remaps this representation to block-wise dimensions (self-attention and convolution layers); see the second sketch below.
- Modulation is applied only to the decoder that generates the image from the latents; quality drops if it is applied to the SD U-Net.
- A ResNet-50-based decoder is trained to recover the fingerprint (identification) from a generated image, with a regularizing loss (perceptual distance) so that fingerprinting does not compromise output quality; see the third sketch below.
- A robustness loss (data-augmentation-like, but over various post-processing functions) can replace the plain cross-entropy-style decoding loss; see the fourth sketch below.
- Fine-tuned on MS-COCO; comparable to the engineered SD-DWT (Discrete Wavelet Transform) and RivaGAN baselines for user fingerprinting (identification/retrieval).
- Includes implementation details (weight modulation in alignment with StyleGAN2-ADA), additional results (attribution accuracy, CLIP score for text-image similarity, FID), generation schedulers (Euler and DDIM), and more results in the appendix.
From Intel and Arizona State University.
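First, a minimal sketch of the fingerprint sampling and the mapping network, assuming PyTorch; the fingerprint length (`FINGERPRINT_BITS`), representation width (`STYLE_DIM`), and MLP depth are hypothetical placeholders, not the paper's exact values:

```python
import torch
import torch.nn as nn

FINGERPRINT_BITS = 32  # hypothetical length of the user fingerprint phi
STYLE_DIM = 512        # hypothetical width of the intermediate representation

# Sample a per-user binary fingerprint from a Bernoulli(0.5) distribution.
phi = torch.bernoulli(torch.full((1, FINGERPRINT_BITS), 0.5))

# Mapping network: an MLP that lifts the binary fingerprint into a
# continuous intermediate representation (analogous to StyleGAN2's z -> w).
mapping = nn.Sequential(
    nn.Linear(FINGERPRINT_BITS, STYLE_DIM), nn.LeakyReLU(0.2),
    nn.Linear(STYLE_DIM, STYLE_DIM), nn.LeakyReLU(0.2),
)
w = mapping(phi)  # shape (1, STYLE_DIM)
```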
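Second, how weight modulation could look for one convolution layer, following the StyleGAN2 pattern the paper aligns with: a per-layer affine maps the fingerprint representation to per-channel scales that rescale the kernel, followed by demodulation. The class, shapes, and the inclusion of demodulation are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulatedConv2d(nn.Module):
    """Sketch of StyleGAN2-style weight modulation: the fingerprint
    representation w rescales this layer's kernel per input channel."""
    def __init__(self, in_ch, out_ch, kernel_size, style_dim):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size))
        # Affine layer that remaps w to this block's channel width.
        self.affine = nn.Linear(style_dim, in_ch)

    def forward(self, x, w):
        B, C, H, W = x.shape
        s = self.affine(w)  # (B, in_ch): per-channel modulation scales
        weight = self.weight.unsqueeze(0) * s[:, None, :, None, None]
        # Demodulation (as in StyleGAN2) keeps activation magnitudes stable.
        d = torch.rsqrt(weight.pow(2).sum(dim=(2, 3, 4)) + 1e-8)
        weight = weight * d[:, :, None, None, None]
        # Grouped conv so each sample in the batch uses its own kernel.
        weight = weight.reshape(-1, C, *self.weight.shape[2:])
        out = F.conv2d(x.reshape(1, B * C, H, W), weight,
                       padding=self.weight.shape[-1] // 2, groups=B)
        return out.reshape(B, -1, H, W)

# Usage: y = ModulatedConv2d(64, 64, 3, style_dim=512)(x, w)
```

In WOUAF this kind of modulation is applied to the Stable Diffusion decoder (latents to pixels), not the U-Net.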
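Third, a sketch of the training losses, assuming a torchvision ResNet-50 as the fingerprint decoder, binary cross-entropy over the fingerprint bits as the decoding loss, and the `lpips` package as a stand-in for the perceptual-distance regularizer; the weighting `lam` is a made-up hyperparameter:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50
import lpips  # pip install lpips; stand-in for the perceptual distance

FINGERPRINT_BITS = 32  # same hypothetical length as in the first sketch

# ResNet-50 with its classification head replaced by a bit predictor.
decoder = resnet50(weights=None)
decoder.fc = nn.Linear(decoder.fc.in_features, FINGERPRINT_BITS)

bce = nn.BCEWithLogitsLoss()
percep = lpips.LPIPS(net="vgg")  # expects images scaled to [-1, 1]

def training_losses(img_fingerprinted, img_original, phi, lam=1.0):
    # Decoding loss: recover the Bernoulli fingerprint bits from the image.
    logits = decoder(img_fingerprinted)
    loss_dec = bce(logits, phi)
    # Regularizer: keep the fingerprinted output perceptually close to the
    # unmodulated model's output, so quality is not compromised.
    loss_reg = percep(img_fingerprinted, img_original).mean()
    return loss_dec + lam * loss_reg
```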
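Fourth, the robustness loss can be sketched as decoding from a randomly post-processed image rather than the clean one; the specific transforms here (crop, blur, noise, a crude JPEG stand-in) are illustrative and need not match the paper's set:

```python
import random
import torch
import torchvision.transforms.functional as TF

def random_postprocess(img):
    """Apply a random end-user-style manipulation to a (B, C, H, W) image
    batch in [0, 1] before decoding, so the fingerprint survives it."""
    op = random.choice(["crop", "blur", "noise", "jpeg_like", "none"])
    if op == "crop":
        _, _, h, w = img.shape
        top, left = random.randint(0, h // 8), random.randint(0, w // 8)
        img = TF.resized_crop(img, top, left, h - h // 8, w - w // 8, [h, w])
    elif op == "blur":
        img = TF.gaussian_blur(img, kernel_size=5)
    elif op == "noise":
        img = (img + 0.02 * torch.randn_like(img)).clamp(0, 1)
    elif op == "jpeg_like":
        # Crude stand-in for JPEG compression: downsample then upsample.
        _, _, h, w = img.shape
        img = TF.resize(TF.resize(img, [h // 2, w // 2]), [h, w])
    return img

# Robustness variant of the decoding loss from the previous sketch:
# loss_robust = bce(decoder(random_postprocess(img_fingerprinted)), phi)
```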