Image-generating AI models like Midjourney, Stable Diffusion, and DALL-E have been trained on images scraped from the internet. Nightshade is a tool artists can use to fight back against this unauthorized use of their work.
Artists are suing AI companies like OpenAI, Google, Microsoft, and Stability AI over how they trained their AI models. These artists believe scraping their copyrighted material without consent or compensation should be illegal, but what recourse do artists have to prevent their work from being taken in the first place?
A University of Chicago professor has developed a tool to give artists a way to fight back against large-scale data scraping. The tool is called Nightshade, and it works by 'poisoning' image data: it makes subtle edits that are invisible to the human eye, so art can still be shared and enjoyed, but that mislead the machine-learning systems trained on it. Combined with mislabeled metadata, the poisoning attack can push a generative AI model toward incorrect results.
So how does Nightshade work?
“A big company just takes your data and there’s nothing artists can really do. OK. So, how can we help?” posits Shawn Shan, a graduate researcher at the University of Chicago and author of a paper detailing how prompt-specific poisoning works. “If you take my data, that’s fine. I can’t stop that, but I’ll inject a certain type of malicious or crafted data, so it will poison or damage your model if you take my data.”
“We designed it in such a way that it is very hard to separate what is bad data and what is good data from artists’ websites. So this can really give some incentives to both companies and artists just to work together on this thing, right? Rather than a company taking everything from artists because they can.”
Part of this works simply by supplying the wrong metadata: for example, including enough images of cats whose metadata describes them as dogs. But that attack is easy enough to counter, which is what makes Nightshade ingenious. Nightshade takes an image of a cat and subtly alters the pixels themselves so that, to the machine, it looks like a dog, creating a dataset of poisoned images in the process.
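To make the idea concrete, here is a minimal toy sketch of that kind of targeted perturbation. Everything in it is hypothetical and greatly simplified: a random linear classifier stands in for a real image model, a 64-number feature vector stands in for an image, and a signed-gradient step stands in for Nightshade's actual optimization. It only illustrates the principle of nudging a "cat" toward the "dog" class with small, bounded changes.

```python
import numpy as np

# Hypothetical stand-ins: a random 2-class linear classifier over
# 64-dimensional "image" features. Class 0 = cat, class 1 = dog.
rng = np.random.default_rng(0)
CAT, DOG = 0, 1
W = rng.normal(size=(2, 64))

def predict(features):
    """Return the class (0=cat, 1=dog) the toy classifier assigns."""
    return int(np.argmax(W @ features))

# A "cat image": features aligned with the cat weights, so the toy
# classifier confidently labels it a cat before any poisoning.
x = W[CAT] / np.linalg.norm(W[CAT])

# Nudge the features toward the dog class in small, bounded steps.
# For a linear model, the gradient of (dog score - cat score) with
# respect to the input is simply W[DOG] - W[CAT].
step = 0.02 * np.sign(W[DOG] - W[CAT])
x_adv = x.copy()
for _ in range(30):
    if predict(x_adv) == DOG:
        break  # the classifier now sees a "dog"
    x_adv = x_adv + step

print(predict(x), predict(x_adv))        # original vs. poisoned label
print(float(np.max(np.abs(x_adv - x))))  # perturbation stays small
```

In this sketch only a few tiny steps are needed to flip the label, while the largest change to any single feature stays far smaller than the features themselves, mirroring the article's point that the edits are imperceptible to humans but decisive for the model.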
Shan acknowledges that AI companies may detect and discard the poisoned images. “So you know, it’s possible for them to filter them out and say, OK, these are malicious data, let’s not train on them. In some sense, we also win in these cases because they remove the data we don’t want them to train on, right?”