The tool, called Nightshade, messes up training data in ways that could cause serious damage to image-generating AI models. It is intended as a way to fight back against AI companies that use artists’ work to train their models without the creators’ permission.

ARTICLE - Technology Review
ARTICLE - Mashable
ARTICLE - Gizmodo

The researchers tested the attack on Stable Diffusion’s latest models and on an AI model they trained themselves from scratch. When they fed Stable Diffusion just 50 poisoned images of dogs and then prompted it to create images of dogs itself, the output started looking weird—creatures with too many limbs and cartoonish faces. With 300 poisoned samples, an attacker can manipulate Stable Diffusion into generating images of dogs that look like cats.
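
The core idea behind this kind of prompt-specific poisoning is to perturb a “dog” image within a small, barely visible budget so that a model’s internal features read it as “cat”, then publish it with its honest “dog” caption. A minimal sketch of that feature-matching idea follows; it is not Nightshade’s actual optimization (see the paper linked in the comment below), and the `encode` function here is a made-up random-projection stand-in for the generator’s real image encoder.

```python
import torch

def encode(images: torch.Tensor) -> torch.Tensor:
    # Placeholder "feature extractor": a fixed random projection standing in
    # for the text-to-image model's image encoder (an assumption for the sketch).
    g = torch.Generator().manual_seed(0)
    proj = torch.randn(images[0].numel(), 128, generator=g)
    return images.flatten(1) @ proj

def poison(dog_img: torch.Tensor, cat_img: torch.Tensor,
           eps: float = 0.05, steps: int = 100, lr: float = 0.01) -> torch.Tensor:
    """Perturb a 'dog' image (within an eps budget, so it still looks like a dog)
    so its features move toward a 'cat' anchor image. Paired with its original
    'dog' caption, the sample nudges a model trained on it to draw cat-like dogs."""
    delta = torch.zeros_like(dog_img, requires_grad=True)
    target = encode(cat_img.unsqueeze(0)).detach()
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feat = encode((dog_img + delta).clamp(0, 1).unsqueeze(0))
        loss = torch.nn.functional.mse_loss(feat, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation visually small
    return (dog_img + delta).detach().clamp(0, 1)

# Toy usage with random "images":
dog, cat = torch.rand(3, 64, 64), torch.rand(3, 64, 64)
poisoned_dog = poison(dog, cat)  # publish this alongside the original "dog" caption
```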

  • possibly a cat@lemmy.ml · 1 year ago

    lain?!

    I see data poisoning as a continuing trend. However, this is a tricky battle.

    If you have a very distinctive digital art style, it has not been published on the internet yet, and you poison every single piece of work you create, then you might have some success foiling copycats. If you are a physical artist, you would also have to prevent pictures of your works from ever being shared.

    As the Tech Review article and other commenters mention, the images first have to be included in training sets. And those training sets are huge. (I don’t think many people recognize small/bespoke models as a major threat at this point in time.) You might get some short-lived success when a new fad first comes out and the training sets are minimal.
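
    As a very rough illustration of that scale problem (every number below is an assumption made up for the sketch, not a figure from the paper or from any dataset’s documentation): because the attack is prompt-specific, the denominator that matters is the number of images whose captions mention the targeted concept, but even that count is large for a popular concept in a web-scale set.

    ```python
    # Back-of-envelope only; both inputs are assumptions for illustration.
    total_images = 5_000_000_000   # assumed size of a web-scale training set
    dog_fraction = 0.001           # assumed share of captions mentioning "dog"
    poisoned = 300                 # samples the article says shifted "dog" toward "cat"

    dog_images = total_images * dog_fraction
    print(f"~{dog_images:,.0f} dog-captioned images (assumed)")
    print(f"poisoned share of the whole set:  {poisoned / total_images:.8%}")
    print(f"poisoned share of dog images:     {poisoned / dog_images:.4%}")
    ```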

    What would really be needed is the automatic inclusion of data poisoning in widely popular data/media sharing services, and then time for them to poison the data pool.
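
    A minimal sketch of where such a hook could sit in a sharing service’s upload path; every name here (`Image`, `store`, `poison`, `handle_upload`) is a hypothetical stand-in, not any real service’s API:

    ```python
    from dataclasses import dataclass

    @dataclass
    class Image:
        pixels: bytes
        caption: str

    def store(img: Image) -> None:
        pass  # placeholder for the service's storage backend

    def poison(img: Image) -> Image:
        return img  # placeholder for an actual perturbation step (see sketch above)

    def handle_upload(img: Image, user_opted_in: bool) -> None:
        # Poison-on-upload: the copy that becomes publicly scrapeable is the
        # perturbed one; the uploader can keep an unmodified original privately.
        store(poison(img) if user_opted_in else img)
    ```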

    However, even that is fallible. I expect that, given the way private property works, if data poisoning ever took off at scale then courts would rule it illegal (or at least civilly liable for damages). Also, AI researchers are already looking into ways to overcome data poisoning, and their countermeasures can be applied retroactively to already-collected data.
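
    For example, one generic countermeasure is simply filtering image/caption pairs that disagree before training. A hedged sketch of that idea follows; the `alignment` scorer is a placeholder (e.g. something CLIP-like), and this is not a method taken from the paper:

    ```python
    from typing import Callable, List, Tuple

    Pair = Tuple[bytes, str]  # (image bytes, caption)

    def filter_training_set(pairs: List[Pair],
                            alignment: Callable[[bytes, str], float],
                            threshold: float = 0.2) -> List[Pair]:
        """Drop pairs whose image and caption agree poorly under some image-text
        alignment score, on the theory that a poisoned 'dog' caption attached to
        cat-like features scores low. A strong enough attack can evade this, and
        honest but oddly-captioned images get discarded too."""
        return [(img, cap) for img, cap in pairs if alignment(img, cap) >= threshold]
    ```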

    Nonetheless, I find data poisoning to be an important and intriguing research field. There are valid use cases even if it never successfully extends artists’ control over their works.

    Edit: I don’t know if I was being dense, but it took me a while to find the study, so here’s the link - https://arxiv.org/abs/2310.13828v1