• PolarisFx@lemmy.dbzer0.com
            English
            arrow-up
            6
            arrow-down
            1
            ·
            9 months ago

            As someone who’s spent a couple of weeks down a Stable Diffusion rabbit hole, I can attest that they don’t need to be trained on CP to generate CP. Using some very popular checkpoints, I inadvertently created some images that I found questionable enough to immediately delete. And I wasn’t even using prompts to generate young girls. With the right prompts, I can easily see some of the more popular checkpoints pumping out CP.

          • SuddenlyBlowGreen@lemmy.world
            English
            arrow-up
            2
            arrow-down
            14
            ·
            9 months ago

            I think if it becomes widespread, like you want it to be, models that generate CSAM will be trained on such material, yes.

    • lolcatnip@reddthat.com
      English
      arrow-up
      17
      arrow-down
      1
      ·
      9 months ago

      Not child porn. AI produces images all the time of things that aren’t in its training set. That’s kind of the point of it.

      • SuddenlyBlowGreen@lemmy.world
        English
        arrow-up
        2
        arrow-down
        17
        ·
        edit-2
        9 months ago

        AI produces images all the time of things that aren’t in its training set.

        AI models learn statistical connections from the data they’re provided. They’re going to see connections we can’t, but they’re not going to create things that are not connected to their training data. The closer the connection, the better the result.

        It’s a pretty easy conclusion from that: CSAM will be used to train such models, and since training requires lots of data, and new data to create different and better models…

        • BetaDoggo_@lemmy.world
          English
          arrow-up
          3
          arrow-down
          1
          ·
          9 months ago

          Real material is being used to train some models, but suggesting that it will encourage the creation of more “data” is silly. The amount required to finetune a model is tiny compared to the amount that is already known to exist. Likewise, regular models haven’t driven people to create even more data to train on.

          • SuddenlyBlowGreen@lemmy.world
            English
            arrow-up
            2
            arrow-down
            1
            ·
            9 months ago

            Just like how regular models haven’t driven people to create even more data to train on.

            It has driven companies to try to get access to more data people generate to train the models on.

            Like ChatGPT on copyrighted books, or Google on emails, docs, etc.

            • BetaDoggo_@lemmy.world
              English
              arrow-up
              2
              ·
              9 months ago

              And what does that have to do with the production of CSAM? In the example given, the data already existed; they’ve just been more aggressive about collecting it.

              • SuddenlyBlowGreen@lemmy.world
                English
                arrow-up
                1
                arrow-down
                1
                ·
                9 months ago

                Well, now, in addition to regular pedos consuming CSAM, there are additional consumers: people who use huge datasets of it to train models.

                If there is an increase in demand, the supply will increase as well.

                • BetaDoggo_@lemmy.world
                  English
                  arrow-up
                  2
                  ·
                  9 months ago

                  Not necessarily. The same images would be consumed by both groups, so there’s no need for new data. This is exactly what artists are afraid of: image generation increases supply dramatically without increasing demand. The amount of data required is also pretty negligible. Maybe a few thousand images.