An Asian MIT student asked AI to turn an image of her into a professional headshot. It made her white with lighter skin and blue eyes.

Rona Wang, a 24-year-old MIT student, was experimenting with the AI image creator Playground AI to create a professional LinkedIn photo.

        • rebelsimile@sh.itjust.works
          1 year ago

          The “pre-training” is learning. Models are often then fine-tuned with additional training (the training that isn’t the “pre-training”), i.e. more learning, to achieve specific results.

      • postmateDumbass@lemmy.world
        1 year ago

        Humans will identify stereotypes in AI-generated materials that match the dataset.

        Assume the dataset will grow and eventually mimic reality.

        How will the law handle discrimination based on data-supported stereotypes?

        • Pipoca@lemmy.world
          1 year ago

          Assume the dataset will grow and eventually mimic reality.

          How would that happen, exactly?

          Stereotypes themselves and historical bias can bias data. And AI trained on biased data will just learn those biases.

          For example, in surveys, white people and black people self-report similar levels of drug use. However, for a number of reasons, poor black drug users are caught at a much higher rate than rich white drug users. If you train a model on arrest data, it’ll learn that rich white people don’t use drugs much but poor black people do tons of drugs. But that simply isn’t true.
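
          The arrest-data point above can be sketched with a few lines of arithmetic. This is only an illustration under made-up assumptions: the group names, population sizes, use rates, and arrest rates are hypothetical numbers, not real statistics. Both groups get the same true use rate, and only the chance of being caught differs.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population: int
    true_use_rate: float  # fraction of the group that actually uses drugs
    arrest_rate: float    # chance that a user in this group gets arrested


def expected_arrests(group: Group) -> float:
    """Expected number of drug arrests recorded for the group."""
    return group.population * group.true_use_rate * group.arrest_rate


def naive_use_rate(group: Group) -> float:
    """Per-capita arrests, i.e. what arrest data alone suggests about use."""
    return expected_arrests(group) / group.population


# Hypothetical numbers: identical true use, different enforcement.
groups = [
    Group("rich_white", 10_000, true_use_rate=0.10, arrest_rate=0.02),
    Group("poor_black", 10_000, true_use_rate=0.10, arrest_rate=0.10),
]

for g in groups:
    print(f"{g.name}: true use {g.true_use_rate:.3f}, "
          f"arrest-based estimate {naive_use_rate(g):.3f}")

# How much more "drug use" the arrest data attributes to the second group,
# even though the true rates are equal by construction.
ratio = naive_use_rate(groups[1]) / naive_use_rate(groups[0])
print(f"apparent ratio from arrest data: about {ratio:.1f}x")
```

          With these made-up inputs the gap in the arrest-based estimate comes entirely from the difference in arrest rates, which is the pattern a model trained only on arrest records would pick up.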

          • postmateDumbass@lemmy.world
            1 year ago

            The datasets will get better because people have started to care.

            Historically, much of the data used was what was easy and cheap to acquire. Surveys of classmates. Arrest reports. Publicly available, government-curated data.

            Good data costs money and time to create.

            The more people fact-check, the more flaws can be found and corrected. The more attention the dataset gets, the more funding is likely to come in to resurvey or whatever.

            It’s part of the peer review thing.

            • Pipoca@lemmy.world
              1 year ago

              It’s not necessarily a matter of fact checking, but of correcting for systemic biases in the data. That’s often not the easiest thing to do. Systems run by humans often have outcomes that reflect the biases of the people involved.

              The power of suggestion runs fairly deep with people. You can change a hiring manager’s opinion of a resume by changing only the name at the top of it. You can change the terms a college kid enrolled in a winemaking program uses to describe a white wine by adding a bit of red food coloring. Blind auditions for orchestras result in significantly more women being picked than unblinded auditions.

              Correcting for biases is difficult, and it’s especially difficult on very large data sets like the ones you’d use to train ChatGPT. I’m really not very hopeful that ChatGPT will ever reflect only justified biases, rather than the biases of the broader culture.

      • Altima NEO@lemmy.zip
        1 year ago

        That’s just stupid and shows a lack of understanding of how this all works.