• andallthat@lemmy.world
    22 days ago

    Basically, model collapse happens when the training data no longer matches real-world data

    I’m more concerned about LLMs collapsing the whole idea of “real-world”.

    I’m not a machine learning expert, but I do get the basic concept of training a model and then evaluating its output against real data. But the whole thing rests on the idea that you have a model trained on relatively small samples of the real world and a big, clearly distinct “real world” to check the model’s performance against.
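
    A toy sketch of that idea, assuming scikit-learn and with nothing to do with how LLMs are actually built: a small slice of a dataset plays the training sample, and the big held-out rest plays the “real world” you score against.

    ```python
    # Toy illustration only: a small classifier stands in for "the model",
    # and the held-out split stands in for "the real world". Assumes scikit-learn.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X = X / 16.0  # simple scaling so the solver converges quickly

    # Train on a relatively small sample...
    X_train, X_real, y_train, y_real = train_test_split(
        X, y, train_size=0.2, random_state=0
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # ...then check performance against the big, clearly distinct held-out set.
    print("accuracy on held-out data:", model.score(X_real, y_real))
    ```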

    If LLMs have already ingested basically all the information in the “real world”, and their output is so pervasive that you can’t easily tell what’s true and what’s AI-generated slop, then “how do we train our models now?” is not my main concern.

    As an example, take the judges who found made-up cases in court filings because the lawyers had used an LLM. What happens if those made-up cases get referenced in several other places, including some legal textbooks used in law schools? Don’t they become part of the “real world”?

    • WanderingThoughts@europe.pub
      22 days ago

      LLMs are not going to be the future. The tech companies know it and are working on reasoning models that can look things up to fact-check themselves. These are slower, use more power, and are still a work in progress.
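
      Roughly the shape of that, as a hypothetical sketch: draft_answer(), search_web() and supported() are stand-in stubs, not any vendor’s real API; the point is just the draft → check claims against retrieved sources → retry loop.

      ```python
      # Hypothetical sketch of "look things up to fact-check yourself".
      # All three helpers are stubs, not a real product API.
      def draft_answer(question: str, feedback: str = "") -> str:
          return f"Draft answer to {question!r} (feedback considered: {feedback!r})"

      def search_web(claim: str) -> list[str]:
          return [f"Snippet that may or may not support: {claim}"]

      def supported(claim: str, snippets: list[str]) -> bool:
          return any(claim in s for s in snippets)  # crude stand-in for real verification

      def fact_checked_answer(question: str, max_rounds: int = 3) -> str:
          answer = draft_answer(question)
          for _ in range(max_rounds):
              claims = answer.split(". ")  # naive claim extraction
              bad = [c for c in claims if not supported(c, search_web(c))]
              if not bad:
                  break  # every claim had some retrieved support
              answer = draft_answer(question, feedback=f"unsupported: {bad[0]}")
          return answer

      print(fact_checked_answer("Is the Moon 80% cheese?"))
      ```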

      • andallthat@lemmy.world
        22 days ago

        Look up stuff where? Some things are verifiable more or less directly: the Moon is not 80% made of cheese, adding glue to pizza is not healthy, the average human hand does not have seven fingers. A “reasoning” model might do better with those than current LLMs.

        But for a lot of our knowledge, verifying means “I say X because here are two reputable sources that say X”. For that kind of claim, having AI-generated text creep into everything (including peer-reviewed scientific papers, which tend to be considered reputable) is blurring the line between truth and “hallucination” for both LLMs and humans.

        • Aux@feddit.uk
          22 days ago

          Who said that adding glue to pizza is not healthy? Meat glue is used in restaurants all the time!

  • kate@lemmy.uhhoh.com
    22 days ago

    Surely if they start to get worse we’d just use the models that already exist? Didn’t click the link though.

    • Maestro@fedia.io
      22 days ago

      If you do that, then models won’t know any new information. For example, a model may think Biden is still president.

      • 3abas@lemm.ee
        22 days ago

        This is already a solved problem; we’re well past single-model systems, and any competitive AI offering can augment its information from the Internet.
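
        Something like the following, as a hypothetical sketch (the search backend and model call are stubs, not any specific product’s API): pull fresh snippets first, then answer from that context instead of from stale training data.

        ```python
        # Hypothetical sketch of retrieval augmentation; search_web() and
        # call_model() are stubs standing in for a real search API and LLM.
        def search_web(query: str) -> list[str]:
            return ["Snippet from a current news page...",
                    "Snippet from an up-to-date encyclopedia entry..."]

        def call_model(prompt: str) -> str:
            return f"(answer grounded in the provided context)\n{prompt[:120]}..."

        def answer_with_retrieval(question: str) -> str:
            context = "\n".join(search_web(question))
            prompt = (
                "Answer using only the context below.\n\n"
                f"Context:\n{context}\n\n"
                f"Question: {question}"
            )
            return call_model(prompt)

        print(answer_with_retrieval("Who is the current US president?"))
        ```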

          • Aux@feddit.uk
            22 days ago

            The Internet was always full of mental diarrhea. If you can’t reason about which content is correct and which is not, AI won’t change anything in your life.