Overall, when tested on 40 prompts, DeepSeek was found to have energy efficiency similar to that of the Meta model, but it tended to generate much longer responses and therefore used 87% more energy.
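
A back-of-envelope reading of that result, assuming "efficiency" here means energy per generated token and using made-up numbers for per-token energy and reply length:

    # Toy numbers: same hypothetical energy per token for both models,
    # but DeepSeek's replies are ~87% longer, so total energy is ~87% higher.
    ENERGY_PER_TOKEN_J = 0.3          # hypothetical joules per generated token

    llama_tokens_per_reply    = 400   # hypothetical average reply length
    deepseek_tokens_per_reply = 748   # ~87% longer

    llama_energy    = ENERGY_PER_TOKEN_J * llama_tokens_per_reply
    deepseek_energy = ENERGY_PER_TOKEN_J * deepseek_tokens_per_reply

    print(f"Llama:    {llama_energy:.0f} J per reply")
    print(f"DeepSeek: {deepseek_energy:.0f} J per reply "
          f"(+{deepseek_energy / llama_energy - 1:.0%})")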

  • misk@sopuli.xyz · 2 days ago

    That’s kind of a weird benchmark. Wouldn’t you want a more detailed reply? How is quality measured? I thought the biggest technical feats here were the ability to run reasonably well in constrained-memory settings and the lower cost to train (and less energy used there).

    • wewbull@feddit.uk · 1 day ago

      This is more about the “reasoning” aspect of the model, where it outputs a bunch of “thinking” before the actual result. In a lot of cases it easily adds 2-3x onto the number of tokens that need to be generated. This isn’t really useful output; it’s the model getting into a state where it can respond better.
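
      Rough sketch of what that looks like, using the <think>…</think> tag convention that R1-style models emit and an invented response (word counts as a crude stand-in for tokens):

        # Split the visible answer from the chain-of-thought and compare sizes.
        # The sample text is invented, purely for illustration.
        sample = (
            "<think>The user asks for the capital of Australia. Sydney is the "
            "largest city, but the capital was deliberately placed between Sydney "
            "and Melbourne. That city is Canberra. Double-check: yes, Canberra is "
            "the capital.</think>"
            "The capital of Australia is Canberra, not Sydney, which people often "
            "guess first."
        )

        thinking, _, answer = sample.partition("</think>")
        thinking = thinking.removeprefix("<think>")

        print(f"thinking: {len(thinking.split())} words, "
              f"answer: {len(answer.split())} words, "
              f"overhead: {len(thinking.split()) / len(answer.split()):.1f}x")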

    • jacksilver@lemmy.world · 2 days ago

      Longer != detailed

      Generally what they’re calling out is that DeepSeek currently rambles more. With LLMs the challenge is getting to the right answer as succinctly as possible, because each extra word is a lot of time/money (see the back-of-envelope numbers at the end of this comment).

      That being said, I suspect that really it’s all roughly the same. We’ve been seeing this back and forth with LLMs for a while and DeepSeek, while using a different approach, doesn’t really break the mold.
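
      Back-of-envelope numbers (all invented) for the time/money point:

        # Output cost scales roughly linearly with reply length, so a reply
        # that is 3x longer is roughly 3x the money and generation time.
        PRICE_PER_1K_OUTPUT_TOKENS = 0.002   # hypothetical $/1K tokens
        REQUESTS_PER_DAY = 1_000_000         # hypothetical traffic

        def daily_cost(tokens_per_reply):
            return REQUESTS_PER_DAY * tokens_per_reply / 1000 * PRICE_PER_1K_OUTPUT_TOKENS

        concise  = daily_cost(200)   # tight answer
        rambling = daily_cost(600)   # same answer padded with filler

        print(f"concise:  ${concise:,.0f}/day")
        print(f"rambling: ${rambling:,.0f}/day (+${rambling - concise:,.0f})")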

    • Aatube@kbin.melroy.orgOP · 2 days ago

      The benchmark feels just like the referenced Jevons Paradox to me: efficiency gains are eclipsed by a rise in consumption to produce more/better products.
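
      A toy version of that arithmetic (numbers invented):

        # Jevons paradox: per-query energy halves, but usage quadruples
        # because queries got cheaper/better, so total consumption still rises.
        energy_per_query_before, queries_before = 1.0, 1_000
        energy_per_query_after,  queries_after  = 0.5, 4_000

        total_before = energy_per_query_before * queries_before   # 1,000 units
        total_after  = energy_per_query_after  * queries_after    # 2,000 units

        print(f"total energy: {total_before:.0f} -> {total_after:.0f} "
              f"({total_after / total_before:.1f}x, despite 2x efficiency)")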

    • Rhaedas@fedia.io · 2 days ago

      A more detailed and accurate reply is preferred, but length isn’t a measure of that. If anything, that’s the problem with most LLMs: they tend to ramble a bit more than they need to, and it’s hard (at least with just prompting) to rein that in so the answer is just the answer.
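
      A minimal sketch of the two blunt tools for that, assuming Hugging Face transformers and a small local chat model (the model name is a placeholder): a hard cap on new tokens, which just truncates, and a “be brief” instruction, which is the part models often ignore.

        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_name = "Qwen/Qwen2.5-0.5B-Instruct"   # placeholder; any chat model works
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name)

        messages = [
            {"role": "system", "content": "Answer in one short sentence. No preamble."},
            {"role": "user", "content": "Why is the sky blue?"},
        ]
        input_ids = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        )

        # max_new_tokens is a hard ceiling on reply length; the system prompt is
        # the soft request for brevity that's hard to enforce with prompting alone.
        output = model.generate(input_ids, max_new_tokens=60)
        print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))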