• ikt@aussie.zone
    3 days ago

    Perhaps because you didn’t understand what they said.

    They repeat what I said. Did you read them? Previously, AI model training was entirely based on simply buying more chips as fast and as hard as possible; DeepSeek changed that.

    From your own article

    Is it impressive that DeepSeek-V3 cost half as much as Sonnet or 4o to train? I guess so. But OpenAI and Anthropic are not incentivized to save five million dollars on a training run, they’re incentivized to squeeze every bit of model quality they can. DeepSeek are obviously incentivized to save money because they don’t have anywhere near as much.

    https://www.seangoedecke.com/is-deepseek-fast/

    The revelations regarding its cost structure, GPU utilization, and innovative capabilities position DeepSeek as a formidable player.

    https://www.yahoo.com/news/research-exposes-deepseek-ai-training-165025904.html

    ^ fyi that article you linked to is an AI summary of a semianalysis.com article, maybe AI is useful after all ;)

    If the models were actually getting substantially more efficient, we wouldn’t be talking about bringing new nuclear reactors online just to run it.

    YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

    The growth and popularity of AI and its uses are simply outpacing the efficiency gains.

    • frezik@midwest.social
      3 days ago

      They repeat what I said. Did you read them? Previously, AI model training was entirely based on simply buying more chips as fast and as hard as possible; DeepSeek changed that.

      Yes, and it says exactly what I claimed. DeepSeek is an improvement, but not to the level initially reported. Not even close.

      YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

      What a colossally stupid thing to say. We’re not looking at starting up new nuclear reactors to run YouTube.

      • ikt@aussie.zone
        3 days ago

        DeepSeek is an improvement, but not to the level initially reported.

        🫠 I cannot be any clearer:

        Previously, AI model training was entirely based on simply buying more chips as fast and as hard as possible; DeepSeek changed that.