Google accused “commercially motivated” actors of trying to clone its Gemini AI, after itself indiscriminately scraping the web to train its own models.

  • wonderingwanderer@sopuli.xyz · 20 hours ago

    > once developed, would also be a bit more efficient than current models

    That’s not how it works, though. They’re not optimizing these models for efficiency. The business model they’re following is “just a few billion more parameters this time, and it’ll gain sentience for sure.”

    Which is ridiculous. AGI, even if it’s possible (which is doubtful), isn’t going to emerge from some highly advanced LLM.

    > in the meantime, consumer-grade hardware is only getting better and more powerful

    There’s currently a shortage of DDR5 RAM because these AI companies are buying up years’ worth of industrial output capacity…

    Some companies are shifting away from producing consumer-grade GPUs in order to meet demand coming from commercial data centers.

    It’s likely we’re at the peak of conventional computing, at least in terms of consumer hardware.

    > Why would you ask the uber-LLM to code you a new model that hasn’t been trained yet? Just ask it to give you one that already has all the training done and the weights figured out. Ask it to give you one that’s ready to go, right out of the box.

    That’s not something they’re capable of. They have a context window, and none of them has one anywhere near large enough to output billions of generated parameters. An LLM can give you a Python script that builds a model with a given number of layers, hidden sizes, and attention heads, with the weights drawn from a Gaussian distribution, but it can’t hand you one that’s already pre-trained.
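    The context-window objection holds up to back-of-the-envelope arithmetic. A rough sketch (the 7B-class configuration below is an illustrative assumption, not any particular model):

```python
def param_count(layers: int, hidden: int, vocab: int) -> int:
    """Rough decoder-only transformer size: embedding table plus,
    per layer, attention projections (~4*h^2) and the MLP (~8*h^2)."""
    return vocab * hidden + layers * 12 * hidden * hidden

# Illustrative 7B-class configuration (assumed values).
n_params = param_count(layers=32, hidden=4096, vocab=32000)

# Emitting each weight costs at least one token, so dumping the model
# verbatim needs billions of tokens; real windows top out around 10^5-10^6.
context_window = 128_000
print(f"{n_params:,} parameters, {n_params // context_window:,}x a 128k window")
```

    Even ignoring biases and bookkeeping, the output alone would be tens of thousands of times larger than the window.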

    Also, their NLP is designed to parse text, even code, but they already struggle with arithmetic. There’s no way one could generate a viable weight distribution, even if it had a 12-billion-token context window, because that’s not what they’re designed to predict.

    You’d have to run a script to get an untrained model and then pre-train it yourself. Or you can download a pre-trained model and fine-tune it, or use it as is.
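    That first step is trivial, which is the point: the script is tiny, and all the value lives in the pre-training that follows. A toy sketch in plain Python (hypothetical shapes, standard library only):

```python
import random

def init_layer(rows: int, cols: int, std: float = 0.02):
    """One Gaussian-initialized weight matrix: what an 'untrained model'
    actually is before any pre-training has touched it."""
    return [[random.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

# A stack of random layers is structurally a model, but it knows nothing;
# every useful weight value has to come from pre-training afterwards.
layers = [init_layer(8, 8) for _ in range(4)]
print(len(layers), "layers of", len(layers[0]), "x", len(layers[0][0]))
```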