SciShow Is Lying to You about AI. Here are the receipts.

dumnezero@piefed.social · 2 months ago

magic_lobster_party@fedia.io · 2 months ago

We know how each individual part works. That’s just basic math.

We don’t know for sure how all the trillion parts together produce the results they do. You can’t debug the model step by step to see how the prompt “generate image of a penguin” produces an image of a penguin, and not a polar bear. That’s what people mean by “we don’t know how AI works”.
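To make the “basic math” part concrete, here is a minimal sketch of a single artificial neuron, assuming a plain weighted sum followed by a ReLU. The names `relu`, `neuron`, `weights` and `bias` are made up for illustration; a real model chains an enormous number of these, which is where the step-by-step view stops being useful.

```python
def relu(x: float) -> float:
    # Pass positive values through, clip negatives to zero.
    return max(0.0, x)


def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, then the activation.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)


# Hypothetical numbers, chosen only so the arithmetic is easy to follow.
out = neuron([1.0, 2.0], [0.5, -0.25], 0.1)
print(out)
```

Every step here is easy to check by hand. The open question in the thread is what billions of these do together, not what one of them does.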

very_well_lost@lemmy.world · 2 months ago

Okay, but who cares? “Complex systems are difficult to predict” is a mathematical insight that’s like 2 centuries old at this point… and it hasn’t hindered us at all from gaining deep insights into how both individual complex systems work and how complex systems as a general class of phenomena work. I can’t keep track of all the masses and velocities of every individual air molecule in the room I’m sitting in, but I still know how the interactions of those particles give rise to the temperature and air pressure and general behavior of the atmosphere in the room.

People know how this shit works, and anyone telling you otherwise is either willfully ignorant or intentionally lying to you to feed a hype cycle with an end goal of making your life worse. People can’t afford to remain uneducated about this stuff anymore.

magic_lobster_party@fedia.io · 2 months ago

What’s interesting is how these complex models produce anything useful at all. We could very well have complex models that don’t produce anything other than random noise.

Prunebutt@slrpnk.net · 2 months ago

The reason “we” have these models is that they were deliberately trained not to output random noise. That part is well understood.

The only reason why we don’t know what exactly makes the model output an image of Garfield with boobs is the amount of data to sift through. Not because we don’t understand the processes.

magic_lobster_party@fedia.io · 2 months ago

Generalization is not a given. It’s possible to make complex models that perfectly memorize 100% of the training data, but produce garbage results if the input diverges ever so slightly from the training data.
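A toy sketch of that difference, assuming a made-up task (learn y = 2x from three examples). The `memorizer` is a plain lookup table and `generalizer` is a least-squares line through the origin; both names and the data are hypothetical, and returning `None` stands in for “garbage”.

```python
# Toy training set for the hypothetical task y = 2 * x.
train = {1.0: 2.0, 2.0: 4.0, 3.0: 6.0}


def memorizer(x):
    # Perfect recall of the training data and nothing else:
    # unseen inputs get None.
    return train.get(x)


def fit_slope(data):
    # Least-squares slope for a line through the origin.
    num = sum(x * y for x, y in data.items())
    den = sum(x * x for x in data)
    return num / den


slope = fit_slope(train)


def generalizer(x):
    # Uses the fitted rule, so it also answers for unseen inputs.
    return slope * x


print(memorizer(2.0), memorizer(2.5))      # seen vs. unseen input
print(generalizer(2.0), generalizer(2.5))
```

Both models score 100% on the training data; only one of them says anything sensible about 2.5.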

This generalization is a process that’s not fully understood. Earlier architectures struggled with this level of generalization, but transformers seem to handle it well.

Prunebutt@slrpnk.net · 2 months ago

Not overfitting is hard, yes. But it’s not “we have no idea how/why this works”-hard.

Valmond@lemmy.world · 2 months ago

That goes for Windows 11 too, and still we know how computers work.

magic_lobster_party@fedia.io · 2 months ago

Windows 11 is programmed by Microsoft engineers. I’m sure they have a good idea how it works. When you click a button, you get predictable results.

Neural networks are a different story. It’s difficult to predict what’s going to happen for a given prompt, and how adjustments to the weights affect the results.

There’s an article from last year where they found a “Golden Gate” neuron in Claude. Changing it to be always on caused the model to always mention the Golden Gate Bridge in its responses. How and why this works is AFAIK not fully understood. For some reason the model managed to generalize the concept of the Golden Gate Bridge into one single neuron.
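Roughly what “changing it to be always on” means, as a toy sketch. This is not the method from that article: it is a hand-built two-unit network with made-up weights, and `forward` and `clamp` are hypothetical names. It only shows the mechanics of overriding one hidden activation.

```python
def forward(x, clamp=None):
    # Hidden layer: two ReLU units with hand-picked (hypothetical) weights.
    hidden = [
        max(0.0, 1.0 * x[0] - 1.0 * x[1]),
        max(0.0, -1.0 * x[0] + 1.0 * x[1]),
    ]
    # Optionally force chosen hidden units to a fixed value,
    # ignoring whatever the input would have produced.
    if clamp:
        for idx, value in clamp.items():
            hidden[idx] = value
    # Output: a score for some made-up behaviour driven mostly by unit 0.
    return 3.0 * hidden[0] - 1.0 * hidden[1]


normal = forward([0.0, 1.0])                    # unit 0 is off for this input
forced = forward([0.0, 1.0], clamp={0: 10.0})   # unit 0 pinned high
print(normal, forced)
```

With unit 0 pinned, the score is high no matter what goes in. Doing the clamping is the easy part; explaining why one unit ends up carrying a whole concept is the hard part.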

Valmond@lemmy.world · 2 months ago

What a cute thought!

No one knows how “everything” works in old monolithic software. You just have to try and see what happens, and often you just don’t touch certain codebases because nobody really knows the ramifications if you change something in them. Windows 11 is probably way worse than any LLM. Try to share a simple folder on a simple home network and you’ll see some of the cruft.

Source: have worked on 30-40 year old monolithic software. In not one of those projects was there a single “engineer” who knew it all.

Neural networks have their fuzzy part of course, but software became not fully understandable a long time ago. IMO.

magic_lobster_party@fedia.io · 2 months ago

Of course, no single person fully understands the entirety of Windows. But I hope the people working with Windows understand at least a part of it.

The thing with LLMs is that no one really understands the purpose of one single neuron, how it relates to all other neurons, and how they together seem to be able to generalize high-level concepts like the Golden Gate Bridge. It’s just too much to map out.

Valmond@lemmy.world · 2 months ago

We do know how a single “neuron” relates to other neurons: it’s in the model. What gets complicated is the sheer number of them, of course.

So yes, we don’t intrinsically get to understand it all, but I think we can understand what it does, a bit like Windows 😁/j.

Fascinating subject, and we’re just scratching the surface IMO.