SciShow Is Lying to You about AI. Here are the receipts.

dumnezero@piefed.social · 2 months ago

magic_lobster_party@fedia.io · 2 months ago

Not a fan of this guy. He’s dead set on the idea that AI won’t progress at all in the near future.

You can argue whether AI is progressing faster than the Manhattan project or not, but these things are true:

AI has progressed fast
We have no idea how it works
We have no idea how fast it will progress in the near future.

Think about where AI was 10 years ago. Cutting edge AI was able to accept a video and put bounding boxes around a predefined set of objects it was able to recognize (see the You Only Look Once paper, 2015). That was about it.

10 years before that, cutting edge AI was maybe digit recognition. I’m not sure.

Today, cutting edge AI goes far beyond that. Just imagine where it might be in another 10 years. I think it’s frightening, considering how much AI slop we’re enduring today.

very_well_lost@lemmy.world · 2 months ago

We have no idea how it works

I’m so sick of seeing this bullshit.

You may not know how it works, and the AI industry probably wants you to think that no one knows how it works, but it’s just not true.

Generative pre-trained transformers are well understood, well documented, and there’s no shortage of resources freely available online to teach you how they work. Ditto for other advanced AI systems.

They are complex, sure, but they’re not inscrutable. Saying that no one knows how AI works is like saying no one knows how the weather works — which again, is simply not true. Weather is complicated and its behavior is hard to predict because of the massive number of variables involved, but we know how it works at a fundamental level. It’s not magic, it’s not angels bowling or whatever.

AI is just software, and we know how it fucking works.

magic_lobster_party@fedia.io · 2 months ago

We know how each individual part works. That’s just basic math.

We don’t know for sure how all trillion parts together produce the results they do. You can’t debug the model step by step to see how the prompt “generate image of a penguin” produces an image of a penguin and not a polar bear. That’s what people mean by “we don’t know how AI works”.
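
To make the “basic math” part concrete, here is a minimal sketch of one artificial neuron and a tiny layer built from it. The weights, inputs, and the choice of a sigmoid activation are made up for illustration and are not taken from any real model; the point is only that each part is a weighted sum followed by a simple function, while a real model chains billions of these.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs plus a bias, then a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is just several neurons looking at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical toy weights, chosen by hand purely for illustration.
hidden = layer([1.0, 0.0], [[0.5, -0.5], [-1.0, 1.0]], [0.0, 0.0])
output = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, output)
```

Every step here can be checked by hand. What gets hard is explaining why a particular set of billions of learned weights produces a particular behavior.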

very_well_lost@lemmy.world · 2 months ago

Okay, but who cares? “Complex systems are difficult to predict” is a mathematical insight that’s like 2 centuries old at this point… and it hasn’t hindered us at all from gaining deep insights into how both individual complex systems work and how complex systems as a general class of phenomena work. I can’t keep track of all the masses and velocities of every individual air molecule in the room I’m sitting in, but I still know how the interactions of those particles give rise to the temperature and air pressure and general behavior of the atmosphere in the room.

People know how this shit works, and anyone telling you otherwise is either willfully ignorant or intentionally lying to you to feed a hype cycle with an end goal of making your life worse. People can’t afford to remain uneducated about this stuff anymore.
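
The air-molecule point can be sketched in a few lines. This is a toy illustration under stated assumptions (an ideal gas of nitrogen molecules, velocity components drawn from a normal distribution, an approximate molecular mass), not a physical simulation: nobody tracks any individual molecule, yet the temperature falls out of the average kinetic energy via (3/2)·k·T = mean of (1/2)·m·v².

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K
MASS_N2 = 4.65e-26   # approximate mass of one N2 molecule, kg

def sample_velocities(n, temperature, rng):
    """Draw n random 3D velocities for an ideal gas at the given temperature."""
    sigma = math.sqrt(K_B * temperature / MASS_N2)  # spread of each velocity component
    return [(rng.gauss(0, sigma), rng.gauss(0, sigma), rng.gauss(0, sigma))
            for _ in range(n)]

def temperature_from_velocities(velocities):
    """Recover temperature from the mean kinetic energy: T = 2 * <KE> / (3 * k)."""
    mean_ke = sum(0.5 * MASS_N2 * (vx * vx + vy * vy + vz * vz)
                  for vx, vy, vz in velocities) / len(velocities)
    return 2.0 * mean_ke / (3.0 * K_B)

rng = random.Random(0)  # fixed seed so the run is repeatable
velocities = sample_velocities(50_000, 293.0, rng)
estimate = temperature_from_velocities(velocities)
print(f"estimated temperature: {estimate:.1f} K")
```

The individual velocities are unpredictable, but the aggregate quantity is stable and well understood.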

magic_lobster_party@fedia.io · 2 months ago

What’s interesting is how these complex models produce anything useful at all. We could very well have complex models that don’t produce anything other than random noise.

Prunebutt@slrpnk.net · 2 months ago

The reason why “we” have these models is because they were deliberately trained not to output random noise. That part is well understood.

The only reason why we don’t know what exactly makes the model output an image of Garfield with boobs is the amount of data to sift through. Not because we don’t understand the processes.

magic_lobster_party@fedia.io · 2 months ago

Generalization is not a given. It’s possible to make complex models that perfectly memorize 100% of the training data, but produce garbage results if the input diverges ever so slightly from the training data.

This generalization is a process that’s not fully understood. Earlier architectures struggled with this level of generalization, but transformers seem to handle it well.
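
A toy sketch of that difference, with made-up data and two deliberately simple “models” (neither is a neural network): a lookup table that memorizes the training set perfectly, and a least-squares line that captures the underlying rule. The rule y = 2x + 1 and all names here are assumptions for illustration only.

```python
def train_lookup(xs, ys):
    """Pure memorization: perfect on seen inputs, an arbitrary answer otherwise."""
    table = dict(zip(xs, ys))
    return lambda x: table.get(x, 0.0)  # unseen input -> fallback value of 0.0

def train_line(xs, ys):
    """Ordinary least-squares fit of a straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mean_abs_error(model, xs):
    """Average error against the (hypothetical) true rule y = 2x + 1."""
    return sum(abs(model(x) - (2 * x + 1)) for x in xs) / len(xs)

train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [2 * x + 1 for x in train_x]
test_x = [0.5, 1.5, 2.5, 3.5]  # inputs that diverge slightly from the training set

memorizer = train_lookup(train_x, train_y)
line = train_line(train_x, train_y)

print("train error:", mean_abs_error(memorizer, train_x), mean_abs_error(line, train_x))
print("test error: ", mean_abs_error(memorizer, test_x), mean_abs_error(line, test_x))
```

Both models score perfectly on the training data; only the line keeps working on inputs it has not seen.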

Prunebutt@slrpnk.net · 2 months ago

Not overfitting is hard, yes. But it’s not “we have no idea how/why this works”-hard.

Valmond@lemmy.world · 2 months ago

That goes for Windows 11 too, and still we know how computers work.

magic_lobster_party@fedia.io · 2 months ago

Windows 11 is programmed by Microsoft engineers. I’m sure they have a good idea how it works. When you click a button, you get predictable results.

Neural networks are a different story. It’s difficult to predict what’s going to happen for a given prompt, and how adjustments to the weights affect the results.

There’s an article from last year where they found a “Golden Gate” neuron in Claude. Changing it to be always on caused the model to always mention the Golden Gate Bridge in its responses. How and why this works is AFAIK not fully understood. For some reason the model managed to generalize the concept of the Golden Gate Bridge into one single neuron.

Valmond@lemmy.world · 2 months ago

What a cute thought!

No one knows how “everything” works in old monolithic software. You just have to try and see what happens, and often you just don’t touch certain codebases because nobody really knows the ramifications if you change something in them. Windows 11 is probably way worse than any LLM. Try to share a simple folder on a simple home network and you’ll see some of the cruft.

Source: I have worked on 30-40 year old monolithic software. In not one of those projects was there a single “engineer” who knew it all.

Neural networks have their fuzzy part of course, but software stopped being fully understandable a long time ago. IMO.

magic_lobster_party@fedia.io · 2 months ago

Of course, no single person fully understands the entirety of Windows. But I hope the people working with Windows understand at least a part of it.

The thing with LLMs is that no one really understands the purpose of one single neuron, how it relates to all other neurons, and how they together seem to be able to generalize high level concepts like the Golden Gate Bridge. It’s just too much to map out.

Valmond@lemmy.world · 2 months ago

We do know how a single “neuron” relates to other neurons, it’s in the model. But what gets complicated is the vast amount of them, of course.

So yes, we don’t intrinsically get to understand it all, but I think we can understand what it does, a bit like Windows 😁/j.

Fascinating subject, and we’re just scratching the surface IMO.

arnitbier@sh.itjust.works · edit-2 2 months ago

So the general understanding seems to be that LLMs, which are almost comprehensively understood, are not the same as artificial intelligence, which is really only conceptually understood for the most part. It’s still too new and not fully tested, like chemistry when it was still being worked out. We definitely know some of how it might work, but MOST of it will be developed and debated over a long period still to come. So it’s extremely fair to say we do not know how long that will take or how fast it will develop, because we don’t have enough information to establish that yet. That includes not having the minutiae of how the DEVELOPED systems truly operate, which is what most people are taught about these days and (I think) what they were pointing out.

So on that note, it seems worth bringing up the bigger problem here that’s pissing people off so much: the TERMS used to describe the issue. The lack of a concrete, agreed-upon understanding of the subject we’re even discussing makes this tough to get through without everybody being wrong in some capacity or another.

So you are def kinda wrong. They might be too, but I don’t think they really are. And to some degree, people in the damn field of AI right fucking now will be wrong too.

So, grace, brother. Remember the learning process, and if your goal is to educate, there are more effective ways than that. But also please keep participating, remembering that most people here are simply trying to add their relevant experience and should be treated as such.

Edit: Y’all are… AGI doesn’t exist, we don’t know how to make it, and we don’t know even slightly how fast ANI is going to develop. Then there’s how machine-learning so-called AI is NOT generally considered AI by the actual developed standards in the field of AI. Just because they UNDERSTAND how machine learning works really well, and have some blueprints for what he calls “advanced AI systems” (which are still just machine learning systems), doesn’t suddenly change any of this.

We don’t know how intelligence works. We can’t know how AI works yet. We know the system does this if we do that. It’s reminiscent of the early days of chemistry.

So the problem HERE becomes that AI, AI and AI all mean different things to different people, even in the field of AI, which also means something special. And to quote OP here: “I’m so fucking sick of seeing that bullshit” 💀

Stop bullshitting; you know what people are talking about in a field as brand spanking new (read: underdeveloped) as this. But people get your frustration with this one thing, even if they are totally wrong about it.

beetus@lemmy.world · edit-2 2 months ago

Think about where AI was 10 years ago. Cutting edge AI was able to accept a video and put bounding boxes around a predefined set of objects it was able to recognize (see the You Only Look Once paper, 2015). That was about it.

10 years before that, cutting edge AI was maybe digit recognition. I’m not sure.

Your own understanding of history is incorrect https://en.wikipedia.org/wiki/Timeline_of_artificial_intelligence?wprov=sfla1

I have no comment on this vid or SciShow, but your comment is ignorant of the reality of progress in AI-related fields. Having computers “learn” has been a thing for much longer than 20 years.

magic_lobster_party@fedia.io · 2 months ago

Link me one AI model that was able to track cars, cats, bicycles and many other objects all in real time before YOLO, and correct me where the state of cutting edge image processing was in 2005.

Today we have Coca Cola thinking it’s a good idea to make an AI slop Christmas commercial for the second year in a row. This was unimaginable 5 years ago.

Son_of_Macha@lemmy.cafe · 2 months ago

You just need to show us that identification actually working. None of it does; the mistakes are terrible.

GraveyardOrbit@lemmy.zip · 2 months ago

deleted by creator

magic_lobster_party@fedia.io · 2 months ago

Transformers are based on backpropagation, which was invented in the 1970s. Through that lens, backpropagation has seen a lot of evolution through the years. Transformers are just one stepping stone in that journey, and they likely won’t be the last.

GraveyardOrbit@lemmy.zip · 2 months ago

deleted by creator

magic_lobster_party@fedia.io · 2 months ago

There will definitely be new backpropagation based architectures in the future. It won’t stop at transformers, just like how convolutional neural networks weren’t the end of the line.

If you’re only looking in the context of transformer models, then sure it might look like a plateau. But that’s a very narrow timeframe to look at. If we instead use 1970s as a timeframe, then things are progressing very fast. Almost overnight the internet is polluted by AI slop shit and people are having AI slop girlfriends. Things are going to get worse.

GraveyardOrbit@lemmy.zip · 2 months ago

deleted by creator

magic_lobster_party@fedia.io · 2 months ago

Educate me. Why are transformers the end of the line?

GraveyardOrbit@lemmy.zip · 2 months ago

deleted by creator

magic_lobster_party@fedia.io · 2 months ago

I’ve never mentioned AGI. Neither did SciShow, IIRC.

AGI or not, it’s going to get a lot worse.

brucethemoose@lemmy.world · edit-2 2 months ago

Transformers have plateaued. Hence the pursuit of alternative architectures.
We know how it works. It’s built and rebuilt from scratch, it’s one of the most heavily studied systems on the planet. The research is open.
Scaling has plateaued, hence we have a pretty good trajectory for LLMs specifically; towards increasingly efficient tool use. It’s clear that “AGI” research will go down a different path.

See this interview from a GLM dev for a more grounded take on what the labs are feeling now:

https://www.chinatalk.media/p/the-zai-playbook

https://m.youtube.com/watch?v=Q0TXO8BBqhE

You make a good point in how much the applications change each decade. What we have 10 years from now will be unreal… That being said, I think a lot of past gains were facilitated by picking low hanging hardware/framework fruit.

In 2005, we had Pentium 4s.

In 2015, researchers were hacking stuff onto GTX 780s with doubled up VRAM, no specialized blocks, frankly primitive tooling/APIs and few libraries. PyTorch was a shell of its current self.

In 2025, we have now scaled up to massive interconnects and dedicated datacenter accelerators with mature software frameworks with tons of libraries. We have wafer sized inference accelerators and NPUs for deployment.

But shrinks are slowing, we’ve already scaled up past diminishing returns. In 2035… I don’t see the scale or software environment being significantly different? It seemed like bitnet was going to change everything for a hot moment (turning expensive matmuls into addition, and blowing up the whole software/ASIC pipe), but that hasn’t panned out.
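
For anyone wondering what “turning matmuls into addition” means, here is a rough sketch assuming BitNet-style ternary weights restricted to -1, 0, and +1. The function names and toy numbers are made up for illustration, and this is plain Python, not how any real kernel or ASIC implements it. With weights limited to those three values, every multiply in a dot product collapses into an add, a subtract, or a skip.

```python
def dot_with_multiplies(weights, activations):
    """Conventional dot product: one multiply per weight."""
    return sum(w * a for w, a in zip(weights, activations))

def dot_ternary(weights, activations):
    """Same result for weights in {-1, 0, +1}, using only adds and subtracts."""
    total = 0.0
    for w, a in zip(weights, activations):
        if w == 1:
            total += a
        elif w == -1:
            total -= a
        # w == 0 contributes nothing, so it is skipped entirely
    return total

def matvec_ternary(weight_rows, activations):
    """Matrix-vector product built from the multiply-free dot product."""
    return [dot_ternary(row, activations) for row in weight_rows]

# Hypothetical toy values for illustration only.
weights = [[1, 0, -1, 1], [-1, -1, 0, 1]]
activations = [0.5, 2.0, -1.5, 3.0]
result = matvec_ternary(weights, activations)
print(result)
```

The appeal is that adders are much cheaper in silicon than multipliers, which is why it looked like it could upend the hardware pipeline.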