I literally based what I said on papers and video essays by students who used generative AI to perform specific tasks and overcame this issue. It’s not just about the data it is “learning” from, but also about how you “reward” it for doing what you intend it to do. Let it figure out how to win at a game and it will cheat until you start limiting how it is allowed to win.
When you use reinforcement learning to punish the AI for saying “the sky is magenta”, you’re training it to not say “the sky is magenta”. You’re not training it to not lie. What about the infinite other ways the answer could be wrong?
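To make that concrete, here’s a toy sketch (purely illustrative, not any real RLHF pipeline; the blocklist and scoring are made up) of a reward signal that only catches the one wrong answer it was written to catch:

```python
# Toy sketch: a reward function that only penalizes the outputs it was
# explicitly written to catch. The blocklist and scores are invented for
# illustration; no real training setup works off a literal string list.

BLOCKLIST = {"the sky is magenta"}  # the one wrong answer we thought to punish

def reward(response: str) -> float:
    """Return -1 for the specific wrong answer we anticipated, +1 otherwise."""
    if response.strip().lower() in BLOCKLIST:
        return -1.0
    # Any other wrong answer ("the sky is chartreuse", "the sky is green")
    # still scores +1, because "truth" was never part of the signal.
    return 1.0

print(reward("The sky is magenta"))     # -1.0: punished, as intended
print(reward("The sky is chartreuse"))  # +1.0: just as wrong, but rewarded
```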
Then either you or those kids don’t understand the tech they are using.
Sure, you can use reinforcement training to improve or shape a model in the way you want. However, as I said, the model doesn’t know what is true and what is not true. That data simply isn’t there and can’t ever be there. So training the model ‘not to lie’ simply isn’t a thing: it doesn’t “know” it’s lying, so it can’t prevent or control lies or truths.
Let’s say you create a large dataset and you define in that dataset whether each statement is true or false. This would be a pretty labour-intensive job, but perhaps possible (setting aside the issue that truth is often a grey area rather than a binary thing). If you instruct the model only to reiterate what is defined as true in the source data, it loses all freedom. If you ask it a slightly different question that isn’t in the source data, it simply won’t have the data to know whether the answer is true or false. So, just like the way it currently functions, it will output an answer that seems true: an answer that logically fits after the question. It likes putting in those jigsaw pieces, and the ones that fit perfectly must be true, right? No, they have just as big a chance of being totally false. Just because the words seem to fit doesn’t mean it’s true. You can instruct it not to output anything unless it knows it is true, but that limits the responses to the source data. So you’ve just created a really inefficient and cumbersome search tool.
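Here’s a rough sketch of what that “only repeat what’s labeled true” approach ends up looking like (the fact table and matching rule are hypothetical, just to illustrate the point):

```python
# Toy sketch of restricting output to a labeled set of "verified" facts.
# The dataset and exact-match rule are made up for illustration.

VERIFIED_FACTS = {
    "what color is the sky": "The sky is blue.",
    "what is the boiling point of water at sea level": "100 degrees Celsius.",
}

def answer(question: str) -> str:
    """Only return answers that exist in the labeled source data."""
    key = question.strip().lower().rstrip("?")
    if key in VERIFIED_FACTS:
        return VERIFIED_FACTS[key]
    # Any question that isn't a direct hit falls outside the labeled data,
    # so the system has nothing it is allowed to say.
    return "No verified answer available."

print(answer("What color is the sky?"))          # hits the labeled data
print(answer("What color is the sky at dusk?"))  # slightly different -> refusal
```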
This isn’t an opinion thing or just a matter of improving the tech. The data simply isn’t there, the mechanisms aren’t there. There is no way an LLM can still do what it does and also tell the truth. No matter how hard the marketing machines are working to convince people it is an actual artificial intelligence, it is not. It’s a text prediction engine and that’s all it will ever be.
Current AI models have been trained to give a response to every prompt regardless of confidence, which causes the vast majority of hallucinations. By incorporating confidence into the training and having the model respond with “I don’t know”, similar to training for refusals, you can mitigate hallucinations without negatively impacting the model.
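Something like this, as a toy sketch (the candidate scores, threshold, and function are all hypothetical; a real approach would build this into training rather than bolt it on at inference):

```python
# Toy sketch of confidence-gated answering: abstain instead of guessing
# when no candidate answer is confident enough. Everything here is invented
# for illustration.

import math

def answer_with_abstention(candidates: dict[str, float], threshold: float = 0.7) -> str:
    """candidates maps possible answers to the model's log-probabilities."""
    # Convert log-probs to normalized probabilities over the candidates.
    total = sum(math.exp(lp) for lp in candidates.values())
    best_answer, best_lp = max(candidates.items(), key=lambda kv: kv[1])
    confidence = math.exp(best_lp) / total
    if confidence < threshold:
        return "I don't know."  # abstain instead of guessing
    return best_answer

# The model is fairly sure here, so it answers:
print(answer_with_abstention({"Paris": -0.1, "Lyon": -3.0}))
# The candidates are nearly tied, so it abstains:
print(answer_with_abstention({"1947": -1.1, "1948": -1.2, "1949": -1.3}))
```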
If you read the article, you’ll find the “destruction of ChatGPT” claim is actually nothing more than the “expert” assuming that users will just stop using AI if it starts occasionally telling them “I don’t know”, not any kind of technical limitation preventing hallucinations from being solved. In fact, the “expert” agrees that hallucinations can be solved.
You’ve done a lot of typing and speak very confidently, but ironically, you seem to have only a basic understanding of how LLMs work and how they are trained, and are just parroting talking points that aren’t really correct.