A spellchecker doesn’t hallucinate new words. LLMs are not the tool for this job. At best, one might be able to take a doctor’s write-up and encode it into a different format, i.e. “here’s the list of drugs and dosages mentioned.” But if you ask it whether those drugs have adverse reactions, or any other question that has a known or fixed process for answering, then you will be better served writing code to reflect that process. LLMs are best for when you don’t care about accuracy and there is no known process that could be codified. Once you actually understand the problem you are asking it to help with, you can achieve better accuracy and efficiency by codifying the solution.
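To make “writing code to reflect that process” concrete, here is a minimal sketch of a fixed-process interaction check. Everything in it is an assumption for illustration: the drug names are placeholders, the `KNOWN_INTERACTIONS` table and the `find_interactions` helper are hypothetical, and a real system would read from a curated, clinically validated database rather than a hard-coded dict.

```python
from itertools import combinations

# Hypothetical placeholder table (NOT real clinical data). In practice this
# would be loaded from a curated, validated interaction database.
KNOWN_INTERACTIONS = {
    frozenset({"drug_a", "drug_b"}): "increased bleeding risk (placeholder)",
    frozenset({"drug_c", "drug_d"}): "reduced effectiveness (placeholder)",
}


def find_interactions(drugs):
    """Return ((drug_1, drug_2), note) for every listed pair found in the table."""
    # Normalize and de-duplicate so "Drug_A" and "drug_a " count as one drug.
    normalized = sorted({d.strip().lower() for d in drugs})
    hits = []
    for a, b in combinations(normalized, 2):
        note = KNOWN_INTERACTIONS.get(frozenset({a, b}))
        if note is not None:
            hits.append(((a, b), note))
    return hits


if __name__ == "__main__":
    for pair, note in find_interactions(["Drug_B", "drug_a", "drug_x"]):
        print(pair, "->", note)
```

The same input always produces the same answer, and a pair that is not in the table is simply not reported. Nothing gets invented.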
But doctors and nurses’ minds effectively hallucinate just the same and are prone to even the most trivial of brain farts like fumbling basic math or language slip-ups. We can’t underestimate the capacity to have the strengths of a supercomputer at least acting as a double-checker on charting, can we?
Accuracy of LLMs is largely dependent upon the learning material used, along with the rules-based (declarative language) pipeline implemented. That is little different from the quality of education a human mind receives at Trump University versus Johns Hopkins.
But doctors and nurses’ minds effectively hallucinate just the same and are prone to even the most trivial of brain farts like fumbling basic math or language slip-ups
The difference is that the practitioner can distinguish hallucination from fact, while an LLM cannot.
We can’t underestimate the capacity to have the strengths of a supercomputer at least acting as a double-checker on charting, can we?
A supercomputer is only as powerful as its programming. This is avoiding the whole “if you understand the problem then you are better off writing a program than using an LLM” point by hand-waving in the word “supercomputer”. The whole “train it better” argument doesn’t get away from this fact either.
The difference is that the practitioner can distinguish hallucination from fact, while an LLM cannot.
Sorry, what do you mean by this? Can you elaborate? Hundreds of thousands of medical errors occur annually from exhausted medical workers doing something in error and ultimately “hallucinating,” and not having caught themselves. Might, like a spellchecker, an AI have tapped them on the proverbial shoulder to alert them of such an error?
A supercomputer is only as powerful as its programming.
As a software engineer, I understand that; but aggregating large amounts of data and providing a probabilistic risk assessment at a moment’s notice simply isn’t something a single, exhausted physician’s mind can do, any more than it can calculate pi to a million digits in a second. I’m not even opposed to more specialized LLMs being deployed as a check on this, of course.
Example: I know most logical fallacies pretty well, and I’m fairly well versed in current events, US history, civics, politics, etc. But from time to time, I have an LLM analyze conversations with, say, Trump supporters to double-check not only their writing, but my own. It has pointed out fallacies in my own writing that I myself missed; it has noted deviations in facts and provided sources that, upon closer analysis, I agreed with. Such a demonstration of auditing suggests it can quite rapidly be applied to healthcare in a similar manner, with some additional training material perhaps, but under the same principle.
Since you are a software engineer, you must know the difference between deterministic software like a spellchecker and something stochastic like an LLM. You must also understand the difference between a well-defined process like a spellchecker and an undefined behavior like an LLM hallucinating. Now ask your LLM if comparing these two technologies in the way you are is a bad analogy. If the LLM says it is a good analogy, then you are prompting it wrong. The fact that we can’t agree on what an LLM should say on this matter, and that we can get it to say either outcome, demonstrates that an LLM cannot distinguish fact from fiction; rather, it makes these determinations on what is effectively a vibe check.
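A toy sketch of the deterministic versus stochastic distinction, under loud assumptions: `DICTIONARY` is a tiny stand-in word list, and `sample_next_word` is a hypothetical illustration of weighted random sampling, not how any particular LLM is actually implemented.

```python
import random

DICTIONARY = {"patient", "dosage", "chart"}  # tiny stand-in word list


def spellcheck(word):
    """Deterministic: the same word always gets the same verdict."""
    return word.lower() in DICTIONARY


def sample_next_word(candidates, weights, rng):
    """Stochastic: draws one candidate in proportion to its weight."""
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Asking the spellchecker five times gives one distinct answer.
    print({spellcheck("dosege") for _ in range(5)})

    # Asking the sampler five times with the same inputs can give different answers.
    rng = random.Random()
    print([sample_next_word(["good analogy", "bad analogy"], [0.5, 0.5], rng)
           for _ in range(5)])
```

The lookup either finds the word or it doesn’t. The sampler can return either answer for identical input, which is the behavior being objected to above.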
How about instead you provide your prompt and its response? Then you and I shall have a discussion on whether that prompt was biased and you were hallucinating when writing it, or whether the LLM was indeed at fault. Shall we?
At the end of the day, you still have not explained why it cannot simply be used as a double-checker of sorts, especially given my demonstration of its usage in conversation elsewhere and its success in a similar implementation. Ultimately, the human doctor would go, “well now, this is just absurd,” since after all, they are the expert to begin with. You following?
So, naturally, if it’s a second set of LLM eyes to double-check one’s work, either the doctor will go, “Oh wow, yes, I definitely blundered when I ordered that and was confusing charting with another patient” or “Oh wow, the AI is completely off here and I will NOT take its advice to alter my charting!”
Somewhat ironically, I get the impression that you have a particular prejudice against these emergent GPTs, and that it is in fact biasing your perception of their potential.
EDIT: Ah, just noticed my tag for you. Say no more. Have a nice day.