AI = bad, I know, but do people order “water, please” instead of “a glass of water, please”?
Unless you want to end up with an expensive bottle of French water instead of a single glass of tap water.
I tried using Cursor IDE and Claude Sonnet 4 to make an extension for Blender. It keeps reaching the exact same point of development (super basic functions) and then breaking things whenever I try to get it to fine-tune what I need done… This comic is accurate af.
Haven’t used any coding LLMs. I honestly have no clue about the accuracy of the comic. Can anyone enlighten me?
Yeah kinda. I ask it to do something simple like create a TypeScript interface for some JSON and it just gives me what I want… most of the time.
Other times it will explain to me what JSON is, what TypeScript is, what interfaces are and how they’re used, blah blah, and somewhere in there there’s the code I actually wanted. Once it helpfully commented the code… in Korean. Even when it works and comments things in English the comments can be kinda useless since it doesn’t actually know what I’m doing.
It’s trying to give you what you want but can sometimes get confused about what you’re asking for and give a bunch of stuff you didn’t actually want. So yeah, the comic is accurate… on occasion. But many times LLMs will give good results, and it’s getting better, so it’ll mostly work ok for simple requests. But yeah, sometimes it’ll give you a lot more stuff than what you wanted.
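For anyone who hasn't tried this, the "TypeScript interface for some JSON" request looks roughly like the sketch below. The payload, the `Asset` interface, and the `isAsset` guard are all made up for illustration; they are not from any real project or from an actual LLM response.

```typescript
// Hypothetical JSON payload, invented purely for illustration.
const raw = '{"id": 7, "name": "Suzanne", "tags": ["mesh", "demo"]}';

// The interface one would ask the assistant to write for that payload.
interface Asset {
  id: number;
  name: string;
  tags: string[];
}

// JSON.parse returns `any`, so a runtime guard checks the shape
// before the value is treated as an Asset.
function isAsset(value: unknown): value is Asset {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "number" &&
    typeof v.name === "string" &&
    Array.isArray(v.tags) &&
    v.tags.every((t) => typeof t === "string")
  );
}

const parsed: unknown = JSON.parse(raw);
// `asset` is null when the payload does not match the interface.
const asset: Asset | null = isAsset(parsed) ? parsed : null;
```

The interface alone is usually all that gets asked for; the guard is an optional extra, since an interface disappears at compile time and cannot validate parsed JSON on its own.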
I use them frequently. They’re extremely helpful, just don’t get them to write everything.
As for the comic, it’s pretty inaccurate. The only part I find true is the too-much-water one: sometimes the bots like to take… longer methods.
From what I understand of LLMs your assessment does seem likely to me. LLMs might actually be pretty accurate when asked to do relatively simple, short tasks.
The comic is only accurate if you expect it to do everything for you, you’re bad at communicating, and you’re using an old model. Or if you’re just unlucky.
I point blank refuse to use them. I’ve seen how they’ve affected my coworker and my boss - these two people now simply cannot read documentation, do not trust their own abilities to write code, and cannot debug anything that they write. My job has become more difficult since this shit started being pushed on us all.
They’re okay for tasks that reasonably fit in a single file. I use them for simple Python scripts since they generally spit out something very similar to what I’d write, just faster. However, there is a tipping point where a task becomes too complex, they fall on their face, and it becomes faster to write the code yourself.
I’m never going to pay for AI, so I’m really just burning the AI company’s money as I do it, too.
I use LLMs to write small scripts because I’m too lazy to learn Bash, MS cmd, and regex, and so far I have not ruined anything.
they suck
The only thing I trust it with is refactoring for readability and writing scripts. But I also despise LLMs, so that’s all I’d give them.
Much like Amazon has an incentive to not show you the specific thing it knows you’re searching for, people theorize that these interfaces are designed to burn through your tokens.
I doubt that’s the case, currently.
Right now, there’s a lot of genuine competition in the AI space, so they’re actually trying to outcompete one another for market share. It’s only once users are locked into using a particular service that they begin deliberate enshittification with the purpose of getting more money, either from paying for tokens, or like Google did when it deliberately made search quality worse so people would see more ads (“What are you gonna do, go to Bing?”).
By contrast, if ChatGPT sucks, you can locally host a model, use one from Anthropic or Perplexity, or use any number of interfaces for open source (or at least, source-available) models like DeepSeek, Llama, or Qwen, etc.
It’s only once industry consolidation really starts taking place that we’ll see things like deliberate measures to make people either spend more on tokens, or make money from things like injecting ads into responses.
“if ChatGPT sucks”
Most people don’t know anything beyond ChatGPT and Copilot.
If we are talking programmers, maybe include Claude, Gemini, DeepSeek and Perplexity search, though this is not always true.
…Point being, OpenAI does have a short term ‘default’ and known brand advantage, unfortunately.
That being said, there’s absolutely manipulation of LLMs, though not what OP is thinking per se. I see more of:
- Benchmaxxing with a huge sycophancy bias (which works particularly well in LM Arena).
- Benchmaxxing with massive thinking blocks, which is what OP is getting at. I’ve found Qwen is particularly prone to this, and it does drive up costs.
- Token laziness from some of OpenAI’s older models, as if they were trained to give short responses to save GPU time.
- “Deep Frying” models for narrow tasks (coding, GPQA style trivia, math, things like that) but making them worse outside of that, especially at long context.
- …Straight up cheating by training on benchmark test sets.
- Safety training to a ridiculous extent with stuff like Microsoft Phi, OpenAI, Claude, and such, for political reasons and to avoid bad PR.
In addition, ‘free’ chat UIs are geared for gathering data they can use to train on.
You’re right that there isn’t much like ad injection or deliberate token padding yet, but still.
I think it’s more about extracting money from normies, not someone savvy enough to run a model locally. And I don’t know if they do or don’t, I was just trying to explain the comic.
Wouldn’t a waiter AI be trained on a dataset of food orders and hence know exactly what an order of water would be by the context?
Some days it will be but other days it won’t be. Most of the time it can save me typing because it’ll do what I want. Sometimes (for similar tasks in the same context) it’ll just be completely off. Once it helpfully commented my code… in Korean.
LLMs are like a box of chocolates, you never know what you’re gonna get.
That is a good depiction
I just let it create a function in a temporary file that takes specific parameters because it always tries to scramble my project
Oh yes, give me AI assistant, I will whisper sweet nothing and it will give me the moon, your moon.