If an LLM can’t be trusted with a fast food order, I can’t imagine what it is reliable enough for. I really expected this to be the easy use case for these things.

It sounds like most orders still worked, so I guess we’ll see if other chains come to the same conclusion.

  • yesman@lemmy.world
    link
    fedilink
    English
    arrow-up
    108
    arrow-down
    6
    ·
    2 months ago

    This is not AI failing to do an easy job. This is “unskilled” labor doing complex and demanding work that cannot be duplicated by trillion dollar software.

    • CanadaPlus@lemmy.sdf.orgOP
      link
      fedilink
      English
      arrow-up
      21
      arrow-down
      1
      ·
      2 months ago

      I mean, unskilled just means minimal extra training is needed, not that it’s not complicated. Actual non-complicated jobs were automated last century in the West.

    • Communist@lemmy.frozeninferno.xyz
      link
      fedilink
      English
      arrow-up
      19
      arrow-down
      12
      ·
      2 months ago

      Tbh this is an incredibly easy fix: either cap the number of waters someone can order in software, or have an override where a human takes over if an order is suspicious. There’s not an infinite number of ways to fuck with this.

      • FauxLiving@lemmy.world
        link
        fedilink
        English
        arrow-up
        57
        arrow-down
        3
        ·
        2 months ago

        Capping waters fixes that one specific issue but not the problem.

        A suspicious order isn’t easy to define and no person who has ever participated in software development would underestimate the infinite ways a User can break software.

        • Link@rentadrunk.org
          link
          fedilink
          English
          arrow-up
          4
          arrow-down
          3
          ·
          2 months ago

          Surely if the person making the order sees 18,000 waters they would think, hold on this doesn’t seem right maybe I should ask the customer if they really want 18,000 waters?

          The same applies for the ice cream with bacon on it which was mentioned in the article. I believe a lot of these could be resolved with a bit of common sense.

          • FauxLiving@lemmy.world
            link
            fedilink
            English
            arrow-up
            9
            arrow-down
            1
            ·
            2 months ago

            Sure, in the most extreme cases it would be obvious to the crew. But simply making mistakes at a higher rate than humans will result in a lot of unhappy customers.

          • grue@lemmy.world
            link
            fedilink
            English
            arrow-up
            8
            ·
            2 months ago

            The same applies for the ice cream with bacon on it

            Does it, though? Unlike the 18,000 waters, if I were working a drive through I wouldn’t even blink at an order for bacon ice cream. Heck, I might make a little extra to try it for myself!

          • Evkob (they/them)@lemmy.ca
            link
            fedilink
            English
            arrow-up
            5
            ·
            2 months ago

            If you think bacon on ice cream is weird enough to cancel an order, I can only imagine you’ve never worked a customer service job.

          • Bronzebeard@lemmy.zip
            link
            fedilink
            English
            arrow-up
            3
            ·
            2 months ago

            Sure, but how do you distill this into a rule a computer can follow? “Suspicious” is not an objectively measurable thing that a program can just check against

            • TheRagingGeek@lemmy.world
              link
              fedilink
              English
              arrow-up
              1
              arrow-down
              4
              ·
              2 months ago

              I think the easiest way would be to collect order data for at least several months, if not a couple of years, feed it in, and use that as a baseline of what a typical human order looks like. Anything that deviates too far from that baseline gets handled by a human until someone can validate it as a good order. You could get false positives for new menu items, though, unless you set a reasonable rule for items that have never appeared in the dataset before.
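
              A rough sketch of that baseline idea in Python. The sample history, the function names, and the "3x the largest quantity ever seen" threshold are all made up for illustration; a real system would tune these against actual order data.

```python
from collections import defaultdict

def build_baseline(history):
    """Record the largest quantity ever seen for each item in past orders."""
    max_qty = defaultdict(int)
    for order in history:
        for item, qty in order.items():
            max_qty[item] = max(max_qty[item], qty)
    return dict(max_qty)

def needs_human(order, baseline, slack=3):
    """Return the reasons to escalate an order (empty list = looks typical)."""
    reasons = []
    for item, qty in order.items():
        if item not in baseline:
            # Items absent from the history get their own rule instead of a score.
            reasons.append(f"{item}: never seen before")
        elif qty > slack * baseline[item]:
            reasons.append(f"{item}: quantity {qty} far above baseline {baseline[item]}")
    return reasons

# Hypothetical order history: each order maps item -> quantity.
history = [
    {"taco": 2, "water": 1},
    {"burrito": 1, "water": 2},
    {"taco": 6},
]
baseline = build_baseline(history)

print(needs_human({"taco": 3, "water": 1}, baseline))
print(needs_human({"water": 18000}, baseline))
print(needs_human({"new_item": 1}, baseline))
```

              The "never seen before" branch is the special-case rule for new menu items: they go to a human instead of being scored against a baseline that doesn't exist yet.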

          • SaveTheTuaHawk@lemmy.ca
            link
            fedilink
            English
            arrow-up
            2
            ·
            2 months ago

            The same applies for the ice cream with bacon on it

            Have you never seen what Americans eat? Bacon Creaminators are excellent.

        • yetAnotherUser@discuss.tchncs.de
          link
          fedilink
          English
          arrow-up
          3
          arrow-down
          2
          ·
          2 months ago

          There are machine learning algorithms for anomaly detection though. They actually work decently well because exploits like this do in fact differ significantly from regular orders. Because they assume all anomalies are attempted exploits, their false negative rate is rather low while their false positive rate can be a bit higher.

          Taco Bell has the capability to create a decently large training set from all recorded orders (which must all be valid and non-malicious) so they shouldn’t have too many issues developing this model.

          If an anomaly is detected, make a human verify it is indeed an irregular order.

          • hark@lemmy.world
            link
            fedilink
            English
            arrow-up
            1
            ·
            2 months ago

            This is handwaving, which, to be fair, describes a lot of AI “solutions”. An anomaly could be as basic as a customer not wanting onions on their burger because the vast majority don’t make that modification.

            Now what do you do in that situation? Force orders to never have modifications? Customization is such an important feature that Burger King adopted it as a slogan: “Have it your way”.

            • yetAnotherUser@discuss.tchncs.de
              link
              fedilink
              English
              arrow-up
              1
              ·
              edit-2
              2 months ago

              The idea of anomaly detection is to project some input onto a high-dimensional numeric output. From the training data alone, you can then see where the projections are clustered and develop a high-dimensional “boundary” where everything within is known and good and everything outside is unknown and possibly bad. Since orders come in relatively slowly, a human would be able to check for false positives and override the computer’s decision.
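
              To make the "boundary" idea concrete, here is a deliberately crude sketch: each order becomes a vector of quantities, and the boundary is just a sphere around the centroid of the training orders. The menu, the training orders, and the slack factor are invented for the example, and a real model would learn a much less naive boundary than a sphere.

```python
import math

MENU = ["hamburger", "cheeseburger", "fries", "water"]  # hypothetical menu

def to_vector(order):
    """Project an order (item -> quantity) onto a fixed-length numeric vector.

    Items that are not on MENU are ignored here; they would need a separate check.
    """
    return [float(order.get(item, 0)) for item in MENU]

def fit_boundary(training_orders, slack=1.5):
    """Return (centroid, radius) enclosing every training order, with some slack."""
    vectors = [to_vector(o) for o in training_orders]
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    radius = slack * max(math.dist(v, centroid) for v in vectors)
    return centroid, radius

def is_anomaly(order, centroid, radius):
    """Orders projected outside the boundary go to a human for review."""
    return math.dist(to_vector(order), centroid) > radius

# Made-up training orders standing in for recorded valid orders.
training = [
    {"hamburger": 1, "fries": 1},
    {"cheeseburger": 2, "water": 1},
    {"hamburger": 2, "fries": 2, "water": 2},
]
centroid, radius = fit_boundary(training)

print(is_anomaly({"hamburger": 1, "fries": 1}, centroid, radius))
print(is_anomaly({"water": 18000}, centroid, radius))
```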

              By the way, an ideal training set is preprocessed: duplicates are removed, and new orders are added by recombining parts of individual orders.

              For example, if we have 3 orders:

              • (Hamburger, Fries)
              • (Hamburger, Fries)
              • (Cheeseburger, Sandwich)

              We could then create the following set:

              • (Hamburger)
              • (Cheeseburger)
              • (Fries)
              • (Sandwich)
              • (Hamburger, Fries)
              • (Hamburger, Cheeseburger)
              • (Hamburger, Sandwich)

              And so on, and so forth. A naive variant is just taking the power set of all items that appear in valid orders.
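
              The naive power-set variant can be sketched in a few lines of Python. The `augment` name is just for this example, and it treats an order as a plain set of items, ignoring quantities and modifications.

```python
from itertools import combinations

def augment(orders):
    """Deduplicate orders, then add every non-empty combination of the items seen."""
    unique_orders = {frozenset(order) for order in orders}
    items = sorted(set().union(*unique_orders))
    augmented = set(unique_orders)
    for size in range(1, len(items) + 1):
        augmented.update(frozenset(combo) for combo in combinations(items, size))
    return augmented

orders = [
    ("Hamburger", "Fries"),
    ("Hamburger", "Fries"),
    ("Cheeseburger", "Sandwich"),
]
training_set = augment(orders)
print(len(training_set))  # 15: every non-empty subset of the 4 distinct items
```

              The set doubles with every extra menu item, so the full power set is only practical for tiny menus. Anything bigger would need sampling or a cap on order size.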

              • hark@lemmy.world
                link
                fedilink
                English
                arrow-up
                1
                ·
                2 months ago

                This is more complicated than just having the available menu items, the available modifications, and the limits on quantities to compare against. This is already available through the app/online ordering.
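
                For comparison, a minimal sketch of that rule-based check in Python. The menu, the modification lists, and the per-item limits are invented for the example and are not what any real ordering app uses.

```python
# Hypothetical menu: item -> (max quantity per order, allowed modifications)
MENU = {
    "burger": (10, {"no onions", "extra cheese"}),
    "water": (10, set()),
    "ice cream": (5, {"add bacon"}),
}

def validate(order):
    """Check each (item, quantity, modifications) line; return a list of problems."""
    problems = []
    for item, qty, mods in order:
        if item not in MENU:
            problems.append(f"unknown item: {item}")
            continue
        max_qty, allowed = MENU[item]
        if not 1 <= qty <= max_qty:
            problems.append(f"{item}: quantity {qty} outside 1..{max_qty}")
        for mod in mods:
            if mod not in allowed:
                problems.append(f"{item}: unsupported modification '{mod}'")
    return problems

print(validate([("burger", 1, ["no onions"]), ("ice cream", 1, ["add bacon"])]))
print(validate([("water", 18000, [])]))
print(validate([("lizard", 1, []), ("burger", -1, [])]))
```

                An order with an empty problem list goes through; anything else can be read back to the customer or handed to a human.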

        • Communist@lemmy.frozeninferno.xyz
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          8
          ·
          2 months ago

          There is an incredibly finite number of ways to mess with this. They just need a button to send a report to the engineers with how they got messed with, and eventually they’ll have a complete list. I really doubt it’d take long to iron out the vast majority of ways that can be thought of.

          • leftzero@lemmy.dbzer0.com
            link
            fedilink
            English
            arrow-up
            19
            ·
            2 months ago

            A QA engineer walks into a bar and orders a beer.

            She orders 2 beers.

            She orders 0 beers.

            She orders -1 beers.

            She orders a lizard.

            She orders a NULLPTR.

            She tries to leave without paying.

            Satisfied, she declares the bar ready for business. The first customer comes in and orders a beer. They finish their drink, and then ask where the bathroom is.

            The bar explodes.

            • Communist@lemmy.frozeninferno.xyz
              link
              fedilink
              English
              arrow-up
              2
              arrow-down
              6
              ·
              2 months ago

              This isn’t something you can input any text into; it’s fixed. That joke doesn’t apply: you can’t do an SQL injection here.

              • hark@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                ·
                2 months ago

                I don’t know how you can think voice input is less versatile than text input, especially when a lot of voice input systems transform voice to text before processing. At least with text you get well-defined characters with a lot less variability.

                  • hark@lemmy.world
                    link
                    fedilink
                    English
                    arrow-up
                    2
                    ·
                    2 months ago

                    Special characters are just one case to cover. If the user says they want “an elephant-sized drink”, what does that mean to your system? At least that is relevant to size. Now imagine complete nonsense input like the joke you responded to (“-1 beers” or “a lizard”). SQL injection isn’t the only risk with handling inputs. The person who ordered 18,000 waters didn’t do a SQL injection attack.

              • betterdeadthanreddit@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                ·
                2 months ago

                Close one, a joke was related to but not a perfect match for the present situation. Something terrible could have happened like… Uh…

                Let me get back to you on that.

      • Brkdncr@lemmy.world
        link
        fedilink
        English
        arrow-up
        11
        arrow-down
        2
        ·
        2 months ago

        The point is that loopholes in software will always exist that lead to unexpected outcomes.

      • TriflingToad@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        4
        ·
        2 months ago

        That’s what happens 99% of the time. It’s kinda been a trend on the anti-clanker side of TikTok: just order a large amount of stuff so a human takes over and actually helps you.

      • Mac@mander.xyz
        link
        fedilink
        English
        arrow-up
        3
        ·
        2 months ago

        Why can’t a trillion dollar AI say “Sir, that’s not reasonable”?