‘But there is a difference between recognising AI use and proving its use. So I tried an experiment. … I received 122 paper submissions. Of those, the Trojan horse easily identified 33 AI-generated papers. I sent these stats to all the students and gave them the opportunity to admit to using AI before they were locked into failing the class. Another 14 outed themselves. In other words, nearly 39% of the submissions were at least partially written by AI.’

Article archived: https://web.archive.org/web/20251125225915/https://www.huffingtonpost.co.uk/entry/set-trap-to-catch-students-cheating-ai_uk_691f20d1e4b00ed8a94f4c01
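
The "nearly 39%" figure in the quote can be checked from the three numbers it gives. A minimal sketch; the variable names are illustrative and not from the article:

```python
# Figures taken from the quoted passage.
submissions = 122
flagged_by_trap = 33   # caught by the "Trojan horse"
self_reported = 14     # admitted to AI use afterwards

ai_assisted = flagged_by_trap + self_reported
share = ai_assisted / submissions

print(f"{ai_assisted} of {submissions} = {share:.1%}")  # 47 of 122 = 38.5%
```

So 47 of the 122 submissions, about 38.5%, which rounds to the 39% quoted.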

  • pumpkin_spice@lemmy.today · 8 upvotes · 9 hours ago

    ‘when you make it clear to be honest’

    It has no idea what honesty is. It has no idea what bias is.

    It is fancy auto-complete. And it’s wrong so often (like 40% of the time) that it should not be used to seek out factual information that the prompter doesn’t already know.

    • definitemaybe@lemmy.ca · 3 upvotes · 8 hours ago

      ‘it should not be used to seek out factual information that the prompter doesn’t already know.’

      Eh… Depends on the importance and purpose of the information.

      If you’re just trying to generate ideas for fiction from historical precedents, it doesn’t matter if it’s accurate. It’s also fine if you’re using it as a starting point and then following the links to check the original source (like I do all the time for Linux terminal commands).

      Hell, I often use Linux terminal commands from Google’s search results AI box—I know enough to be able to parse what AI is suggesting (and identify when the proposed commands don’t make sense), and enough to undo what I’m doing if it doesn’t work. Saves a lot of time.

      Copilot fixed some SQL syntax issues I had yesterday, too. 100% accuracy on that, despite it being a massive query with about a dozen nested subqueries. (Granted, I gave a very detailed prompt…) But, again, this was low stakes: who cares if a SELECT query fails to execute?
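
The "low stakes" point about SELECT can be illustrated with a small sketch: a read-only query with a syntax error just raises an error and leaves the data untouched. This assumes SQLite through Python's built-in `sqlite3` module, which is not necessarily the database the commenter used; the `orders` table and the `run_select` helper are hypothetical.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (25.5,)])
conn.commit()


def run_select(sql):
    """Run a read-only query; return its rows, or None if it fails to execute."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        print(f"query failed: {exc}")
        return None


# Misplaced parenthesis in the subquery, the kind of slip a fix might target.
broken = "SELECT id, amount FROM orders WHERE amount > (SELECT AVG(amount FROM orders)"
fixed = "SELECT id, amount FROM orders WHERE amount > (SELECT AVG(amount) FROM orders)"

failed = run_select(broken)   # prints an error, returns None
rows = run_select(fixed)      # rows above the average amount
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rows, row_count)
```

The failed query changes nothing, so retrying with a corrected version costs only time. The same reasoning would not carry over to UPDATE or DELETE statements, where a query that does execute can still do the wrong thing.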