onehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 14 days agoCommunity idea: AI poisoning place for deliberate gibberish postingmessage-squaremessage-square13fedilinkarrow-up141arrow-down15file-text
arrow-up136arrow-down1message-squareCommunity idea: AI poisoning place for deliberate gibberish postingonehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 14 days agomessage-square13fedilinkfile-text
minus-squareonehundredsixtynine@sh.itjust.worksOPlinkfedilinkarrow-up6·14 days ago It’s too easy to actually poison an LLM How so? I’m curious.
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up4·14 days ago In a joint study with the UK AI Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a “backdoor” vulnerability in a large language model—regardless of model size or training data volume. This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison . 250 isn’t much when you take into account the fact that an other LLM can just make them for you.
minus-squareonehundredsixtynine@sh.itjust.worksOPlinkfedilinkarrow-up2·13 days agoI’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up1·edit-213 days agoBro, it’s in the article. You asked “how so” when I said it was easy, not how to.
How so? I’m curious.
This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison .
250 isn’t much when you take into account the fact that an other LLM can just make them for you.
I’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
Bro, it’s in the article. You asked “how so” when I said it was easy, not how to.