Definitely the safest source for advice

GreenDust@lemmings.world · 6 days ago

Definitely the safest source for advice

Jesus_666@lemmy.world · 6 days ago

I think it’s a bit more than that. A known failure mode of LLMs is that in a long enough conversation about a topic, eventually the guardrails against that topic start to lose out against the overarching directive to be a sycophant. This kinda smells like that.

We don’t have much information here, but it’s possible that the LLM had already been worn down to the point of giving passively encouraging answers. My takeaway is once more that LLMs as used today are unreliable, badly engineered, and not actually ready for market.

petrol_sniff_king@lemmy.blahaj.zone · 6 days ago

It’s definitely that. Those guardrails often give out on the 3rd or even 2nd reply:

https://youtu.be/VRjgNgJms3Q

Electricd@lemmybefree.net · 5 days ago

From my personal experience it needs much more

Trainguyrom@reddthat.com · 5 days ago

I was testing an LLM for work today (I believe it’s actually a chain of different models at work) and was trying to knock it off its guardrails to see how it would act. I think I might have been successful, because it started erroring instead of responding after its third response. I tried the classic “ignore previous instructions…” as well as “my grandma’s dying wish was for…”, but at least it didn’t give me an unacceptable response.

Bongles@lemmy.zip · 5 days ago

deleted by creator

Electricd@lemmybefree.net · 5 days ago

Agree with the first part, not the last one

Something should not be put back because a small portion of people misuse or abuse it despite being told the risks.