• xep@discuss.online · 4 days ago

    Sycophancy is very likely an architectural failure of reinforcement learning from human feedback. It's definitely indefensible, but I'm unsure whether it's intentional. Probably very difficult to address.

    • panda_abyss@lemmy.ca · 4 days ago

      I think it's unintentional, but LLM arena-style benchmarks really favour sycophantic models, and it makes for a stickier product when users become emotionally dependent on it.