• vrighter@discuss.tchncs.de
    5 hours ago

    When you use reinforcement learning to punish the AI for saying “the sky is magenta”, you’re training it not to say “the sky is magenta”. You’re not training it not to lie. What about the infinite other ways the answer could be wrong, though?
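
To make that concrete, here's a toy sketch, not how any real model is trained. Everything in it is an assumption for illustration: a made-up four-answer "policy" (`ANSWERS`), a reward that only penalizes "magenta", and exact policy-gradient updates on a softmax over the answers.

```python
import math

# Hypothetical answer set for "what color is the sky?"; only "blue" is true.
ANSWERS = ["blue", "magenta", "green", "plaid"]
PENALIZED = "magenta"  # the single wrong answer the reward knows about


def softmax(logits):
    """Turn logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(answer):
    """Punish the one penalized answer; every other answer gets 0."""
    return -1.0 if answer == PENALIZED else 0.0


def train(steps=500, lr=1.0):
    """Gradient ascent on expected reward, starting from a uniform policy."""
    logits = [0.0] * len(ANSWERS)
    rewards = [reward(a) for a in ANSWERS]
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # Exact gradient for a softmax policy: dE[r]/dlogit_i = p_i * (r_i - E[r])
        logits = [
            l + lr * p * (r - expected)
            for l, p, r in zip(logits, probs, rewards)
        ]
    return dict(zip(ANSWERS, softmax(logits)))


policy = train()
for answer, prob in policy.items():
    print(f"{answer:8s} {prob:.3f}")
```

In this toy setup "magenta" gets pushed toward zero, but the freed-up probability is split evenly across everything else, so "green" and "plaid" each end up about as likely as "blue". The reward never said anything about them.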