BlueMonday1984@awful.systems to TechTakes@awful.systems · English · 5 days ago
Facebook Pushes Its Llama 4 AI Model to the Right, Wants to Present “Both Sides” [404 Media] (www.404media.co)
cross-posted to: technology@lemmy.world, BoycottUnitedStates@europe.pub, technology@lemmit.online
corbin@awful.systems · English · 3 days ago
It’s well-known folklore that reinforcement learning with human feedback (RLHF), the standard post-training paradigm, reduces “alignment,” the degree to which a pre-trained model has learned features of reality as it actually exists. Quoting from the abstract of the 2024 paper, Mitigating the Alignment Tax of RLHF (alternate link):

LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting pretrained abilities, which is also known as the alignment tax.
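As a rough illustration of what that “tax” means in practice, here is a minimal sketch in Python: score the same capability benchmark before and after RLHF and report the drop. The model names and the evaluate() callable are hypothetical placeholders, not the paper’s actual evaluation setup.

```python
# Minimal sketch: the "alignment tax" as a before/after benchmark gap.
# The evaluate callable and model names are hypothetical placeholders,
# not the evaluation protocol used in the cited paper.

from typing import Callable


def alignment_tax(evaluate: Callable[[str], float],
                  base_model: str, rlhf_model: str) -> float:
    """Drop in benchmark score after post-training (positive = ability forgotten)."""
    return evaluate(base_model) - evaluate(rlhf_model)


if __name__ == "__main__":
    # Dummy scores purely for illustration; real numbers would come from an eval harness.
    scores = {"llm-base": 0.72, "llm-rlhf": 0.65}
    tax = alignment_tax(lambda m: scores[m], "llm-base", "llm-rlhf")
    print(f"Alignment tax on this benchmark: {tax:.2f}")  # 0.07
```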