And there are newer methods of training language models, like “direct preference optimization”: you take a language model, show it a set of approved answers and a set of rejected answers, and it figures out the logic of what’s approved and what’s rejected. And with even newer methods like “SPIN,” you just show it the way the fact checkers do their painstaking work, and it learns from that train of thought.
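To make the “approved versus rejected” idea concrete, here is a minimal sketch of the direct preference optimization loss, assuming we already have the summed log-probabilities of the chosen and rejected answers under both the model being trained and a frozen reference copy. The function and variable names are illustrative, not from the speaker.

```python
# Minimal DPO loss sketch (names are hypothetical, not from the transcript).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct preference optimization loss for a batch of preference pairs.

    Each argument is a 1-D tensor of per-example log-probabilities
    (summed over the answer's tokens). `beta` controls how strongly the
    policy is pushed away from the reference model.
    """
    # How much more (or less) the policy favors each answer than the reference does.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Logistic loss: make the margin (chosen - rejected) large and positive,
    # i.e. learn to prefer the approved answer over the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for two preference pairs.
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.0, -9.5]),
    policy_rejected_logps=torch.tensor([-11.0, -10.2]),
    ref_chosen_logps=torch.tensor([-12.5, -9.8]),
    ref_rejected_logps=torch.tensor([-10.8, -10.0]),
)
print(loss.item())
```

The key design point is that no separate reward model is trained: the preference signal comes directly from comparing the policy’s log-probabilities against the reference model’s on each approved/rejected pair.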
