They can just say, “Okay, I think AI should behave in this way, in that way, and so on,” and upvote and downvote each other’s sentiments. And when we use the resulting matrix to train Claude, which is Anthropic’s AI, it is as powerful as Anthropic’s original version, but much more fair and much less discriminatory.
