Audrey Tang

Yes. A universal scoring function is, in effect, a theology. And like any theology that places all value in a transcendent reward, it can justify any earthly harm in pursuit of that reward. This is why utilitarian training — training toward a single abstract metric — is insufficient and even dangerous as a foundation for AI alignment.
