Yeah. And I imagine there’s some extra mechanism design needed to get technical coherence in a domain where you can’t get a clear reward signal, because you can’t check whether your plan will work without testing the thing, and if you test it and you got it wrong, whoops.
