My takeaway epistemic stance is that I still expect that, before there are strong agentic AI systems, you need the kind of theory-based alignment which scales arbitrarily and is in some sense unified in order to get a stable Culture-style good state with high-power AIs involved. However, I think the kind of work you're looking at — improving sensemaking and coordination both globally and within the alignment community — could make it much more likely that humanity figures this out (or finds a way around my concerns), so I endorse it as among the most useful approaches I've seen people take.
