I’m curious what kinds of experiments we can run that go from here towards bridge making, where we can actually measure the model’s ability to have formed the bridge, rather than to find the most watered-down consensus.