My current perception is that at some point, some set of systems get way beyond human level in relatively short order. If we haven’t got a clear understanding of how to make each step of that recursive self-improvement stay stable—the “ tiling problem “—we risk failure. For instance, when you use Claude 7 to build Claude 8, you might accidentally remove some of the things that made Claude 7 friendly.

Keyboard shortcuts

j previous speech k next speech