Although they do not open up the weights generally, they're happy to work with us so that we can find vulnerabilities. For example, exfiltration vulnerabilities, like convincing a large language model to steal its own model file, which is just 100 gigabytes or so, and move it somewhere else. This is actually one avenue of red teaming that we're testing.
