For example, she would need to hear the sounds and identify which part in the mind map that I just showed you is this person talking about. That requires semantic analysis and very simple things, just speech recognition and gaze recognition. When many people are talking together, they need to figure out the attention, like the gaze of the various participants around, and then to handle the group dynamic.

Keyboard shortcuts

j previous speech k next speech