Presumably, it would mean "when anything other than that secondary processor comes on," i.e., when the device stops discarding the buffer containing your speech at the hardware level and starts instead feeding it through its local parsers and on to the cloud, in a way that could result in information from your speech being captured.
That would require, though, that it's not buffering the last N seconds of audio to reprocess once the rest of the system wakes up. Do any (or all) of the modern smart speakers do that? If so, then you'd have to take into account that by the time you see the light, you've potentially already leaked any secrets you said in the last N seconds as well. Less like a reporter coming in and asking to speak to you; more like an eavesdropper coming in and telling you they heard what you were just saying through the door.
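To make the worry concrete, here is a minimal sketch of what such a "pre-roll" buffer could look like. This is purely hypothetical: the `PreRollBuffer` class, its parameters, and the frame strings are made up for illustration, and I'm not claiming any actual device works this way.

```python
from collections import deque


class PreRollBuffer:
    """Hypothetical model of a device that keeps the last N seconds of audio."""

    def __init__(self, seconds, frames_per_second):
        # Bounded buffer: the oldest frames fall off the front automatically.
        self.frames = deque(maxlen=seconds * frames_per_second)
        self.awake = False
        self.captured = []  # everything that gets past the hardware stage

    def on_frame(self, frame, wake_word_detected=False):
        if self.awake:
            # Light is on: every frame is passed along.
            self.captured.append(frame)
            return
        self.frames.append(frame)
        if wake_word_detected:
            # The moment the light comes on, the buffered past goes along too.
            self.awake = True
            self.captured.extend(self.frames)
            self.frames.clear()


buf = PreRollBuffer(seconds=3, frames_per_second=1)
for frame in ["a", "b", "secret-1", "secret-2"]:
    buf.on_frame(frame)
buf.on_frame("wake", wake_word_detected=True)
buf.on_frame("command")
print(buf.captured)
```

With a 3-frame buffer, the two "secret" frames spoken just before the wake word end up in `captured` along with the command, which is exactly the eavesdropper-at-the-door case. With no buffer at all, only what you say after the light would be passed along.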