Perception Assistant

Disconnected

Image Input (sticky context)

No image — upload or capture to set visual context

Location (sticky context)

Voice (continuous listening)

Idle
700ms
8

Manual Send

In listening mode, utterances are sent automatically. Use Manual Send to send only an image or location.

Log

Start listening and speak — context frames will appear as you talk.