Perception Assistant — Test Client

No image — upload or capture to set visual context

Idle

Silence threshold: 700 ms

Energy threshold: 8

Streaming ASR: Off
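
As an illustration of how a silence threshold and an energy threshold like the ones above could work together to decide when an utterance has ended, here is a minimal sketch. It is not the client's actual implementation: the `Endpointer` class, the 20 ms frame length, and the energy scale are all assumptions made for the example. Only the default values 700 and 8 come from the settings shown.

```python
from dataclasses import dataclass


@dataclass
class Endpointer:
    """Hypothetical energy-based end-of-utterance detector (illustrative only)."""

    silence_ms: int = 700           # mirrors "Silence threshold 700ms"
    energy_threshold: float = 8.0   # mirrors "Energy threshold 8"; the scale is assumed
    frame_ms: int = 20              # assumed duration of one audio frame
    _in_speech: bool = False        # True once a loud frame has been seen
    _quiet_ms: int = 0              # accumulated silence since the last loud frame

    def feed(self, energy: float) -> bool:
        """Consume one frame's energy; return True when an utterance just ended."""
        if energy >= self.energy_threshold:
            # Loud frame: we are in speech, so reset the silence counter.
            self._in_speech = True
            self._quiet_ms = 0
            return False
        if not self._in_speech:
            # Silence before any speech never triggers a send.
            return False
        self._quiet_ms += self.frame_ms
        if self._quiet_ms >= self.silence_ms:
            # Enough trailing silence: close the utterance and reset state.
            self._in_speech = False
            self._quiet_ms = 0
            return True
        return False


# Usage: 5 loud frames followed by 35 quiet frames (35 * 20 ms = 700 ms).
ep = Endpointer()
results = [ep.feed(20.0) for _ in range(5)] + [ep.feed(1.0) for _ in range(35)]
ended_at = results.index(True)  # index of the frame that closed the utterance
```

Under these assumptions, a lower silence threshold would close utterances sooner at the risk of cutting off pauses mid-sentence, while a higher energy threshold would ignore more background noise at the risk of missing quiet speech.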

In listening mode, utterances are sent automatically. Use manual send for image-only or location-only sends.

Start listening and speak — context frames will appear as you talk.