Voice (STT & TTS)
OpenHuman has a voice layer so the agent can read text aloud and you can speak instead of typing.
Speech-to-Text (STT)
- Captures audio from your microphone on demand
- Streams the captured audio to the backend for transcription
- Supports multiple languages
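The capture-and-stream flow above can be pictured as splitting microphone samples into small frames and handing each one to a transport as soon as it is ready. The sketch below is illustrative only: `Frame`, `splitIntoFrames`, `streamForTranscription`, and the `send` callback are hypothetical names, not OpenHuman's actual API, and the real wire protocol is not described on this page.

```typescript
// Hypothetical sketch: none of these names come from OpenHuman itself.
type Frame = { seq: number; samples: Float32Array };

// Split captured PCM samples into fixed-size frames.
// The final frame may be shorter than frameSize.
function splitIntoFrames(pcm: Float32Array, frameSize: number): Frame[] {
  if (!Number.isInteger(frameSize) || frameSize <= 0) {
    throw new RangeError("frameSize must be a positive integer");
  }
  const frames: Frame[] = [];
  for (let start = 0; start < pcm.length; start += frameSize) {
    // subarray creates a view over the same memory, so nothing is copied.
    frames.push({
      seq: frames.length,
      samples: pcm.subarray(start, start + frameSize),
    });
  }
  return frames;
}

// Hand each frame to a caller-supplied transport (assumed here to wrap
// something like a WebSocket). Returns the number of frames sent.
function streamForTranscription(
  pcm: Float32Array,
  frameSize: number,
  send: (frame: Frame) => void,
): number {
  const frames = splitIntoFrames(pcm, frameSize);
  for (let i = 0; i < frames.length; i++) {
    send(frames[i]);
  }
  return frames.length;
}
```

Small frames keep latency low because transcription can begin before you finish speaking; the frame size itself is a tuning choice and is assumed here, not documented.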
Text-to-Speech (TTS)
- Streams generated audio directly to your speakers
- Audio is not stored: it is generated, played, and discarded
- Supports multiple voices
Voice settings
From Settings → Voice:
- Microphone - select input device
- Voice model - choose a voice profile
- Language - STT language preference
- Wake word - optional "Hey OpenHuman" activation (default off)
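As a rough mental model, the four settings above map onto a small configuration object. The shape below is a sketch under stated assumptions: the field names, the `"default"` voice identifier, and the `"en-US"` language tag are all hypothetical and do not reflect OpenHuman's real settings schema. Only the wake-word default (off) comes from this page.

```typescript
// Hypothetical shape; field names are illustrative, not OpenHuman's schema.
interface VoiceSettings {
  microphoneDeviceId: string | null; // null = system default input (assumption)
  voiceModel: string;                // voice profile identifier (assumption)
  sttLanguage: string;               // language tag such as "en-US" (assumption)
  wakeWordEnabled: boolean;          // "Hey OpenHuman" activation
}

const DEFAULT_VOICE_SETTINGS: VoiceSettings = {
  microphoneDeviceId: null,
  voiceModel: "default",
  sttLanguage: "en-US",
  wakeWordEnabled: false, // this page states the wake word is off by default
};

// Merge user choices over the defaults without mutating either object.
function withOverrides(overrides: Partial<VoiceSettings>): VoiceSettings {
  return { ...DEFAULT_VOICE_SETTINGS, ...overrides };
}
```

For example, `withOverrides({ sttLanguage: "de-DE" })` would change only the STT language and leave the wake word disabled.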
Privacy note
Audio buffers are processed locally and not written to disk. See Privacy & Security.
See also
- Privacy & Security - Audio data handling
- Local AI - Optional on-device voice processing