Voice (STT & TTS)

OpenHuman has a voice layer so the agent can read text aloud and you can speak instead of type.

Speech-to-Text (STT)

Captures from your microphone on demand
Streams to the backend for transcription
Supports multiple languages
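
As a rough illustration of the capture-and-stream flow above, the sketch below splits a buffer of raw microphone audio into small fixed-size frames that could be sent one at a time. It is not OpenHuman's implementation: the function name `frame_audio`, the frame size, and the audio format (16 kHz, 16-bit mono PCM) are all assumptions chosen for the example.

```python
from typing import Iterator


def frame_audio(pcm: bytes, frame_bytes: int = 3200) -> Iterator[bytes]:
    """Yield successive fixed-size frames from a buffer of raw PCM audio.

    Hypothetical helper. The default of 3200 bytes is 100 ms of audio
    under the assumed format (16 kHz, 16-bit, mono). The final frame
    may be shorter than frame_bytes.
    """
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Sending small frames as they are captured, rather than one large recording at the end, is what lets transcription start while you are still speaking.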

Text-to-Speech (TTS)

Streams generated audio directly to your speakers
Not stored - audio is generated, played, and then discarded
Supports multiple voices
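
The "generated and discarded" behavior can be pictured as a loop that hands each audio chunk to the speaker and keeps no copy. This is a minimal sketch under stated assumptions, not the actual playback code: `play_stream` and the `write_to_speaker` callback are hypothetical names standing in for whatever audio output the app uses.

```python
from typing import Callable, Iterable


def play_stream(
    chunks: Iterable[bytes],
    write_to_speaker: Callable[[bytes], None],
) -> int:
    """Play audio chunks as they arrive and return the number of bytes played.

    Hypothetical helper. Each chunk is passed to the output callback and
    then dropped; nothing is accumulated in a list or written to a file.
    """
    total = 0
    for chunk in chunks:
        write_to_speaker(chunk)  # assumed output hook, e.g. an audio device write
        total += len(chunk)
    return total
```

Because the function only tracks a byte count, memory use stays constant however long the spoken response is.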

Voice settings

From Settings → Voice:

Microphone - select input device
Voice model - choose a voice profile
Language - STT language preference
Wake word - optional "Hey OpenHuman" activation (default off)
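
For readers who think in terms of configuration, the four settings above map naturally onto a small record. The sketch below is illustrative only: the class name, field names, and the `"default"` and `"en"` placeholder values are assumptions, not OpenHuman's actual settings schema. The one default taken from this page is that the wake word is off.

```python
from dataclasses import dataclass


@dataclass
class VoiceSettings:
    """Hypothetical model of the Settings → Voice options."""

    microphone: str = "default"      # input device (placeholder value)
    voice_model: str = "default"     # voice profile used for TTS (placeholder value)
    language: str = "en"             # STT language preference (placeholder value)
    wake_word_enabled: bool = False  # "Hey OpenHuman" activation, off by default
```

For example, `VoiceSettings(language="de", wake_word_enabled=True)` would describe a setup that transcribes German and listens for the wake word.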

Privacy note

Audio buffers are processed locally and not written to disk. See Privacy & Security.

See also

Privacy & Security - Audio data handling
Local AI - Optional on-device voice processing