Local AI (optional)

OpenHuman can run a local model on your machine for workloads where keeping data on-device matters: memory embeddings, summary-tree building, background reasoning loops, and any chat or reasoning requests you explicitly route to it. Local AI is opt-in and off by default.

What runs local when you turn it on

Workload	Default model
Memory embeddings	`all-minilm:latest`
Summary-tree building	`gemma3:1b-it-qat`
Heartbeat loop	Small chat model
Learning / reflection	Small chat model
Subconscious	Small chat model

What stays in the cloud

Workload	Why
Chat	Frontier reasoning quality unless configured otherwise
Reasoning	Stronger multi-step quality
Vision	Requires more compute
STT / TTS	Backend-proxied
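
For readers who think in config, the two tables above can be restated as a single routing map. This is only an illustrative sketch: the `Route` type, the workload keys, and the `local_workloads` helper are hypothetical names, not OpenHuman's actual configuration schema. The model tags are the defaults listed above, and `None` marks workloads for which this page names no specific model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    local: bool            # True if the workload runs on-device once Local AI is on
    model: Optional[str]   # Default local model tag, when this page names one


# Defaults copied from the two tables above; the keys are hypothetical identifiers.
DEFAULT_ROUTES: Dict[str, Route] = {
    "memory_embeddings": Route(local=True, model="all-minilm:latest"),
    "summary_tree": Route(local=True, model="gemma3:1b-it-qat"),
    "heartbeat": Route(local=True, model=None),            # "small chat model"
    "learning_reflection": Route(local=True, model=None),  # "small chat model"
    "subconscious": Route(local=True, model=None),         # "small chat model"
    "chat": Route(local=False, model=None),       # cloud unless configured otherwise
    "reasoning": Route(local=False, model=None),
    "vision": Route(local=False, model=None),
    "stt_tts": Route(local=False, model=None),    # backend-proxied
}


def local_workloads(routes: Dict[str, Route] = DEFAULT_ROUTES) -> List[str]:
    """Return the names of workloads that run on-device, in table order."""
    return [name for name, route in routes.items() if route.local]
```

Calling `local_workloads()` returns the five workloads from the first table; everything else stays with the remote provider by default.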

How it works

OpenHuman supports two local provider paths:

Ollama - for bundled model lifecycle and embeddings
LM Studio - through its local OpenAI-compatible server

For Ollama, OpenHuman talks to its OpenAI-compatible `/v1` endpoint. If Ollama is not reachable, requests transparently fall back to the remote provider.
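
The fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not OpenHuman's actual implementation: the function names, the `REMOTE_BASE_URL` placeholder, and the choice to probe the model-list route are all hypothetical. `http://localhost:11434` is Ollama's default listen address.

```python
import urllib.request
from typing import Callable

# Ollama's default address; its OpenAI-compatible API is served under /v1.
OLLAMA_BASE_URL = "http://localhost:11434/v1"
# Hypothetical placeholder for whichever remote provider is configured.
REMOTE_BASE_URL = "https://remote-provider.example/v1"


def ollama_reachable(base_url: str = OLLAMA_BASE_URL, timeout: float = 1.0) -> bool:
    """Return True if the local endpoint answers a model-list request."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeouts, and urllib's URLError are all OSError subclasses.
        return False


def pick_base_url(probe: Callable[[], bool] = ollama_reachable) -> str:
    """Prefer the local endpoint; fall back to the remote one if the probe fails."""
    return OLLAMA_BASE_URL if probe() else REMOTE_BASE_URL
```

Passing the probe in as a parameter keeps the selection logic testable without a running Ollama instance, for example `pick_base_url(lambda: False)` always yields the remote URL.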

Opting in

In the desktop app: Settings → AI & Skills → Local AI

You can choose from three presets:

"Embeddings only"
"Memory + reflection"
"Everything local"

What you'll need

Ollama or LM Studio installed and running locally
Enough disk for models (~700 MB for gemma3, ~23 MB for all-minilm)
8 GB+ RAM recommended, 16 GB+ ideal
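
As a rough pre-flight check, the disk figures above can be compared against free space. This sketch uses only the approximate sizes quoted on this page; the function names and the idea of checking a caller-supplied path are assumptions (point `path` at wherever your local provider stores its models). It does not check RAM, which Python's standard library does not expose portably.

```python
import shutil
from typing import Dict

# Approximate download sizes quoted above, in megabytes.
MODEL_SIZES_MB: Dict[str, int] = {"gemma3": 700, "all-minilm": 23}


def required_disk_mb(models: Dict[str, int] = MODEL_SIZES_MB) -> int:
    """Total approximate disk space needed for the given models."""
    return sum(models.values())


def has_enough_disk(path: str = ".", models: Dict[str, int] = MODEL_SIZES_MB) -> bool:
    """Return True if the filesystem containing `path` has room for the models."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= required_disk_mb(models)
```

With the two default models, `required_disk_mb()` comes to roughly 723 MB, so any machine with about 1 GB free on the model volume clears the disk requirement.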
