Chromium Embedded Framework

OpenHuman does not run the platform's built-in webview. It ships its own Chromium Embedded Framework (CEF) runtime via a fork of tauri-runtime, and this single decision is the load-bearing foundation for nearly every "OpenHuman knows what's happening inside your tools" feature in the product.

Why CEF Instead of Stock Webview

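Stock Tauri uses each platform's native webview: WKWebView on macOS, WebView2 on Windows, and WebKitGTK on Linux. These are fine for rendering the OpenHuman app itself, but they have a fatal limitation: they offer no Chrome DevTools Protocol (CDP) surface that behaves the same on every platform. WKWebView and WebKitGTK do not speak CDP, and WebView2's CDP access is specific to Windows.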

CDP supplies the primitives these features are built on. The "observe what's happening inside Slack / WhatsApp / Telegram / Discord / Meet" features in OpenHuman talk to those embedded apps via CDP rather than injected JavaScript (the remaining exceptions are listed under the "No New JS Injection" rule below). CDP provides:

  • Target.getTargets to discover every page and service worker
  • IndexedDB.requestDatabaseNames / requestDatabase / requestData to traverse each third-party app's IndexedDB stores
  • DOMSnapshot.captureSnapshot for read-only DOM inspection that won't trigger framework reactivity
  • Runtime.evaluate for ad-hoc one-shot reads
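
As a rough illustration of what these calls look like on the wire: a CDP command is a JSON object with an id, a method name, and optional params, sent as a text frame over the debugging WebSocket. The sketch below only builds those frames; it does not open a connection. The CommandBuilder class and the ORIGIN value are hypothetical names for this example, not part of OpenHuman.

```python
import json


class CommandBuilder:
    """Builds CDP command frames with increasing ids (hypothetical helper)."""

    def __init__(self):
        self._next_id = 1

    def command(self, method, params=None):
        # Every CDP command carries a unique id so the reply can be matched to it.
        message = {"id": self._next_id, "method": method}
        if params:
            message["params"] = params
        self._next_id += 1
        return json.dumps(message)


# Hypothetical origin of an embedded web app; substitute the real one.
ORIGIN = "https://app.example.com"

builder = CommandBuilder()
commands = [
    # Discover pages and service workers.
    builder.command("Target.getTargets"),
    # List the IndexedDB databases that belong to one origin.
    builder.command("IndexedDB.requestDatabaseNames", {"securityOrigin": ORIGIN}),
    # Read-only DOM snapshot; computedStyles is required and may be empty.
    builder.command("DOMSnapshot.captureSnapshot", {"computedStyles": []}),
    # One-shot read of a single value.
    builder.command(
        "Runtime.evaluate",
        {"expression": "document.title", "returnByValue": True},
    ),
]

for frame in commands:
    print(frame)
```

Each frame would be written to the WebSocket as-is, and the matching reply carries the same id.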

What CEF Is Used for Today

Embedded Third-party Webviews

Each integration provider that runs as a hosted web app gets its own CEF sub-webview:

  • WhatsApp Web
  • Telegram Web
  • Slack
  • Discord
  • Google Meet
  • LinkedIn
  • Gmail
  • Zoom

CDP-Driven Scanners

Each provider has a scanner module. Each scanner holds a long-lived WebSocket to CEF (--remote-debugging-port=19222) and ticks on a fixed schedule:

Scanner          | Frequency                       | What It Does
whatsapp_scanner | 2s DOM tick + 30s full IDB walk | Reads message store, pulls media metadata
telegram_scanner | same                            | Plus QR login handoff to native Telegram Desktop
slack_scanner    | 30s IDB walk                    | Pure IDB; no DOM scraping needed
discord_scanner  | periodic                        | Channel + DM state via CDP
meet_scanner     | periodic                        | Real-time captions + participant state during calls
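
To make the "2s DOM tick + 30s full IDB walk" cadence concrete, here is a minimal scheduling sketch. It assumes each scanner keeps a list of jobs with fixed intervals and checks which ones are due on every loop iteration. Job, due_jobs, and simulate are hypothetical names for this example, and the real scanners may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    interval_s: int
    next_due: int = 0  # seconds since scanner start; 0 means "run immediately"


def due_jobs(jobs, now):
    """Return the names of jobs due at `now` and reschedule each one."""
    fired = []
    for job in jobs:
        if now >= job.next_due:
            fired.append(job.name)
            job.next_due = now + job.interval_s
    return fired


def simulate(jobs, duration_s):
    """Step a fake clock one second at a time and count how often each job fires."""
    counts = {job.name: 0 for job in jobs}
    for now in range(duration_s + 1):
        for name in due_jobs(jobs, now):
            counts[name] += 1
    return counts


# Intervals taken from the whatsapp_scanner row: 2s DOM tick, 30s IDB walk.
whatsapp_jobs = [Job("dom_tick", 2), Job("idb_walk", 30)]
counts = simulate(whatsapp_jobs, 60)
print(counts)
```

In a live scanner the fake clock would be replaced by a monotonic clock and a short sleep between iterations, with each fired job issuing its CDP calls over the long-lived WebSocket.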

Google Meet Mascot Camera

The Meet agent doesn't just attend meetings; it presents itself to Meet as a camera feed. This works because CEF allows us to:

  1. Inject a small bridge via Page.addScriptToEvaluateOnNewDocument before any Meet code runs
  2. Override navigator.mediaDevices.getUserMedia to return a MediaStream from a hidden 640×480 canvas
  3. Render the mascot SVG on that canvas, driven from Rust by calling window.__openhumanSetMood(...) over CDP
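
The three steps above map onto two CDP commands, sketched below as plain JSON frames. This is an illustration under assumptions, not the shipped bridge: BRIDGE_SOURCE is a hypothetical stand-in that ignores audio constraints and does not draw the mascot, and install_bridge_command and set_mood_command are made-up helper names. Only the CDP method names, window.__openhumanSetMood, and the 640×480 canvas size come from the description above.

```python
import json

# Hypothetical bridge script. It runs before any page code, creates a hidden
# canvas, and hands Meet a stream captured from that canvas.
BRIDGE_SOURCE = """
(() => {
  const canvas = document.createElement("canvas");
  canvas.width = 640;
  canvas.height = 480;
  navigator.mediaDevices.getUserMedia = async () => canvas.captureStream();
  // Placeholder: a real bridge would redraw the mascot SVG for this mood.
  window.__openhumanSetMood = (mood) => { canvas.dataset.mood = mood; };
})();
"""


def install_bridge_command(command_id):
    """Frame that registers the bridge to run on every new document (step 1)."""
    return json.dumps({
        "id": command_id,
        "method": "Page.addScriptToEvaluateOnNewDocument",
        "params": {"source": BRIDGE_SOURCE},
    })


def set_mood_command(command_id, mood):
    """Frame that calls window.__openhumanSetMood(mood) in the page (step 3)."""
    # json.dumps escapes the mood so it is a safe JavaScript string literal.
    expression = f"window.__openhumanSetMood({json.dumps(mood)})"
    return json.dumps({
        "id": command_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression},
    })


print(install_bridge_command(1))
print(set_mood_command(2, "happy"))
```

Escaping the mood with json.dumps keeps a value containing quotes from breaking out of the expression string.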

"No New JS Injection" Rule

The rule is documented in CLAUDE.md: Migrated providers load zero injected JavaScript. All scraping happens natively via CDP on the scanner side.

Provider    | Migrated?
WhatsApp    | ✅ Zero JS
Telegram    | ✅ Zero JS
Slack       | ✅ Zero JS
Discord     | ✅ Zero JS
browserscan | ✅ Zero JS
Gmail       | Legacy runtime.js bridge
LinkedIn    | Legacy LINKEDIN_RECIPE_JS
Google Meet | Camera + audio + captions bridge

CEF Prewarming

A hidden CEF webview (cef-prewarm) starts the browser at app launch, so the first sub-webview does not have to wait for browser startup when the user clicks.

Linux Shell Fallback

On some Linux desktops (especially NVIDIA proprietary driver setups), the Tauri/CEF shell may fail during native window configuration. If the core itself is healthy, you can keep developing by running the core and the frontend separately:

# Terminal 1: build the core and start it on port 7788
cargo build --bin openhuman-core
./target/debug/openhuman-core run --port 7788

# Terminal 2: start the frontend dev server
cd app
pnpm dev

Open the Vite URL in a regular browser, choose Advanced / remote core mode, and set the RPC URL to http://127.0.0.1:7788/rpc.