Agent Observability
This document describes the artifact capture layer that makes the desktop app inspectable by coding agents (Codex, Claude Code, Cursor) via existing WDIO/Appium/tauri-driver tooling.
Quick Start
bash app/scripts/e2e-agent-review.sh
Artifacts are written to:
app/test/e2e/artifacts/<ISO-timestamp>-agent-review/
    01-welcome.png
    01-welcome.source.xml
    02-post-onboarding.png
    mock-requests-after-onboarding.json
    failure-<test>.png    # only on failure
    meta.json             # run metadata + checkpoint index
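
An agent usually wants the newest run directory first. The following is a minimal sketch of how that lookup could work, assuming run directories are named `<timestamp>-<label>` and that the timestamps sort chronologically as plain strings. `latestRunDir` and `listArtifacts` are hypothetical names, not part of the helper module.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: returns the most recent run directory under `root`
// whose name ends with `-<label>`, or undefined if there is none.
// Assumes the timestamp prefix sorts chronologically as a plain string.
export function latestRunDir(root: string, label = "agent-review"): string | undefined {
  if (!fs.existsSync(root)) return undefined;
  const runs = fs
    .readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && entry.name.endsWith(`-${label}`))
    .map((entry) => entry.name)
    .sort();
  const newest = runs[runs.length - 1];
  return newest === undefined ? undefined : path.join(root, newest);
}

// Lists the files of one run, sorted by name so that numbered
// checkpoints (01-..., 02-...) come out in capture order.
export function listArtifacts(runDir: string): string[] {
  return fs.readdirSync(runDir).sort();
}
```

Calling `latestRunDir("app/test/e2e/artifacts")` after a run would give the directory to inspect; `meta.json` inside it remains the authoritative index.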
Components
| Component | Path | Role |
|---|---|---|
| Helper | app/test/e2e/helpers/artifacts.ts | captureCheckpoint, saveMockRequestLog |
| WDIO hook | app/test/wdio.conf.ts (afterTest) | Dumps a screenshot and the page source whenever a test fails |
| Spec | app/test/e2e/specs/agent-review.spec.ts | Welcome → onboarding → privacy panel |
| Wrapper script | app/scripts/e2e-agent-review.sh | Build + run + print artifact directory |
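
To illustrate the WDIO hook row, here is a minimal sketch of failure capture, not the actual contents of `wdio.conf.ts`. It relies on the WebdriverIO commands `saveScreenshot` and `getPageSource`; the `slugify` rule, the `dumpFailureArtifacts` name, and the `failure-<test>.source.xml` file name are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Minimal subset of the WebdriverIO browser object used by this sketch.
interface BrowserLike {
  saveScreenshot(filepath: string): Promise<unknown>;
  getPageSource(): Promise<string>;
}

// Hypothetical slug rule: lowercase, runs of non-alphanumerics become "-".
export function slugify(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Writes failure-<test>.png and (assumed name) failure-<test>.source.xml
// into `dir` when the test failed; does nothing for passing tests.
export async function dumpFailureArtifacts(
  browser: BrowserLike,
  dir: string,
  title: string,
  passed: boolean,
): Promise<string[]> {
  if (passed) return [];
  fs.mkdirSync(dir, { recursive: true });
  const base = path.join(dir, `failure-${slugify(title)}`);
  await browser.saveScreenshot(`${base}.png`);
  fs.writeFileSync(`${base}.source.xml`, await browser.getPageSource(), "utf8");
  return [`${base}.png`, `${base}.source.xml`];
}
```

In a WDIO config this could be wired up as `afterTest: async (test, _context, { passed }) => { await dumpFailureArtifacts(browser, runDir, test.title, passed); }`, where `runDir` is whatever directory the run resolved to.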
Environment Overrides
| Variable | Effect |
|---|---|
| E2E_ARTIFACT_DIR | Force specific run directory |
| E2E_ARTIFACT_ROOT | Parent directory (default: app/test/e2e/artifacts) |
| E2E_ARTIFACT_LABEL | Label (default: agent-review) |
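
The precedence implied by the table can be sketched as follows. This is an illustrative reconstruction, not the helper's actual code: `resolveArtifactDir` is a hypothetical name, and the exact timestamp format (here an ISO string with `:` and `.` replaced by `-` to stay filename-safe) is an assumption.

```typescript
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical resolver: E2E_ARTIFACT_DIR wins outright; otherwise the run
// directory is <root>/<timestamp>-<label> using the documented defaults.
export function resolveArtifactDir(env: Env, now: Date = new Date()): string {
  if (env.E2E_ARTIFACT_DIR) return env.E2E_ARTIFACT_DIR;
  const root = env.E2E_ARTIFACT_ROOT || "app/test/e2e/artifacts";
  const label = env.E2E_ARTIFACT_LABEL || "agent-review";
  // Assumption: ":" and "." are replaced so the name is valid on all platforms.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return path.join(root, `${stamp}-${label}`);
}
```

For example, `resolveArtifactDir(process.env)` with none of the variables set yields a fresh timestamped directory under the default root, while setting `E2E_ARTIFACT_DIR` pins the location so an agent knows where to look before the run starts.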
Using the Helper in New Specs
import { captureCheckpoint, saveMockRequestLog } from '../helpers/artifacts';
import { getRequestLog } from '../mock-server';

// Inside a test body: capture the UI state, then the mock traffic seen so far.
await captureCheckpoint('after-connect-click');
saveMockRequestLog('after-connect-click', getRequestLog());
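
For orientation, this is a rough sketch of what saving a mock request log might involve, based only on the `mock-requests-<label>.json` file name in the artifact listing. It is not the real `saveMockRequestLog`: the function name `saveMockRequestLogSketch` and the explicit `runDir` parameter are assumptions, since the real helper takes only a label and the log.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical stand-in for saveMockRequestLog: serializes the request log
// as pretty-printed JSON into <runDir>/mock-requests-<label>.json.
export function saveMockRequestLogSketch(runDir: string, label: string, log: unknown): string {
  fs.mkdirSync(runDir, { recursive: true });
  const file = path.join(runDir, `mock-requests-${label}.json`);
  fs.writeFileSync(file, JSON.stringify(log, null, 2) + "\n", "utf8");
  return file;
}
```

Pretty-printed JSON keeps the log easy to diff and easy for an agent to read alongside the screenshot taken at the same checkpoint.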
Deliberately Out of Scope
- Visual baselines / image diff for each component state
- Screenshots on every click (too noisy)
- Live integrations (Gmail, Notion, Telegram); mock server only
- New test framework / reporter
Only expand to more flows after proving this loop is viable.
Next Steps
- E2E Testing - End-to-end testing
- Testing Strategy - Testing tiers