1. Executive Summary
hands lets an LLM drive a real machine — execute shell, control mouse/keyboard, capture the screen. That capability makes it, by definition, a high-trust security surface: the same design that makes it useful lets it do damage if a control fails. The team clearly knows this — there's genuine defense-in-depth (a code-level command guardrail, an append-only audit log, a true dry-run mode, a per-run budget cap, a 0700 check on the secrets directory). That's materially better than most agent code.
Overall risk was MODERATE, concentrated in one place: the safety control for shell execution is a regex denylist, and one code path reached a shell around it with a command-injection weakness. We fixed the injection immediately; the denylist→allowlist hardening is the recommended next investment.
2. Architecture
A dual-mode agent with a small, legible core. SDK mode runs the computer-use loop directly and exposes five tools (computer, bash, file-editor, plus two custom tools — read_page / find_files — that exist specifically to collapse many risky bash turns into one). CLI mode delegates to the installed claude binary. A platform layer isolates OS-native input behind a clean cross-platform seam. The MCP server is stdio-only — no listening port, no network attack surface (the right call). And the safety/observability layer is real: a per-run budget cap enforced each turn, an append-only audit log of every tool call, and a dry-run mode that stubs every side effect while keeping the loop coherent.
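To make the safety/observability layer concrete, here is a minimal sketch of how a per-turn budget cap, an append-only audit log, and a dry-run stub could fit together. All names (RunContext, chargeTurn, execute, the entry shape) are hypothetical and chosen for illustration; they are not taken from the project's source.

```typescript
// Hypothetical types; the project's real audit-entry shape is not shown in this report.
type ToolCall = { tool: string; args: Record<string, unknown> };
type AuditEntry = ToolCall & { ms: number; outcome: "ok" | "error" | "dry-run" };

class RunContext {
  private spentUsd = 0;
  private readonly log: AuditEntry[] = []; // entries are only ever appended
  private readonly budgetUsd: number;
  private readonly dryRun: boolean;

  constructor(budgetUsd: number, dryRun: boolean) {
    this.budgetUsd = budgetUsd;
    this.dryRun = dryRun;
  }

  // Called once per turn, before the next model request is made.
  chargeTurn(costUsd: number): void {
    if (this.spentUsd + costUsd > this.budgetUsd) {
      throw new Error(`budget cap of ${this.budgetUsd} USD would be exceeded`);
    }
    this.spentUsd += costUsd;
  }

  // Every tool call is recorded, including calls stubbed out by dry-run.
  execute(call: ToolCall, run: (c: ToolCall) => string): string {
    if (this.dryRun) {
      this.log.push({ ...call, ms: 0, outcome: "dry-run" });
      return `[dry-run] ${call.tool} skipped`; // the loop still gets a coherent result
    }
    const start = Date.now();
    try {
      const result = run(call);
      this.log.push({ ...call, ms: Date.now() - start, outcome: "ok" });
      return result;
    } catch (err) {
      this.log.push({ ...call, ms: Date.now() - start, outcome: "error" });
      throw err;
    }
  }

  auditLog(): readonly AuditEntry[] {
    return this.log;
  }
}
```

In this sketch the dry-run branch never invokes the side-effecting callback, which is the property that makes a dry run safe to use against a real machine.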
3. Security Posture
The realistic adversary isn't a malicious operator — it's the model making a mistake, or prompt injection (hands reads web pages and screens, so attacker-controlled text enters the context and can try to steer tool calls). The control that matters is the gate between "model says" and "machine does."
That gate is a regex denylist (guardrails.ts): it hard-blocks catastrophic literals (root-filesystem deletion, disk format, forced registry deletion) and warns on others. Sound as one layer — but a denylist is the wrong shape for an arbitrary-shell executor: it's bypassable by obfuscation, and it only hard-blocks root-scope destruction, not deletion scoped to the user's home or project. More importantly, one path reached a shell without passing through it at all — see P1.
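The two denylist gaps described above can be illustrated with a small sketch. The patterns below are hypothetical stand-ins written for this example; the actual contents of guardrails.ts are not reproduced in this report.

```typescript
// Hypothetical denylist; not the real patterns from guardrails.ts.
const BLOCKED: RegExp[] = [
  /\brm\s+-rf\s+\/(\s|$)/, // literal root-filesystem deletion
  /\bmkfs(\.\w+)?\b/,      // disk format
];

function isBlocked(command: string): boolean {
  return BLOCKED.some((re) => re.test(command));
}

console.log(isBlocked("rm -rf /"));         // true: matches the literal pattern
console.log(isBlocked("rm -rf ~/project")); // false: home-scoped deletion is outside the pattern
console.log(isBlocked("rm -r -f /"));       // false: same effect, different spelling
```

Each added pattern closes one spelling while leaving others open, which is why a denylist is better suited as one layer than as the only gate.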
4. Findings
P1. Command injection in the file-editor view operation. The view operation built a shell command by interpolating a model-supplied path and ran it through the shell, bypassing the command guardrail. A path containing shell metacharacters could escape the intended command. The path is model-reachable (no malicious user needed), and prompt-injected page or screen content is a plausible trigger.
Remediation (shipped, PR #58, CI-green): reimplemented the editor (view / create / str_replace / insert) directly on the filesystem — no shell, nothing to inject into — and completed the edit operations, which had previously been silent no-ops.
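As a rough illustration of why a filesystem-based editor leaves nothing to inject into, here is a minimal sketch of view and str_replace using Node's fs module. The function names and error messages are hypothetical; this is not the code from PR #58.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: the path is passed to the filesystem API as data,
// so shell metacharacters in it are never interpreted.
function viewFile(path: string): string {
  return readFileSync(path, "utf8");
}

// Hypothetical str_replace: fails loudly rather than silently doing nothing.
function strReplace(path: string, oldText: string, newText: string): void {
  const content = readFileSync(path, "utf8");
  const first = content.indexOf(oldText);
  if (first === -1) throw new Error("old text not found");
  if (content.indexOf(oldText, first + 1) !== -1) {
    throw new Error("old text is not unique");
  }
  const updated = content.slice(0, first) + newText + content.slice(first + oldText.length);
  writeFileSync(path, updated);
}

// A file name that would be dangerous if interpolated into a shell command.
const dir = mkdtempSync(join(tmpdir(), "editor-"));
const odd = join(dir, "notes; echo pwned.txt");
writeFileSync(odd, "hello world");
strReplace(odd, "world", "there");
console.log(viewFile(odd)); // hello there
```

The semicolon in the file name is treated as an ordinary character because no shell ever parses it.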
P2. Regex denylist as the shell gate. The denylist is bypassable by obfuscation; it only hard-blocks root-scope destruction (not home/project); and several dangerous patterns warn-but-allow, which in an autonomous loop means they still run. Recommendation: invert toward an allowlist (or confirm-on-anything-not-allowlisted), treat home/project-scope destructive operations as confirm-required, and make that one gate the single chokepoint for every shell-reaching path.
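The inverted gate recommended above might look like the following sketch, in which nothing runs unattended unless it is explicitly allowlisted. The binary list, the metacharacter set, and the function name gate are assumptions made for illustration, not a proposed final policy.

```typescript
type Verdict = "allow" | "confirm";

// Hypothetical policy: only these binaries run without confirmation.
const ALLOWED_BINARIES = new Set(["ls", "pwd", "grep", "wc"]);

// Characters that let one command line chain, substitute, or redirect.
const SHELL_META = /[;&|`$<>(){}\n\\]/;

function gate(command: string): Verdict {
  const trimmed = command.trim();
  // Anything that could smuggle a second command needs a human decision.
  if (SHELL_META.test(trimmed)) return "confirm";
  const binary = trimmed.split(/\s+/)[0] ?? "";
  // Default is "confirm": unknown binaries never run unattended.
  return ALLOWED_BINARIES.has(binary) ? "allow" : "confirm";
}

console.log(gate("ls -la"));            // allow
console.log(gate("rm -rf ~/project"));  // confirm: rm is not allowlisted
console.log(gate("ls; rm -rf ~"));      // confirm: contains a command separator
```

Matching on the binary name alone is deliberately coarse. A production gate would likely also need per-binary argument rules, and it only helps if every shell-reaching path calls it.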
P3. Stubbed edit operations and an unused dependency. The editor's edit operations returned a "handled via bash" stub and did nothing, so the model believed it had edited a file when it had not. Separately, express was a declared dependency with no import anywhere in the source. Both are addressed in PR #58: the editor is now complete, and the redundant direct dependency is removed.
5. Dependencies & Supply-Chain
Lean and lockfile-pinned, on current majors. With express now removed, the set is defensible for the feature surface. cheerio parses untrusted fetched HTML, and its output is correctly passed to the model as text, not as DOM. Recommended standing control: an npm audit gate in CI.
6. What's Working — Keep It
- Defense-in-depth instinct — code guardrail + prompt guardrail + audit log + dry-run + budget cap is a strong posture for an agent that touches a real machine.
- Append-only audit log of every tool call (args, timing, outcome) — exactly what you want here.
- True dry-run mode — stubs every side effect while keeping the loop coherent.
- stdio-only MCP server — no network listener, minimal tool surface.
- read_page / find_files custom tools: collapse many shell turns into one; cheaper, faster, lower-risk.
7. Prioritized Remediation Roadmap
- Done (PR #58): shell-free editor (P1), complete edit ops + drop unused dep (P3).
- Next: invert the guardrail to allowlist/confirm; route every shell-reaching path through it; treat home/project-scope destruction as confirm-required (P2).
- Then: regression tests asserting the guardrail blocks each intended pattern and the editor rejects path metacharacters; add an npm audit CI gate.