First Release of Claw3D (#11)
Co-authored-by: iamlukethedev <iamlukethedev@users.noreply.github.com>
@@ -0,0 +1,350 @@
|
||||
# Permissions, Sandboxing, and Workspaces (Studio -> Gateway -> PI)
|
||||
|
||||
This document exists to onboard coding agents quickly when debugging:
|
||||
- Why an agent can or cannot read/write files
|
||||
- Why command execution requires approvals (or not)
|
||||
- Why a sandboxed run behaves differently from a non-sandboxed run
|
||||
- How “create agent” choices in **Claw3D** flow into the **OpenClaw Gateway** (often running on an EC2 host) where enforcement actually happens
|
||||
|
||||
Scope:
|
||||
- Studio one-step agent creation and post-create authority updates, including exact gateway calls.
|
||||
- The upstream OpenClaw implementation that persists and enforces those settings at runtime.
|
||||
|
||||
Non-scope:
|
||||
- Full PI internal reasoning/toolchain. Studio does not implement PI logic; it configures and displays the Gateway session.
|
||||
- Any private EC2 runbook or SSH/hostnames. Keep this doc repo-safe.
|
||||
|
||||
## Mental Model (First Principles)
|
||||
|
||||
Studio is a UI + proxy. It does two things related to “permissions”:
|
||||
1. Writes **configuration** into the Gateway (per-agent overrides in `openclaw.json`).
|
||||
2. Writes **policy** into the Gateway (per-agent exec approvals in `exec-approvals.json`).
|
||||
|
||||
The Gateway (OpenClaw) is the enforcement point:
|
||||
- It decides whether a session is sandboxed.
|
||||
- It decides which workspace is mounted into the sandbox.
|
||||
- It constructs the PI toolset (read/write/edit/apply_patch/exec/etc) based on config + sandbox context.
|
||||
- It asks for exec approvals when policy requires it and broadcasts approval events.
|
||||
|
||||
## Glossary
|
||||
|
||||
- **Gateway**: OpenClaw Gateway WebSocket server (upstream project).
|
||||
- **Studio**: this repo. Next.js UI plus a Node WS proxy.
|
||||
- **Agent**: an OpenClaw agent entry stored in gateway config (`agents.list[]`).
|
||||
- **Session key**: OpenClaw session identifier. Studio uses `agent:<agentId>:<mainKey>` for the agent’s “main” session.
|
||||
- **Agent workspace**: a directory on the Gateway host filesystem configured per-agent (where bootstrap files and edits live).
|
||||
- **Sandbox workspace**: a separate directory used when a session is sandboxed and `workspaceAccess` is not `rw`.
|
||||
- **Sandbox mode** (`sandbox.mode`): when to sandbox (`off`, `non-main`, `all`).
|
||||
- **Workspace access** (`sandbox.workspaceAccess`): how the sandbox relates to the agent workspace (`none`, `ro`, `rw`).
|
||||
- **Tool policy** (`tools.profile`, `tools.alsoAllow`, `tools.deny`): allow/deny gating for PI tools (OpenClaw resolves effective policy).
|
||||
- **Exec approvals policy**: per-agent `{ security, ask, allowlist }` stored in the exec approvals file; drives the “Allow once / Always allow / Deny” UX.
|
||||
|
||||
## Studio: Where “Permissions” Are Chosen
|
||||
|
||||
Agent creation is intentionally lightweight:
|
||||
- `src/features/agents/components/AgentCreateModal.tsx` captures `name` and optional avatar shuffle seed.
|
||||
- `src/features/agents/operations/mutationLifecycleWorkflow.ts` applies queue/guard behavior and calls create.
|
||||
- `src/lib/gateway/agentConfig.ts` (`createGatewayAgent`) performs `config.get` + `agents.create`.
|
||||
|
||||
After creation, Studio applies a permissive default capability envelope:
|
||||
- Commands: `Auto`
|
||||
- Web access: `On`
|
||||
- File tools: `On`
|
||||
|
||||
Implementation:
|
||||
- `src/app/page.tsx` (`handleCreateAgentSubmit`) applies `CREATE_AGENT_DEFAULT_PERMISSIONS`.
|
||||
- `src/features/agents/operations/agentPermissionsOperation.ts` (`updateAgentPermissionsViaStudio`) persists those defaults.
|
||||
|
||||
Further capability changes happen from the `Capabilities` tab:
|
||||
- `src/features/agents/operations/agentPermissionsOperation.ts` (`updateAgentPermissionsViaStudio`)
  - updates per-agent exec approvals (`exec.approvals.get` + `exec.approvals.set`)
  - updates tool-group overrides for runtime, web, and file access (`config.get` + `config.patch` via `updateGatewayAgentOverrides`)
  - updates session exec behavior (`sessions.patch` via `syncGatewaySessionSettings`)
|
||||
|
||||
### Runtime Tool Groups Used By Capability Updates
|
||||
|
||||
Studio capability updates rely on OpenClaw tool-group expansion (`openclaw/src/agents/tool-policy.ts`), especially:
|
||||
- `group:runtime` -> runtime execution tools (`exec`, `process`)
|
||||
|
||||
Internal mapping detail:
|
||||
- Command mode `off|ask|auto` maps to role logic (`conservative|collaborative|autonomous`) for policy generation.
|
||||
- UI exposes direct capability controls, not role labels.
|
||||
|
||||
## Studio -> Gateway: “Create Agent” End-to-End
|
||||
|
||||
Primary entry points:
|
||||
- `src/features/agents/operations/mutationLifecycleWorkflow.ts`
|
||||
- `src/lib/gateway/agentConfig.ts` (`createGatewayAgent`)
|
||||
|
||||
Sequence:
|
||||
|
||||
```mermaid
sequenceDiagram
  participant UI as Studio UI
  participant L as Create lifecycle
  participant GC as Studio GatewayClient
  participant G as OpenClaw Gateway

  UI->>L: submit({ name, avatarSeed? })
  L->>GC: createGatewayAgent(name)
  GC->>G: config.get
  G-->>GC: { path: ".../openclaw.json", ... }
  GC->>G: agents.create({ name, workspace: "<stateDir>/workspace-<slug>" })
  G-->>GC: { agentId, workspace }
  L-->>UI: completion(agentId)
```
|
||||
|
||||
### How Studio Chooses the Default Workspace Path
|
||||
|
||||
Studio computes a default workspace path from the gateway’s config path:
|
||||
- `src/lib/gateway/agentConfig.ts` (`createGatewayAgent`)
|
||||
|
||||
Logic:
|
||||
1. Call `config.get` and read `snapshot.path` (the gateway host config path).
|
||||
2. Compute `stateDir = dirname(configPath)`.
|
||||
3. Compute `workspace = join(stateDir, "workspace-" + slugify(name))`.
|
||||
4. Call `agents.create({ name, workspace })`.
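The path derivation in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not Studio's actual code: the `slugify` helper here is a hypothetical stand-in (Studio's real slug rules may differ), the example config path is made up, and POSIX path semantics are assumed because the path lives on the gateway host.

```typescript
import { posix as path } from "node:path";

// Hypothetical slug helper: lowercase, runs of non-alphanumerics become "-".
// Studio's real slugify may behave differently.
function slugify(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Derive the default workspace from the gateway's config path.
function defaultWorkspacePath(configPath: string, name: string): string {
  const stateDir = path.dirname(configPath);
  return path.join(stateDir, "workspace-" + slugify(name));
}

// Illustrative config path only; the real value comes from `config.get`.
const workspace = defaultWorkspacePath("/home/user/.openclaw/openclaw.json", "Research Bot");
console.log(workspace);
```

Because the result is derived from the gateway's own config path, it is only meaningful on the gateway host.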
|
||||
|
||||
Important: for a remote gateway (EC2), that `workspace` path refers to the gateway host filesystem, not your laptop.
|
||||
|
||||
## Studio: Sandbox Env Allowlist Sync (Current Scope)
|
||||
|
||||
The create flow no longer performs setup writes during initial create. If Studio needs to ensure sandbox env allowlist entries, attach that behavior to explicit settings/config operations rather than to create-time side effects.
|
||||
|
||||
## OpenClaw (Upstream): What `agents.create` Actually Does
|
||||
|
||||
Gateway method:
|
||||
- `openclaw/src/gateway/server-methods/agents.ts` (`"agents.create"`)
|
||||
|
||||
Key behaviors:
|
||||
- Normalizes `agentId` from the provided `name` (and reserves `"default"`).
|
||||
- Uses the provided `workspace` and resolves it to an absolute path.
|
||||
- Writes a config entry for the agent (including the workspace dir and agent dir).
|
||||
- Ensures the workspace directory exists and that bootstrap files exist (unless `agents.defaults.skipBootstrap` is set).
|
||||
- Ensures the session transcripts directory exists for the agent.
|
||||
- Writes the config file only after those directories exist (to avoid persisting a broken agent entry).
|
||||
- Appends `- Name: ...` (and optional emoji/avatar) to `IDENTITY.md` in the workspace.
|
||||
|
||||
So: the “workspace” is not a UI-only concept; it is a real directory created on the Gateway host.
|
||||
|
||||
## OpenClaw (Upstream): Sandbox Semantics
|
||||
|
||||
Sandbox configuration resolution:
|
||||
- `openclaw/src/agents/sandbox/config.ts` (`resolveSandboxConfigForAgent`)
|
||||
|
||||
Sandbox context creation (where workspace selection happens):
|
||||
- `openclaw/src/agents/sandbox/context.ts` (`resolveSandboxContext`)
|
||||
|
||||
Docker mount behavior:
|
||||
- `openclaw/src/agents/sandbox/docker.ts` (`createSandboxContainer`)
|
||||
|
||||
### Sandbox Mode (`sandbox.mode`)
|
||||
|
||||
Modes (as implemented upstream):
|
||||
- `off`: sessions are not sandboxed.
|
||||
- `all`: every session is sandboxed.
|
||||
- `non-main`: sandbox all sessions except the agent’s main session key.
|
||||
|
||||
The “main session key” comparison is done against the configured main key, with alias-canonicalization:
|
||||
- Upstream canonicalizes the session key before comparing so that main-session aliases are treated as “main” (see `canonicalizeMainSessionAlias` in upstream sandbox runtime-status).
|
||||
- If `session.scope` is `global`, the main session key is `global` and `non-main` effectively means “sandbox everything except the global session”.
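The mode rules above can be condensed into a small decision function. This is a simplified sketch, not the upstream `resolveSandboxRuntimeStatus`: the `aliases` parameter is a hypothetical stand-in for upstream alias canonicalization, and the session keys used in the example are illustrative.

```typescript
type SandboxMode = "off" | "non-main" | "all";

// Decide whether a session is sandboxed.
// `aliases` lists keys that should be treated as the main session (assumption:
// upstream derives these itself via canonicalizeMainSessionAlias).
function isSandboxed(
  mode: SandboxMode,
  sessionKey: string,
  mainSessionKey: string,
  aliases: readonly string[] = [],
): boolean {
  if (mode === "off") return false;
  if (mode === "all") return true;
  // "non-main": canonicalize first, then compare against the main key.
  const canonical = aliases.includes(sessionKey) ? mainSessionKey : sessionKey;
  return canonical !== mainSessionKey;
}

console.log(isSandboxed("non-main", "agent:a1:main", "agent:a1:main"));
console.log(isSandboxed("non-main", "agent:a1:subtask", "agent:a1:main"));
```

With `session.scope` set to `global`, the caller would pass `"global"` as `mainSessionKey`, so only the global session escapes the sandbox in `non-main` mode.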
|
||||
|
||||
Upstream implementation reference:
|
||||
- `openclaw/src/agents/sandbox/runtime-status.ts` (`resolveSandboxRuntimeStatus`)
|
||||
|
||||
### Sandbox Scope (`sandbox.scope`)
|
||||
|
||||
Sandbox scope controls how sandboxes are shared and therefore what persists between runs:
|
||||
- `session`: per-session sandbox workspace/container (highest isolation, most churn)
|
||||
- `agent`: per-agent sandbox workspace/container keyed by agent id (shared across that agent’s sessions)
|
||||
- `shared`: one sandbox workspace/container shared across everything (lowest isolation)
|
||||
|
||||
Upstream implementation reference:
|
||||
- `openclaw/src/agents/sandbox/types.ts` (`SandboxScope`)
|
||||
- `openclaw/src/agents/sandbox/shared.ts` (`resolveSandboxScopeKey`)
|
||||
|
||||
### Workspace Access (`sandbox.workspaceAccess`)
|
||||
|
||||
Upstream behavior (important):
|
||||
- `rw`:
  - The sandbox uses the **agent workspace** as the sandbox root.
  - PI filesystem tools (`read`/`write`/`edit`/`apply_patch`) operate on the agent workspace.
- `ro`:
  - The sandbox uses a **sandbox workspace** as the sandbox root (writable sandbox dir).
  - The real agent workspace is mounted at `/agent` read-only for command-line inspection.
  - PI filesystem tools are additionally restricted: upstream disables write/edit/apply_patch in this mode (see below).
- `none`:
  - The sandbox uses a **sandbox workspace** as the sandbox root.
  - The agent workspace is not mounted into the container.
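As a quick reference, the three access levels can be written as a lookup. The `SandboxLayout` type and `layoutFor` function are hypothetical names used only for this sketch; upstream builds this information inside `resolveSandboxContext` and the Docker mount code.

```typescript
type WorkspaceAccess = "none" | "ro" | "rw";

interface SandboxLayout {
  // Which directory acts as the sandbox root.
  root: "agent-workspace" | "sandbox-workspace";
  // Where the real agent workspace is additionally mounted read-only, if at all.
  readOnlyAgentMount: string | null;
}

function layoutFor(access: WorkspaceAccess): SandboxLayout {
  switch (access) {
    case "rw":
      return { root: "agent-workspace", readOnlyAgentMount: null };
    case "ro":
      return { root: "sandbox-workspace", readOnlyAgentMount: "/agent" };
    case "none":
      return { root: "sandbox-workspace", readOnlyAgentMount: null };
  }
}

console.log(layoutFor("ro"));
```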
|
||||
|
||||
Sandbox workspace root default:
|
||||
- `openclaw/src/agents/sandbox/constants.ts` uses `<STATE_DIR>/sandboxes` (where `STATE_DIR` defaults to `~/.openclaw` unless overridden by `OPENCLAW_STATE_DIR`).
|
||||
|
||||
Sandbox workspace seeding:
|
||||
- When using a sandbox workspace root, upstream seeds missing bootstrap files from the agent workspace and ensures bootstrap exists:
  - `openclaw/src/agents/sandbox/workspace.ts` (`ensureSandboxWorkspace`)
- The sandbox workspace also syncs skills from the agent workspace (best-effort) in `resolveSandboxContext`.
|
||||
|
||||
### Hard Enforcement: Filesystem Tool Root Guard
|
||||
|
||||
In upstream OpenClaw, sandboxed filesystem tools are rooted and guarded:
|
||||
- `openclaw/src/agents/pi-tools.read.ts` (`assertSandboxPath` usage)
|
||||
|
||||
Result:
|
||||
- `read`/`write`/`edit` tools cannot access paths outside the sandbox root, even if the container has other mounts (like `/agent`).
|
||||
|
||||
This is intentional: the “filesystem tools” and “exec tool” have different access characteristics inside a sandbox.
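A root guard of this kind usually boils down to a containment check on the resolved path. The sketch below is an assumption about the general technique, not a copy of upstream `assertSandboxPath`: the function name `assertInsideRoot` is hypothetical, the check is purely lexical (it does not resolve symlinks), and POSIX paths are assumed.

```typescript
import { posix as path } from "node:path";

// Resolve `candidate` against `root` and reject anything that escapes it.
// Lexical check only; a real guard may also need to handle symlinks.
function assertInsideRoot(root: string, candidate: string): string {
  const resolved = path.resolve(root, candidate);
  const rel = path.relative(root, resolved);
  if (rel === ".." || rel.startsWith("../") || path.isAbsolute(rel)) {
    throw new Error(`path escapes sandbox root: ${candidate}`);
  }
  return resolved;
}

console.log(assertInsideRoot("/sandbox", "notes/todo.md"));
```

Under this check a request for `/agent/anything` fails even when `/agent` is mounted in the container, which matches the behavior described above: the mount is visible to `exec`, not to the filesystem tools.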
|
||||
|
||||
## Sandbox Tool Policy (Separate From Per-Agent Tool Overrides)
|
||||
|
||||
OpenClaw has an additional sandbox-only tool allow/deny policy:
|
||||
- `tools.sandbox.tools.allow|deny` (global)
|
||||
- `agents.list[].tools.sandbox.tools.allow|deny` (per-agent override)
|
||||
|
||||
Upstream resolution:
|
||||
- `openclaw/src/agents/sandbox/tool-policy.ts` (`resolveSandboxToolPolicyForAgent`)
|
||||
|
||||
Important nuance:
|
||||
- If `tools.sandbox.tools.allow` is present and non-empty, it becomes an allowlist.
|
||||
- If it is set to an empty array, upstream will still auto-add `image` to the allowlist (unless explicitly denied), which often turns “empty” into effectively “image-only”.
|
||||
- If you want “allow everything” semantics in sandbox policy, prefer `["*"]` over `[]` to avoid the image auto-add corner case.
|
||||
|
||||
This is why Studio treats some configs as “broken” and repairs them (see below).
|
||||
|
||||
### Policy Layering (Why “Allowed” Can Still Be Blocked)
|
||||
|
||||
In a sandboxed session, a tool must pass multiple gates:
|
||||
- The normal tool policy gates (`tools.profile`, `tools.allow|alsoAllow`, `tools.deny`, plus any provider/group/subagent policies upstream applies).
|
||||
- The sandbox tool policy gate (`tools.sandbox.tools.allow|deny` resolved for that agent).
|
||||
|
||||
So even if Studio enables `group:runtime` for an agent, the tool can still be blocked in sandboxed sessions if sandbox tool policy denies it.
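The layering can be pictured as two independent checks that must both pass. This sketch is deliberately simplified and makes several assumptions: tool groups and profiles are already expanded to plain tool names, a missing `allow` list means "no allowlist restriction", and `"*"` matches everything. The `ToolPolicy` type and both function names are hypothetical.

```typescript
interface ToolPolicy {
  allow?: string[];
  deny?: string[];
}

// One gate: deny wins, then the allowlist (if any) must match.
function passesGate(policy: ToolPolicy, tool: string): boolean {
  if (policy.deny?.includes(tool)) return false;
  if (policy.allow === undefined) return true; // assumption: no allowlist configured
  return policy.allow.includes("*") || policy.allow.includes(tool);
}

// In a sandboxed session a tool must pass both the normal and the sandbox gate.
function allowedInSandbox(normal: ToolPolicy, sandbox: ToolPolicy, tool: string): boolean {
  return passesGate(normal, tool) && passesGate(sandbox, tool);
}

const normalPolicy: ToolPolicy = { allow: ["exec", "process", "read"] };
const sandboxPolicy: ToolPolicy = { deny: ["exec"] };
console.log(allowedInSandbox(normalPolicy, sandboxPolicy, "exec"));
console.log(allowedInSandbox(normalPolicy, sandboxPolicy, "read"));
```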
|
||||
|
||||
## OpenClaw (Upstream): Tool Availability and `workspaceAccess=ro`
|
||||
|
||||
PI tool construction:
|
||||
- `openclaw/src/agents/pi-tools.ts` (`createOpenClawCodingTools`)
|
||||
|
||||
Key enforcement:
|
||||
- When sandboxed, upstream removes the normal host `write`/`edit` tools.
|
||||
- It only adds sandboxed `write`/`edit` tools if `workspaceAccess !== "ro"`.
|
||||
- It disables `apply_patch` in sandbox when `workspaceAccess === "ro"`.
|
||||
|
||||
This is why “`workspaceAccess=ro`” means more than “mount it read-only”:
|
||||
- It is also a tool-policy gate that prevents direct file writes/edits through PI tools.
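The three rules above can be summarized in a small function. It is only a sketch of the described outcome, not the upstream `createOpenClawCodingTools`: the function name is hypothetical, and it ignores every other gate (tool profiles, allow/deny lists, sandbox tool policy) that can remove tools as well.

```typescript
type WorkspaceAccess = "none" | "ro" | "rw";

// Which PI filesystem tools remain available, considering only the
// sandbox + workspaceAccess rules described above.
function availableFileTools(sandboxed: boolean, access: WorkspaceAccess): string[] {
  const tools = ["read"];
  if (!sandboxed || access !== "ro") {
    tools.push("write", "edit", "apply_patch");
  }
  return tools;
}

console.log(availableFileTools(true, "ro"));
console.log(availableFileTools(true, "none"));
```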
|
||||
|
||||
### Studio Note: Authority Is No Longer Compiled During Create
|
||||
|
||||
Studio create flow no longer compiles authority/sandbox settings during initial create.
|
||||
|
||||
When capabilities are changed post-create, Studio uses:
|
||||
- `src/features/agents/operations/agentPermissionsOperation.ts` (`updateAgentPermissionsViaStudio`)
|
||||
|
||||
That operation updates:
|
||||
- exec approvals policy (`exec.approvals.set`)
|
||||
- per-agent tool overrides (`config.patch` via `updateGatewayAgentOverrides`)
|
||||
- session exec host/security/ask (`sessions.patch`)
|
||||
|
||||
Upstream enforcement is unchanged: `workspaceAccess="ro"` still disables PI `write`/`edit`/`apply_patch` in sandboxed sessions.
|
||||
|
||||
## Session-Level Exec Settings (Where `exec` Runs)
|
||||
|
||||
Separately from per-agent config and exec approvals, OpenClaw supports per-session exec settings:
|
||||
- `execHost`: `sandbox | gateway | node`
|
||||
- `execSecurity`: `deny | allowlist | full`
|
||||
- `execAsk`: `off | on-miss | always`
|
||||
|
||||
These are stored in the gateway session store and mutated with `sessions.patch`:
|
||||
- Upstream method: `openclaw/src/gateway/server-methods/sessions.ts` (`"sessions.patch"`)
|
||||
- Patch application: `openclaw/src/gateway/sessions-patch.ts`
|
||||
- Session entry shape includes `execHost|execSecurity|execAsk`: `openclaw/src/config/sessions/types.ts`
|
||||
|
||||
Studio uses these fields to keep “what the UI expects” aligned with gateway runtime:
|
||||
- Hydration derives the expected values using the exec approvals policy plus sandbox mode:
  - `src/features/agents/operations/agentFleetHydrationDerivation.ts`
  - Special case: if `sandbox.mode === "all"` and there are exec overrides, Studio forces `execHost = "sandbox"` to avoid accidentally running on the host.
- On first send (or when out of sync), Studio patches the session:
  - `src/features/agents/operations/chatSendOperation.ts` calls `syncGatewaySessionSettings(...)`
  - Transport: `src/lib/gateway/GatewayClient.ts` (`sessions.patch`)
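The special case in the hydration step can be sketched like this. The types mirror the session fields listed above, but `deriveExpectedExecSettings` is a hypothetical name and the real derivation in `agentFleetHydrationDerivation.ts` takes more inputs than shown here.

```typescript
type SandboxMode = "off" | "non-main" | "all";

interface ExecSettings {
  execHost: "sandbox" | "gateway" | "node";
  execSecurity: "deny" | "allowlist" | "full";
  execAsk: "off" | "on-miss" | "always";
}

// Returns the session exec settings Studio would expect, or null when the
// agent has no exec overrides at all.
function deriveExpectedExecSettings(
  sandboxMode: SandboxMode,
  overrides: ExecSettings | null,
): ExecSettings | null {
  if (overrides === null) return null;
  // Fully sandboxed agents never execute on the host.
  if (sandboxMode === "all") return { ...overrides, execHost: "sandbox" };
  return overrides;
}

const expected = deriveExpectedExecSettings("all", {
  execHost: "gateway",
  execSecurity: "full",
  execAsk: "off",
});
console.log(expected);
```

Studio would then compare the expected values with what the gateway session currently reports and send `sessions.patch` only when they differ.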
|
||||
|
||||
Net effect:
|
||||
- Exec approvals policy controls whether the user will be prompted to approve.
|
||||
- Session exec settings control where execution happens (sandbox vs host) and the default `security/ask` values for runs.
|
||||
|
||||
## OpenClaw (Upstream): Exec Approvals (Policy + Events)
|
||||
|
||||
Exec approvals file (defaults upstream):
|
||||
- `openclaw/src/infra/exec-approvals.ts`
  - default file path: `~/.openclaw/exec-approvals.json`
  - default socket path: `~/.openclaw/exec-approvals.sock`
|
||||
|
||||
Gateway methods (persist policy):
|
||||
- `openclaw/src/gateway/server-methods/exec-approvals.ts`
  - `exec.approvals.get` returns `{ path, exists, hash, file }` (socket token is redacted in responses)
  - `exec.approvals.set` requires a matching `baseHash` when the file already exists (prevents lost updates)
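The `baseHash` requirement is a form of optimistic concurrency: read, modify, then write back only if nobody else wrote in between. The sketch below demonstrates the pattern against an in-memory stand-in. Everything in it is hypothetical (`FakeApprovalsGateway`, `upsertAgentPolicy`, the version-based hash, the policy field values); the real gateway computes its own hash and Studio's real helper is `upsertGatewayAgentExecApprovals`.

```typescript
// Illustrative policy shape; field values are assumptions.
interface AgentExecPolicy { security: string; ask: string; allowlist: string[] }
interface ApprovalsFile { agents: Record<string, AgentExecPolicy> }
interface ApprovalsSnapshot { exists: boolean; hash: string; file: ApprovalsFile }

function clone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// In-memory stand-in for exec.approvals.get / exec.approvals.set.
class FakeApprovalsGateway {
  private file: ApprovalsFile = { agents: {} };
  private exists = false;
  private version = 0;

  get(): ApprovalsSnapshot {
    return { exists: this.exists, hash: "v" + this.version, file: clone(this.file) };
  }

  set(file: ApprovalsFile, baseHash?: string): void {
    // Once the file exists, a write must name the version it was based on.
    if (this.exists && baseHash !== "v" + this.version) {
      throw new Error("baseHash mismatch: re-read and retry");
    }
    this.file = clone(file);
    this.exists = true;
    this.version += 1;
  }
}

// Read-modify-write for a single agent's policy.
function upsertAgentPolicy(gw: FakeApprovalsGateway, agentId: string, policy: AgentExecPolicy): void {
  const snapshot = gw.get();
  const next: ApprovalsFile = { agents: { ...snapshot.file.agents, [agentId]: policy } };
  gw.set(next, snapshot.exists ? snapshot.hash : undefined);
}

const gateway = new FakeApprovalsGateway();
upsertAgentPolicy(gateway, "agent-1", { security: "allowlist", ask: "on-miss", allowlist: [] });
console.log(gateway.get().hash);
```

A caller that hits the mismatch error should fetch a fresh snapshot and reapply its change rather than resend the stale file.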
|
||||
|
||||
Approval request/resolve + broadcast events:
|
||||
- `openclaw/src/gateway/server-methods/exec-approval.ts`
  - broadcasts `exec.approval.requested`
  - broadcasts `exec.approval.resolved`
|
||||
|
||||
Exec tool approval decision logic:
|
||||
- `openclaw/src/agents/bash-tools.exec.ts` (calls `requiresExecApproval`, `evaluateShellAllowlist`, etc.)
|
||||
|
||||
Studio wiring for policy persistence:
|
||||
- Studio writes per-agent policy with `exec.approvals.set`:
  - `src/lib/gateway/execApprovals.ts` (`upsertGatewayAgentExecApprovals`)
|
||||
|
||||
Studio wiring for UX:
|
||||
- Studio listens to `exec.approval.requested` and `exec.approval.resolved` and renders in-chat approval cards.
|
||||
- When the user clicks approve/deny, Studio calls `exec.approval.resolve`.
|
||||
|
||||
## Debug Checklist (When Something Feels “Wrong”)
|
||||
|
||||
1. Determine if the session is sandboxed and what workspace it is using.
   - Upstream CLI helper: `openclaw sandbox explain --agent <agentId>` (see upstream `src/commands/sandbox-explain.ts`)
2. Confirm what Studio wrote:
   - Agent overrides: `config.get` and inspect `agents.list[]` entry for the agent.
   - Exec approvals: `exec.approvals.get` and inspect `file.agents[agentId]`.
3. If file edits are not happening:
   - Check `sandbox.workspaceAccess` (if `ro`, upstream disables write/edit/apply_patch tools in sandbox).
   - Check tool policy (`tools.profile`, `tools.alsoAllow`, `tools.deny`) for explicit denies on `write`/`edit`/`apply_patch`.
4. If approvals are not showing up:
   - Check exec approvals `security` + `ask`.
   - Check allowlist patterns (a match may suppress prompts when `ask=on-miss`).
5. If the agent can see different files than expected:
   - `workspaceAccess=rw` means “tools operate on the agent workspace”.
   - `workspaceAccess=ro|none` means “tools operate on a sandbox workspace”.
   - `/agent` mount exists only for `workspaceAccess=ro` and is accessible via sandbox exec, not via filesystem tools.
|
||||
|
||||
## Studio Post-Create “Permissions” Flows (Not Just Creation)
|
||||
|
||||
Studio can also change permissions after an agent exists.
|
||||
|
||||
### Capabilities Permissions Updates
|
||||
|
||||
Studio’s permissions flow applies coordinated changes from one save action:
|
||||
- Exec approvals policy (per-agent, persisted in exec approvals file)
|
||||
- Tool allow/deny for runtime/web/fs groups (`group:runtime`, `group:web`, `group:fs`) in agent config
|
||||
- Session exec settings (`execHost|execSecurity|execAsk`) via `sessions.patch`
|
||||
|
||||
Code:
|
||||
- `src/features/agents/operations/agentPermissionsOperation.ts` (`updateAgentPermissionsViaStudio`)
|
||||
|
||||
UI model:
|
||||
- Direct controls: `Command mode` (`Off`/`Ask`/`Auto`), `Web access` (`Off`/`On`), `File tools` (`Off`/`On`)
|
||||
- Create modal remains permission-light (name/avatar only) and create flow immediately applies permissive defaults (`Auto`, web on, file tools on).
|
||||
|
||||
Why it matters:
|
||||
- You can have exec approvals configured but still be unable to run commands if `group:runtime` is denied.
|
||||
- You can have permissive approvals but still be safe if `execHost` is forced to `sandbox` when sandboxing is enabled.
|
||||
|
||||
### One-Shot Sandbox Tool Policy Repair
|
||||
|
||||
On connect, Studio scans the gateway config for agents that are sandboxed (`sandbox.mode === "all"`) and have an explicitly empty sandbox allowlist (`tools.sandbox.tools.allow = []`), and repairs those entries by setting:
|
||||
- `agents.list[].tools.sandbox.tools.allow = ["*"]`
|
||||
|
||||
Code:
|
||||
- Detection + repair enqueue: `src/app/page.tsx` (`repair-sandbox-tool-allowlist`)
|
||||
- Gateway write: `src/lib/gateway/agentConfig.ts` (`updateGatewayAgentOverrides`)
|
||||
|
||||
This exists to prevent sandboxed sessions from effectively losing access to almost all sandbox tools due to an empty allowlist interacting with upstream sandbox tool-policy behavior.
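The detection rule described in this section can be sketched as a pure function over the `agents.list[]` entries. The `AgentEntry` type is a trimmed, hypothetical view of the config and `findAgentsNeedingRepair` is not a Studio function; the real scan and the follow-up write live in the files listed above.

```typescript
// Trimmed, hypothetical view of an agents.list[] entry.
interface AgentEntry {
  id: string;
  sandbox?: { mode?: string };
  tools?: { sandbox?: { tools?: { allow?: string[]; deny?: string[] } } };
}

// An agent needs repair when it is always sandboxed and its sandbox
// allowlist is present but explicitly empty.
function needsAllowlistRepair(agent: AgentEntry): boolean {
  const allow = agent.tools?.sandbox?.tools?.allow;
  return agent.sandbox?.mode === "all" && Array.isArray(allow) && allow.length === 0;
}

function findAgentsNeedingRepair(agents: AgentEntry[]): string[] {
  return agents.filter(needsAllowlistRepair).map((agent) => agent.id);
}

const repairTargets = findAgentsNeedingRepair([
  { id: "broken", sandbox: { mode: "all" }, tools: { sandbox: { tools: { allow: [] } } } },
  { id: "wildcard", sandbox: { mode: "all" }, tools: { sandbox: { tools: { allow: ["*"] } } } },
  { id: "unset", sandbox: { mode: "all" } },
  { id: "unsandboxed", sandbox: { mode: "off" }, tools: { sandbox: { tools: { allow: [] } } } },
]);
console.log(repairTargets);
```

Each returned id would then receive `tools.sandbox.tools.allow = ["*"]` through the gateway write path.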
|
||||
@@ -0,0 +1,373 @@
|
||||
# PI + Chat Streaming (Studio Side)
|
||||
|
||||
This document exists to onboard coding agents quickly when debugging chat issues in Claw3D.
|
||||
|
||||
Scope:
|
||||
- Describes how Studio connects to the OpenClaw Gateway, how runtime streaming arrives over WebSockets, and how the UI renders it.
|
||||
- Treats **PI** as “the coding agent running behind the Gateway” (an OpenClaw agent). Studio does not implement PI logic; it displays and controls the Gateway session.
|
||||
|
||||
Non-scope:
|
||||
- PI internals and model/tool execution details. Those live in the OpenClaw repository and the Gateway implementation.
|
||||
|
||||
## Key Files (Start Here)
|
||||
|
||||
- Studio server entry + upgrade wiring: `server/index.js`
|
||||
- Browser WS bridge to upstream gateway: `server/gateway-proxy.js`
|
||||
- Browser WS URL (always same-origin `/api/gateway/ws`): `src/lib/gateway/proxy-url.ts`
|
||||
- Browser gateway protocol client (vendored): `src/lib/gateway/openclaw/GatewayBrowserClient.ts`
|
||||
- Studio gateway wrapper + connect policy: `src/lib/gateway/GatewayClient.ts`
|
||||
- Runtime stream classification and merge helpers: `src/features/agents/state/runtimeEventBridge.ts`
|
||||
- Runtime event executor (streaming -> state -> transcript lines): `src/features/agents/state/gatewayRuntimeEventHandler.ts`
|
||||
- Chat rendering: `src/features/agents/components/AgentChatPanel.tsx`, `src/features/agents/components/chatItems.ts`
|
||||
- Message parsing (text/thinking/tool markers): `src/lib/text/message-extract.ts`
|
||||
- History sync + transcript merge: `src/features/agents/operations/historySyncOperation.ts`, `src/features/agents/state/transcript.ts`
|
||||
|
||||
## Relationship To OpenClaw (What’s Vendored Here)
|
||||
|
||||
Studio vendors the browser Gateway client used to speak the Gateway protocol:
|
||||
- Vendored client: `src/lib/gateway/openclaw/GatewayBrowserClient.ts`
|
||||
- Sync script: `scripts/sync-openclaw-gateway-client.ts`
|
||||
- Sync source: provide an explicit local source path to the sync script via CLI arg or env var.
|
||||
|
||||
Important:
|
||||
- Studio does not currently auto-sync `GatewayBrowserClient.ts` from a fixed maintainer-local checkout path.
|
||||
- If protocol mismatch is suspected, first verify the sync source file and the upstream Gateway runtime/protocol files are aligned.
|
||||
|
||||
Typical symptoms of a protocol mismatch are missing event fields, renamed streams, or different error codes. When you see them, check whether Studio’s vendored client is in sync with the Gateway version you are running.
|
||||
|
||||
## Upstream Source Of Truth (OpenClaw)
|
||||
|
||||
For chat streaming behavior, these upstream files are authoritative:
|
||||
- `src/gateway/protocol/schema/logs-chat.ts` in your OpenClaw checkout (`chat.send`, `chat.history`, and chat event schema)
|
||||
- `src/gateway/server-methods/chat.ts` in your OpenClaw checkout (`chat.send` ack + idempotency, `chat.history` payload shaping/sanitization)
|
||||
- `src/gateway/server-chat.ts` in your OpenClaw checkout (`agent` event fanout and synthetic `chat` delta/final bridging)
|
||||
- `src/agents/pi-embedded-subscribe.ts` and related handlers in your OpenClaw checkout (`assistant`/`tool`/`lifecycle` stream emission)
|
||||
|
||||
When updating this doc, verify behavior against those files, not assumptions.
|
||||
|
||||
## Terminology
|
||||
|
||||
- Studio: this repo, a Next.js UI with a custom Node server.
|
||||
- Gateway (upstream): the OpenClaw Gateway WebSocket server (default `ws://localhost:18789`).
|
||||
- WS bridge / proxy: Studio’s server-side WebSocket that bridges the browser to the upstream Gateway.
|
||||
- Frame: JSON message over WebSocket (request/response/event).
|
||||
- Run: a single streamed execution identified by `runId`.
|
||||
- Session: identified by `sessionKey` (Studio uses `agent:<agentId>:<mainKey>` for main sessions).
|
||||
|
||||
## High-Level Network Path
|
||||
|
||||
There are two separate WebSocket hops, plus a protocol-level `connect` request:
|
||||
|
||||
```mermaid
sequenceDiagram
  participant B as Browser (Studio UI)
  participant S as Studio server (WS proxy)
  participant G as OpenClaw Gateway (upstream)

  B->>S: WS connect /api/gateway/ws
  B->>S: req(connect) (Gateway protocol frame)
  S->>G: WS connect upstream (url from settings.json)
  S->>G: req(connect) (injects token if missing)
  G-->>S: res(connect)
  S-->>B: res(connect)
  G-->>S: event(chat/agent/presence/heartbeat)
  S-->>B: event(...)
```
|
||||
|
||||
Files:
|
||||
- WS proxy entrypoint: `server/index.js`
|
||||
- WS proxy implementation: `server/gateway-proxy.js`
|
||||
|
||||
Notes:
|
||||
- The browser never opens a WebSocket directly to the upstream Gateway URL. The browser always speaks to the Studio same-origin bridge at `/api/gateway/ws` (computed by `src/lib/gateway/proxy-url.ts`).
|
||||
- The “upstream gateway URL” shown in Studio settings is used by the Studio server (the proxy) to open the upstream connection.
|
||||
|
||||
## End-To-End Flow (PI Run -> UI)
|
||||
|
||||
This is the “happy path” you want in your head when debugging:
|
||||
|
||||
1. User types in the chat composer and hits Send (`src/features/agents/components/AgentChatPanel.tsx`).
|
||||
2. Studio calls `chat.send` with `sessionKey` and `idempotencyKey = runId` (`src/features/agents/operations/chatSendOperation.ts`).
|
||||
3. Gateway runs the agent (PI) for that session.
|
||||
4. While the run is executing, the Gateway may stream:
   - `event: "agent"` frames for live partial output (`stream: "assistant"`), live thinking (`reason*`/`think*` streams), tool calls/results (`stream: "tool"`), and lifecycle (`stream: "lifecycle"`).
   - `event: "chat"` frames for the chat message stream (`state: "delta" | "final" | ...`).
   - Both streams can describe the same run progression from different layers (`agent` stream events and `chat` message events), so Studio must merge idempotently.
5. Studio merges those events into:
   - live fields (`streamText`, `thinkingTrace`) via batched `queueLivePatch` (fast UI updates without committing to the transcript yet)
   - committed transcript lines (`outputLines`) via `appendOutput` (final messages, tool lines, meta/timestamp, thinking trace)
6. The chat panel renders:
   - historical transcript from `outputLines`
   - an extra “live assistant” card at the bottom built from `streamText` + `thinkingTrace` while `status === "running"`.
|
||||
|
||||
The key wiring is in:
|
||||
- Event subscription + dispatch: `src/app/page.tsx`
|
||||
- Runtime event handler: `src/features/agents/state/gatewayRuntimeEventHandler.ts`
|
||||
- Store reducer: `src/features/agents/state/store.tsx`
|
||||
|
||||
## Studio Settings (Where Gateway URL/Token Come From)
|
||||
|
||||
Studio persists Gateway connection settings on the Studio host (not in browser persistent storage). The UI still loads them into browser memory at runtime:
|
||||
- `~/.openclaw/claw3d/settings.json` (see `README.md` for the canonical location)
|
||||
|
||||
The WS proxy loads these settings server-side and opens the upstream connection.
|
||||
|
||||
Files:
|
||||
- Settings file access (WS proxy): `server/studio-settings.js`
|
||||
- Settings API route (browser -> server): `src/app/api/studio/route.ts`
|
||||
- Client-side load/patch coordinator: `src/lib/studio/coordinator.ts`
|
||||
- Settings storage + fallback behavior used by `/api/studio`: `src/lib/studio/settings-store.ts`
|
||||
|
||||
Connection note:
|
||||
- In the browser, `useGatewayConnection()` stores the upstream URL/token in memory (loaded from `/api/studio`) but connects the WebSocket to Studio via `resolveStudioProxyGatewayUrl()`; the upstream URL is passed as `authScopeKey` (not as the WebSocket URL). See `src/lib/gateway/GatewayClient.ts`.
|
||||
|
||||
Token resolution note:
|
||||
- The Studio server resolves the upstream token from `claw3d/settings.json`. If the token is missing there, it may fall back to the local OpenClaw config in `openclaw.json` (token + port). This fallback exists in both the WS proxy path (`server/studio-settings.js`) and the `/api/studio` storage layer (`src/lib/studio/settings-store.ts`); keep the two implementations consistent.
|
||||
- During `connect`, the WS proxy forwards browser-provided auth (`params.auth.token` or `params.device.signature`) as-is. It injects the host-resolved token only when browser auth is absent. `studio.gateway_token_missing` is returned only when neither browser auth nor host token is available.
|
||||
|
||||
## WebSocket Frame Shapes
|
||||
|
||||
Studio expects Gateway frames shaped like:
|
||||
|
||||
```json
{ "type": "req", "id": "uuid", "method": "connect", "params": { } }
{ "type": "res", "id": "uuid", "ok": true, "payload": { } }
{ "type": "res", "id": "uuid", "ok": false, "error": { "code": "…", "message": "…" } }
{ "type": "event", "event": "chat", "payload": { } }
```
|
||||
|
||||
Types live in:
|
||||
- `src/lib/gateway/GatewayClient.ts`
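For orientation, the frame shapes above map naturally onto a discriminated union. The types and the `parseFrame` helper below are a sketch written for this document, not the definitions in `GatewayClient.ts`; field optionality in particular is an assumption.

```typescript
type RequestFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResponseFrame =
  | { type: "res"; id: string; ok: true; payload?: unknown }
  | { type: "res"; id: string; ok: false; error: { code: string; message: string } };
type EventFrame = { type: "event"; event: string; payload?: unknown };
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Parse one WebSocket text message; returns null for anything unrecognized.
function parseFrame(raw: string): Frame | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const fields = value as Record<string, unknown>;
  switch (fields.type) {
    case "req":
      return typeof fields.id === "string" && typeof fields.method === "string"
        ? (value as Frame)
        : null;
    case "res":
      return typeof fields.id === "string" && typeof fields.ok === "boolean"
        ? (value as Frame)
        : null;
    case "event":
      return typeof fields.event === "string" ? (value as Frame) : null;
    default:
      return null;
  }
}

const frame = parseFrame('{ "type": "event", "event": "chat", "payload": {} }');
if (frame?.type === "event") console.log(frame.event);
```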
|
||||
|
||||
### Connect handshake
|
||||
|
||||
The first *protocol frame* from the browser must be `req(connect)`. The WS proxy:
|
||||
- Rejects non-`connect` frames until connected.
|
||||
- Opens an upstream WS to the configured Gateway URL.
|
||||
- Injects `auth.token` into the connect params only when the connect frame contains neither a token nor a device signature.
|
||||
- Returns `studio.gateway_token_missing` only when no browser auth is present and no host token can be resolved.
|
||||
- Sets an `Origin` header for the upstream WebSocket derived from the upstream URL (and normalizes loopback hostnames to `localhost`).
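The token injection rules can be sketched as a pure function. The `ConnectParams` type and `resolveConnectParams` name are hypothetical and cover only the auth fields mentioned above; the actual logic lives in `server/gateway-proxy.js`.

```typescript
interface ConnectParams {
  auth?: { token?: string };
  device?: { signature?: string };
}

type ConnectResolution =
  | { ok: true; params: ConnectParams }
  | { ok: false; code: "studio.gateway_token_missing" };

// Forward browser auth untouched; inject the host token only when the
// browser supplied neither a token nor a device signature.
function resolveConnectParams(params: ConnectParams, hostToken: string | null): ConnectResolution {
  const hasBrowserAuth = Boolean(params.auth?.token) || Boolean(params.device?.signature);
  if (hasBrowserAuth) return { ok: true, params };
  if (!hostToken) return { ok: false, code: "studio.gateway_token_missing" };
  return { ok: true, params: { ...params, auth: { ...params.auth, token: hostToken } } };
}

console.log(resolveConnectParams({}, "host-token"));
console.log(resolveConnectParams({}, null));
```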
|
||||
|
||||
Code:
|
||||
- Connect enforcement + token injection: `server/gateway-proxy.js`
|
||||
|
||||
### Connect failures
|
||||
|
||||
On failure to load settings or open upstream, the proxy sends an error `res` for the connect request (when possible) and then closes the WS.
|
||||
|
||||
Important detail (how errors become actionable in the UI):
|
||||
- The browser-side Gateway client (`src/lib/gateway/openclaw/GatewayBrowserClient.ts`) closes the WebSocket with close code `4008` and a reason like `connect failed: <CODE> <MESSAGE>` after it receives a failed `res(connect)`. `GatewayClient.connect()` parses that close into `GatewayResponseError(code, message)` for UI retry policy and user-facing errors.
|
||||
- Separately, the proxy may also close with `1011` / `connect failed`; the “connect failed: …” close reason that the UI parses is produced by the browser client, not the proxy.
|
||||
- WebSocket close reasons are truncated to 123 UTF-8 bytes in the browser client to avoid protocol errors on long messages.
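Both halves of this contract, producing a close reason that fits in 123 UTF-8 bytes and parsing it back into a code and message, can be sketched as below. The helper names are hypothetical and the parsing regex is an assumption based on the `connect failed: <CODE> <MESSAGE>` format described above; the vendored client and `GatewayClient.connect()` have their own implementations.

```typescript
const MAX_CLOSE_REASON_BYTES = 123;

// Truncate to at most 123 UTF-8 bytes without splitting a multi-byte character.
function truncateCloseReason(reason: string): string {
  const bytes = new TextEncoder().encode(reason);
  if (bytes.length <= MAX_CLOSE_REASON_BYTES) return reason;
  let end = MAX_CLOSE_REASON_BYTES;
  // If the first dropped byte is a continuation byte (10xxxxxx), back up to
  // the start of that character so it is removed entirely.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end -= 1;
  return new TextDecoder().decode(bytes.slice(0, end));
}

// Parse "connect failed: <CODE> <MESSAGE>" into its parts.
function parseConnectFailure(reason: string): { code: string; message: string } | null {
  const match = /^connect failed: (\S+)(?: ([\s\S]*))?$/.exec(reason);
  if (!match) return null;
  return { code: match[1], message: match[2] ?? "" };
}

const reason = truncateCloseReason("connect failed: studio.gateway_token_missing no token configured");
console.log(parseConnectFailure(reason));
```

A bare `connect failed` reason, as the proxy may send with `1011`, carries no code and parses to `null` in this sketch.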

Error codes used by the proxy include:
- `studio.gateway_url_missing`
- `studio.gateway_token_missing`
- `studio.gateway_url_invalid`
- `studio.settings_load_failed`
- `studio.upstream_error`
- `studio.upstream_closed`

## Reconnects And Retries

There are two layers of retry behavior:

- Transport reconnect (after a successful hello): the vendored browser client reconnects the browser->Studio WebSocket with backoff when it closes, and continues emitting events after reconnect. See `src/lib/gateway/openclaw/GatewayBrowserClient.ts`.
- Initial connect failure retry: when the initial `connect` handshake fails (for example bad token), `GatewayClient.connect()` tears down the vendored client and returns a rejected promise; `useGatewayConnection()` may schedule a limited re-attempt unless the error code is known non-retryable. See `resolveGatewayAutoRetryDelayMs` in `src/lib/gateway/GatewayClient.ts`.
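
To make the second layer concrete, the sketch below shows one way a limited, code-aware retry delay could be computed. It is not the body of `resolveGatewayAutoRetryDelayMs`: the set of non-retryable codes, the attempt cap, and the exponential backoff are all assumptions chosen for illustration.

```typescript
// Assumed set: the text only says some error codes are known non-retryable.
const ASSUMED_NON_RETRYABLE_CODES = [
  "studio.gateway_url_missing",
  "studio.gateway_url_invalid",
  "studio.gateway_token_missing",
];

// Returns a delay in milliseconds, or null when no further attempt should be made.
function sketchRetryDelayMs(
  errorCode: string | null,
  attempt: number, // zero-based count of attempts already made
  maxAttempts: number = 3, // assumed cap ("limited re-attempt")
  baseDelayMs: number = 1000 // assumed base delay
): number | null {
  if (attempt >= maxAttempts) return null;
  if (errorCode !== null && ASSUMED_NON_RETRYABLE_CODES.indexOf(errorCode) !== -1) {
    return null; // retrying cannot fix a configuration error
  }
  return baseDelayMs * Math.pow(2, attempt);
}
```

Returning `null` rather than a sentinel number keeps the "do not retry" case explicit at the call site.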

## Studio Access Gate

When Studio is bound to a public host, `STUDIO_ACCESS_TOKEN` is required. For loopback-only binds, it remains optional. When enabled, Studio enforces a simple access gate:
- HTTP: blocks `/api/*` routes unless the correct `studio_access` cookie is present.
- WebSocket: blocks `/api/gateway/ws` upgrades unless the cookie is present.
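
A minimal sketch of such a gate is shown below. It assumes the `studio_access` cookie carries the token value verbatim, which this document does not confirm; `parseCookies` and `isRequestAllowed` are hypothetical names, and `server/access-gate.js` may validate the cookie differently.

```typescript
// Parse a raw Cookie header into a name -> value record.
function parseCookies(header: string | undefined): Record<string, string> {
  const cookies: Record<string, string> = {};
  if (!header) return cookies;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    cookies[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return cookies;
}

// Decide whether an HTTP request or WS upgrade may proceed.
function isRequestAllowed(
  pathname: string,
  cookieHeader: string | undefined,
  accessToken: string | undefined
): boolean {
  if (!accessToken) return true; // gate disabled (no STUDIO_ACCESS_TOKEN)
  if (pathname.indexOf("/api/") !== 0) return true; // only /api/* is gated, which covers /api/gateway/ws
  // Assumption: the cookie value equals the token itself.
  return parseCookies(cookieHeader)["studio_access"] === accessToken;
}
```

Because WebSocket upgrades arrive as HTTP requests with a `Cookie` header, the same check can serve both the HTTP and the WS path.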

Files:
- Gate implementation: `server/access-gate.js`
- Gate integration for WS upgrades: `server/index.js`

## Streaming: What the Gateway Sends and How Studio Uses It

Studio classifies gateway events by `event` name:
- `presence`, `heartbeat`: summary refresh triggers
- `chat`: runtime chat messages (delta/final)
- `agent`: runtime per-stream deltas (assistant/thinking/tool/lifecycle)
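
The classification above amounts to a switch on the event name. The sketch below is illustrative only: the kind labels and the `classifyEvent` name are invented for the example and are not taken from `runtimeEventBridge.ts`.

```typescript
// Hypothetical labels for the three categories listed above.
type GatewayEventKind = "summary-refresh" | "runtime-chat" | "runtime-agent" | "ignore";

function classifyEvent(eventName: string): GatewayEventKind {
  switch (eventName) {
    case "presence":
    case "heartbeat":
      return "summary-refresh";
    case "chat":
      return "runtime-chat";
    case "agent":
      return "runtime-agent";
    default:
      return "ignore"; // assumed: events outside the list are not routed here
  }
}
```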

Code:
- Classification: `src/features/agents/state/runtimeEventBridge.ts`
- Execution: `src/features/agents/state/gatewayRuntimeEventHandler.ts`

## Live Fields vs Committed Transcript (Why Streaming Can “Look Weird”)

Studio intentionally separates:
- Live streaming UI: `AgentState.streamText` and `AgentState.thinkingTrace` are updated via `queueLivePatch`, which batches patches and coalesces multiple deltas before they hit React state (`src/app/page.tsx`).
- Committed transcript: `AgentState.outputLines` is appended via `appendOutput`. These are the lines that become the durable on-screen transcript and are later merged with `chat.history` results (`src/features/agents/state/store.tsx`).

This split is why you can see:
- “live” assistant output update rapidly at the bottom card during a run
- then a finalized assistant message (plus tool lines / thinking trace / meta timestamp) appear in the transcript on `final`

### `event: "chat"` payload

Studio treats `chat` events as the canonical “message” stream for transcript completion. Expected fields:
- `runId`
- `sessionKey`
- `state`: `delta | final | aborted | error`
- `message` (shape varies; Studio extracts text/thinking/tool metadata defensively)

Key behaviors (Studio-side):
- Ignores user/system roles for transcript append (but uses them for status/summary).
- User messages shown in the transcript are primarily from local optimistic send and from `chat.history` sync (not from runtime `chat` user-role events).
- On `final`, appends:
  - a `[[meta]]{...}` line (timestamp and thinking duration when available)
  - a `[[trace]]` thinking block when extracted
  - tool call/result markdown lines when present
  - the assistant text (if any)
- If a `final` assistant message arrives without an extractable thinking trace, Studio may request `chat.history` as recovery.
- `chat.send` is idempotency-keyed upstream and returns a started ack before async completion; this is why history reconciliation can race with runtime events and must be idempotent.

### `event: "agent"` payload

Studio uses `agent` events for live streaming and richer tool/lifecycle updates. Expected fields:
- `runId`
- `stream`: `assistant | tool | lifecycle | <reasoning stream>`
- `data`: record with `text`/`delta` and stream-specific keys

Stream handling (high-level):
- `assistant`: merges `data.delta` into a live `streamText` for the UI.
- reasoning stream (anything that is not `assistant`, `tool`, `lifecycle` and matches hints like `reason`/`think`/`analysis`/`trace`): merged into `thinkingTrace`.
- `tool`: formats tool call and tool result lines using `[[tool]]` and `[[tool-result]]`.
- `lifecycle`: start/end/error transitions; if a run reaches `end` without chat final events, Studio may flush the last streamed assistant text as a fallback final transcript entry.
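
The routing rules above can be sketched as follows. This is a simplified illustration, assuming case-insensitive substring matching on the hint words and that unrecognized streams are ignored; the target labels and `routeAgentStream` are hypothetical and the handler in `gatewayRuntimeEventHandler.ts` may apply additional rules.

```typescript
// Hint words listed in the text above.
const REASONING_HINTS = ["reason", "think", "analysis", "trace"];

// Hypothetical labels for where a stream's data ends up.
type StreamTarget = "streamText" | "thinkingTrace" | "toolLines" | "lifecycle" | "ignored";

function routeAgentStream(stream: string): StreamTarget {
  const name = stream.trim().toLowerCase();
  // Named streams are checked first so they never fall into the reasoning bucket.
  if (name === "assistant") return "streamText";
  if (name === "tool") return "toolLines";
  if (name === "lifecycle") return "lifecycle";
  // Anything else counts as reasoning only if it matches one of the hints.
  const looksLikeReasoning = REASONING_HINTS.some((hint) => name.indexOf(hint) !== -1);
  return looksLikeReasoning ? "thinkingTrace" : "ignored";
}
```

Matching by hint rather than by an exact stream name lets differently named reasoning streams be handled without a hard-coded list.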

Code:
- Runtime agent stream merge + append: `src/features/agents/state/gatewayRuntimeEventHandler.ts`

## How Chat UI Renders Streaming

Studio keeps an `outputLines: string[]` transcript per agent, plus live fields like `streamText` and `thinkingTrace`.

Rendering pipeline:
- `outputLines` contains:
  - user messages as `> ...`
  - assistant messages as raw markdown text
  - tool call/results with prefixes `[[tool]]` and `[[tool-result]]`
  - optional meta lines `[[meta]]{...}` for timestamps and thinking durations
  - optional thinking trace lines `[[trace]] ...`
- The panel derives structured chat items from `outputLines` and (optionally) live streaming state.
- UI toggles that change rendering:
  - `showThinkingTraces`: hides/shows `[[trace]]` thinking entries.
  - `toolCallingEnabled`: when off, tool lines are hidden and some exec tool results may be shown as assistant text.

### Rendering contract

- Assistant markdown renders as assistant markdown. Studio does not wrap normal assistant markdown in a synthetic `Output` container.
- Tool cards render only from explicit marker lines: `[[tool]]` and `[[tool-result]]`.
- List-marker visibility comes from chat markdown styles in `src/app/styles/markdown.css`; stream parsing does not invent list bullets.
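
To illustrate the marker-driven contract, here is a simplified sketch of turning transcript lines into typed items. It is not the parser in `chatItems.ts`: the item shape and function names are invented, every line is treated independently, and the special case where some exec tool results are shown as assistant text is left out.

```typescript
type ItemKind = "user" | "assistant" | "tool" | "tool-result" | "meta" | "trace";

interface ChatItem {
  kind: ItemKind;
  text: string;
}

// Line prefixes described in this section, mapped to item kinds.
const MARKERS: Array<[string, ItemKind]> = [
  ["[[tool-result]]", "tool-result"],
  ["[[tool]]", "tool"],
  ["[[meta]]", "meta"],
  ["[[trace]]", "trace"],
  [">", "user"],
];

function classifyLine(line: string): ChatItem {
  for (const [prefix, kind] of MARKERS) {
    if (line.indexOf(prefix) === 0) {
      return { kind, text: line.slice(prefix.length).trim() };
    }
  }
  // No marker: the line is plain assistant markdown and is passed through as-is.
  return { kind: "assistant", text: line };
}

function deriveItems(
  lines: string[],
  opts: { showThinkingTraces: boolean; toolCallingEnabled: boolean }
): ChatItem[] {
  return lines.map(classifyLine).filter((item) => {
    if (item.kind === "trace") return opts.showThinkingTraces;
    if (item.kind === "tool" || item.kind === "tool-result") return opts.toolCallingEnabled;
    return true;
  });
}
```

One consequence of this simplification is that an assistant blockquote starting with `>` would be classified as a user line; a real parser needs more context than a single prefix check.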

Files:
- Chat panel UI: `src/features/agents/components/AgentChatPanel.tsx`
- Transcript parsing into items: `src/features/agents/components/chatItems.ts`
- Message extraction helpers (text/thinking/tool parsing): `src/lib/text/message-extract.ts`
- Media line rewrite (images/audio/video rendered in markdown): `src/lib/text/media-markdown.ts`

## Sending Messages (Browser -> PI via Gateway)

Send path (high level):
- UI submits a message through `sendChatMessageViaStudio()` which:
  - Sets agent state to running and clears live streams.
  - Optionally resets local transcript state for `/new` or `/reset` (local UI behavior).
  - Optimistically appends the user line (`> ...`) to the transcript.
  - Ensures session settings are synced once via `sessions.patch` (model/thinking/exec settings) before first send.
  - Calls `chat.send` with `idempotencyKey = runId` and `deliver: false`.

Stop path:
- UI calls `chat.abort` to stop an active run.

Files:
- Send operation: `src/features/agents/operations/chatSendOperation.ts`
- Session settings sync transport: `src/lib/gateway/GatewayClient.ts`
- Stop call site: `src/app/page.tsx`
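
The send path in this section can be sketched as three small pieces: the optimistic user line, a once-per-session sync guard, and the `chat.send` parameters. Only `idempotencyKey = runId` and `deliver: false` come from the text; the `sessionKey` and `message` parameter names and all helper names here are assumptions, not the signature used in `chatSendOperation.ts`.

```typescript
// Assumed parameter shape for chat.send; only idempotencyKey and deliver are documented above.
interface ChatSendParams {
  sessionKey: string;
  message: string;
  idempotencyKey: string;
  deliver: false;
}

function buildChatSend(
  sessionKey: string,
  message: string,
  runId: string
): { method: "chat.send"; params: ChatSendParams } {
  // Reusing runId as the idempotency key lets a resend be de-duplicated upstream.
  return { method: "chat.send", params: { sessionKey, message, idempotencyKey: runId, deliver: false } };
}

// The transcript line appended before the gateway has acknowledged anything.
function optimisticUserLine(message: string): string {
  return "> " + message;
}

// Returns true only the first time a session is seen, so sessions.patch runs once.
function createSessionSyncGuard(): (sessionKey: string) => boolean {
  const synced: Record<string, boolean> = {};
  return function needsSync(sessionKey: string): boolean {
    if (synced[sessionKey]) return false;
    synced[sessionKey] = true;
    return true;
  };
}
```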

## Post-Connect Side Effects (Local Gateway Only)

After a successful connection, Studio may mutate gateway config when the upstream gateway URL is local:
- It reads `config.get` and may write `config.set` to ensure `gateway.reload.mode` is `"hot"` for local Studio usage.

File:
- Reload mode enforcement: `src/lib/gateway/gatewayReloadMode.ts`

## Sequence Gaps (Dropped Events)

Gateway event frames may include `seq`. The vendored browser client tracks `seq` and reports gaps (`expected`, `received`) via `onGap`.

Studio behavior on gap:
- Logs a warning.
- Forces a summary snapshot refresh and reconciles running agents.
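
A minimal sketch of gap detection is shown below, assuming `seq` increases by one per event frame and that frames without `seq` are skipped. `createGapTracker` is a hypothetical name; the vendored client's bookkeeping may differ.

```typescript
interface Gap {
  expected: number;
  received: number;
}

// Returns a tracker to call with each incoming frame's seq (if any).
function createGapTracker(onGap: (gap: Gap) => void): (seq: number | undefined) => void {
  let lastSeq: number | null = null;
  return function track(seq: number | undefined): void {
    if (typeof seq !== "number") return; // frames may omit seq
    if (lastSeq !== null && seq > lastSeq + 1) {
      // At least one event between lastSeq and seq never arrived.
      onGap({ expected: lastSeq + 1, received: seq });
    }
    lastSeq = seq;
  };
}
```

The tracker only reports the gap; recovering the missed state is left to the caller, which matches the snapshot-refresh behavior described above.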

Files:
- Gap detection: `src/lib/gateway/openclaw/GatewayBrowserClient.ts`
- Gap handling: `src/app/page.tsx`

## History Sync (Recovery, Load More)

Studio can fetch history via `chat.history` and merge it into the transcript.

Key points:
- Studio intentionally treats gateway history as canonical for timestamps/final ordering.
- History merge is designed to avoid duplicates and reconcile local optimistic sends.
- History parsing intentionally skips some system-ish content (heartbeat prompts, restart sentinel messages, and UI metadata prefixes). See `buildHistoryLines()` in `src/features/agents/state/runtimeEventBridge.ts`.
- Transcript v2 can be toggled with `NEXT_PUBLIC_STUDIO_TRANSCRIPT_V2`.
- Transcript debug logs can be enabled with `NEXT_PUBLIC_STUDIO_TRANSCRIPT_DEBUG`.

Files:
- History operation: `src/features/agents/operations/historySyncOperation.ts`
- Transcript merge/sort primitives: `src/features/agents/state/transcript.ts`

## Exec Approvals In Chat (Related To “PI Runs”)

Some runs require exec approval. These are surfaced as in-chat cards and are handled separately from the `chat`/`agent` runtime stream.

Files:
- Event to pending-card state: `src/features/agents/approvals/execApprovalEvents.ts`
- Resolve operation: `src/features/agents/approvals/execApprovalResolveOperation.ts`
- Wiring (subscribe + render): `src/app/page.tsx`, `src/features/agents/components/AgentChatPanel.tsx`

## Media Rendering (Images From Agent Output)

If an agent outputs lines like:
- `MEDIA: /home/ubuntu/.openclaw/.../image.png`

Studio may render them inline:
1. The UI rewrites eligible `MEDIA:` lines into markdown images served from `/api/gateway/media`, but avoids rewriting inside fenced code blocks.
2. The browser requests `/api/gateway/media`.
3. The API route reads the image either locally (only under `~/.openclaw`) or over SSH for remote gateways, and returns the bytes with the correct `Content-Type`.
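
Step 1 can be illustrated with a small line-by-line rewriter that tracks whether it is inside a fenced code block. This is a sketch under stated assumptions: the `path` query parameter is hypothetical (the text only names the `/api/gateway/media` route), every `MEDIA:` line is treated as eligible, and the helper in `media-markdown.ts` may apply extension or path checks that are omitted here.

```typescript
// Hypothetical query parameter name; only the route itself appears in this doc.
const MEDIA_ROUTE = "/api/gateway/media?path=";

function rewriteMediaLines(text: string): string {
  let inFence = false;
  return text
    .split("\n")
    .map((line) => {
      // A line opening or closing a fenced code block toggles the state.
      if (/^\s*(`{3}|~{3})/.test(line)) {
        inFence = !inFence;
        return line;
      }
      const match = /^MEDIA:\s*(\S.*)$/.exec(line);
      if (inFence || !match) return line;
      // Emit a markdown image whose source points at the media route.
      return "![](" + MEDIA_ROUTE + encodeURIComponent(match[1].trim()) + ")";
    })
    .join("\n");
}
```

Skipping fenced blocks keeps example output in an agent's answer (for instance, a quoted log containing `MEDIA:`) from being turned into an image request.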

Files:
- Rewrite helper: `src/lib/text/media-markdown.ts`
- Media API route: `src/app/api/gateway/media/route.ts`
- SSH helper + env vars (`OPENCLAW_GATEWAY_SSH_TARGET`, `OPENCLAW_GATEWAY_SSH_USER`): `src/lib/ssh/gateway-host.ts`

## Debugging Checklist (When Chat “Feels Buggy”)

Start with the hop where symptoms appear.

WS bridge / connectivity:
- Studio server logs (proxy): `server/gateway-proxy.js`
- Common failures: wrong `ws://` vs `wss://`, missing token, gateway closed, upstream TLS mismatch

Streaming correctness (missing/duplicated output):
- Event classification + runtime stream merge: `src/features/agents/state/gatewayRuntimeEventHandler.ts`
- Text/thinking/tool extraction quirks: `src/lib/text/message-extract.ts`
- UI item derivation and collapsing rules: `src/features/agents/components/chatItems.ts`
- Dedupe of tool lines per run + closed-run ignore window: `src/features/agents/state/gatewayRuntimeEventHandler.ts`

History and ordering issues:
- `chat.history` merge logic and dedupe: `src/features/agents/operations/historySyncOperation.ts`
- Transcript entry ordering/fingerprints: `src/features/agents/state/transcript.ts`

Media not rendering:
- `MEDIA:` rewrite behavior and code-fence skipping: `src/lib/text/media-markdown.ts`
- Image fetch route behavior (local vs SSH, allowlisted extensions, size limits): `src/app/api/gateway/media/route.ts`

If you need Gateway-side observability:
- Capture the exact `connect` settings used by Studio (URL + token are stored server-side in the Studio settings file).
- Inspect Gateway logs on the Gateway host using your environment’s service/log tooling.