QA Department Spec
Fourth concrete office-system feature for Claw3D, completing the first real office loop: plan, coordinate, execute, review.
Goal
Add a QA department workflow to Claw3D so the office can visibly review, test, triage, and sign off on work before it is treated as complete.
The QA department should make review state legible in-world.
It is where the office asks:
- does this actually work?
- what failed?
- what is blocked?
- what is safe to ship?
Product Position
QA should not be just flavor.
It should be an operational system that connects:
- tasks
- agent work output
- reviews
- approvals
- regressions
- release-readiness
The QA department is the office’s verification layer.
Why This Feature Matters
Without a QA layer, the office can generate and coordinate work but not convincingly validate it.
QA adds:
- visible review state
- feedback loops
- bug triage
- approval pressure where needed
- a clearer path from "done writing" to "done safely"
It also pairs naturally with:
- bulletin board blockers
- meeting room review workflows
- task board status
- approval systems
Core Responsibilities
The QA department should handle:
- review intake
- test/result tracking
- bug triage
- regression visibility
- release gate / readiness signal
Primary Use Cases
Review Queue
Examples:
- a task is ready for QA
- an agent requests review
- a release candidate needs signoff
Bug Triage
Examples:
- classify failures
- route issues to the right owner
- mark severity
- surface blockers to the office
Regression Detection
Examples:
- recent change broke existing behavior
- previously passing workflow now fails
- approval flow or adapter integration regressed
Approval-Aware Review
Examples:
- a code change or run needs human approval before a release-like action
- QA can recommend approval but not finalize it
- owners or leads can override or sign off
Release Readiness
Examples:
- green / yellow / red office-level signal
- unresolved blockers prevent completion
- review summary appears on bulletin board
V1 Scope
V1 should focus on clear office-level QA workflows, not a full CI system.
Recommended V1 scope:
- QA queue
- QA status per task or work item
- bug / blocker recording
- review outcome states
- office-visible readiness signal
Suggested Workflow Model
Recommended QA states:
- queued
- in_review
- changes_requested
- blocked
- approved
- failed
- verified
Queued
Work has entered QA but has not been actively reviewed yet.
In Review
A QA agent or human reviewer is assessing the work.
Changes Requested
Work is not acceptable yet and must be revised.
Blocked
QA cannot proceed because a dependency, approval, or missing artifact prevents review.
Approved
Review is positive, but final release/ship behavior may still depend on a higher-level approval model.
Failed
Verification found a concrete failure.
Verified
The work passed the required QA checks and is complete from the department’s perspective.
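The spec names the states but does not say which moves between them are legal. As a minimal sketch, a transition table could make that explicit. The table contents and the `canTransition` helper below are assumptions for illustration, not part of the spec.

```typescript
type QaStatus =
  | "queued"
  | "in_review"
  | "changes_requested"
  | "blocked"
  | "approved"
  | "failed"
  | "verified";

// Hypothetical transition table: the allowed moves are an assumption,
// chosen so that rejected work re-enters the queue and "verified" is terminal.
const QA_TRANSITIONS: Record<QaStatus, readonly QaStatus[]> = {
  queued: ["in_review", "blocked"],
  in_review: ["changes_requested", "blocked", "approved", "failed", "verified"],
  changes_requested: ["queued"],
  blocked: ["queued", "in_review"],
  approved: ["verified", "failed"],
  failed: ["queued"],
  verified: [],
};

// Returns true when moving from `from` to `to` is listed in the table above.
function canTransition(from: QaStatus, to: QaStatus): boolean {
  return QA_TRANSITIONS[from].indexOf(to) !== -1;
}
```

Keeping the rules in one table means both the UI and any QA agent can ask the same question before changing a status.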
Suggested Data Model
V1 shape:
type QaStatus =
  | "queued"
  | "in_review"
  | "changes_requested"
  | "blocked"
  | "approved"
  | "failed"
  | "verified";

type QaSeverity = "low" | "medium" | "high" | "critical";

type QaIssue = {
  id: string;
  title: string;
  body: string;
  severity: QaSeverity;
  createdAt: string;
  updatedAt: string;
  authorType: "human" | "agent" | "system";
  authorId?: string | null;
  linkedTaskId?: string | null;
  linkedAgentId?: string | null;
  linkedSessionKey?: string | null;
  resolved: boolean;
};

type QaReviewItem = {
  id: string;
  title: string;
  status: QaStatus;
  createdAt: string;
  updatedAt: string;
  assignedReviewerAgentId?: string | null;
  linkedTaskId?: string | null;
  linkedAgentId?: string | null;
  linkedSessionKey?: string | null;
  summary?: string | null;
  issues: QaIssue[];
};

type QaDepartmentState = {
  items: QaReviewItem[];
  readiness: "green" | "yellow" | "red";
  updatedAt?: string;
};
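To show how these shapes might be used, here is a small sketch of immutable helpers for creating a review item and recording or resolving issues. The helper names (`createReviewItem`, `addIssue`, `resolveIssue`) are hypothetical, the types are a trimmed copy of the ones above, and ids and timestamps are passed in by the caller rather than generated.

```typescript
type QaStatus =
  | "queued" | "in_review" | "changes_requested"
  | "blocked" | "approved" | "failed" | "verified";
type QaSeverity = "low" | "medium" | "high" | "critical";

// Trimmed copies of the spec types (optional link fields omitted for brevity).
type QaIssue = {
  id: string;
  title: string;
  body: string;
  severity: QaSeverity;
  createdAt: string;
  updatedAt: string;
  authorType: "human" | "agent" | "system";
  resolved: boolean;
};
type QaReviewItem = {
  id: string;
  title: string;
  status: QaStatus;
  createdAt: string;
  updatedAt: string;
  linkedTaskId?: string | null;
  issues: QaIssue[];
};
type QaIssueDraft = Omit<QaIssue, "createdAt" | "updatedAt" | "resolved">;

// New items start in "queued" with no issues.
function createReviewItem(
  id: string, title: string, linkedTaskId: string | null, now: string,
): QaReviewItem {
  return { id, title, status: "queued", createdAt: now, updatedAt: now, linkedTaskId, issues: [] };
}

// Returns a copy of the item with the new unresolved issue appended.
function addIssue(item: QaReviewItem, draft: QaIssueDraft, now: string): QaReviewItem {
  const issue: QaIssue = { ...draft, createdAt: now, updatedAt: now, resolved: false };
  return { ...item, issues: [...item.issues, issue], updatedAt: now };
}

// Returns a copy of the item with the matching issue marked resolved.
function resolveIssue(item: QaReviewItem, issueId: string, now: string): QaReviewItem {
  const issues = item.issues.map((issue) =>
    issue.id === issueId ? { ...issue, resolved: true, updatedAt: now } : issue,
  );
  return { ...item, issues, updatedAt: now };
}
```

Returning new objects instead of mutating in place is a design choice of this sketch; it tends to fit state that is persisted and re-rendered, but the spec does not require it.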
Relationship To Existing Systems
The QA department should plug into systems Claw3D already has.
Task Board / Kanban
The QA department should consume work from the task board.
Examples:
- task moves into a review-ready state
- QA item is created or updated
- blocked QA creates blocker visibility back on the bulletin board
Suggested relationship:
- task board = execution status
- QA department = verification status
Bulletin Board
The bulletin board should show the important QA outcomes.
Examples:
- "Build blocked on QA"
- "Regression found in Hermes adapter flow"
- "Release candidate verified"
Suggested card mapping:
- critical QA issue -> blocker card
- release-ready signal -> announcement card
- changes requested -> handoff card
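The card mapping above can be sketched as a single function. The `QaOutcome` and `BulletinCardDraft` shapes are hypothetical, since this document does not define the bulletin board's card schema; only the three mappings come from the list.

```typescript
// Hypothetical card kinds, named after the mapping in this section.
type BulletinCardKind = "blocker" | "announcement" | "handoff";

// Hypothetical description of a QA outcome worth surfacing on the board.
type QaOutcome =
  | { kind: "critical_issue"; title: string }
  | { kind: "release_ready"; title: string }
  | { kind: "changes_requested"; title: string };

// Hypothetical draft card; the real bulletin board schema may differ.
type BulletinCardDraft = {
  kind: BulletinCardKind;
  title: string;
  source: "qa";
};

// Maps each QA outcome to the suggested card kind.
function toBulletinCard(outcome: QaOutcome): BulletinCardDraft {
  switch (outcome.kind) {
    case "critical_issue":
      return { kind: "blocker", title: outcome.title, source: "qa" };
    case "release_ready":
      return { kind: "announcement", title: outcome.title, source: "qa" };
    case "changes_requested":
      return { kind: "handoff", title: outcome.title, source: "qa" };
  }
}
```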
Meeting Room
Review meetings should naturally feed into QA.
Examples:
- planning meeting creates work
- execution completes
- review meeting sends selected items into QA
- QA findings can be discussed in a follow-up review meeting
This makes the meeting room and QA department part of one loop instead of separate ideas.
Approvals
Claw3D already has approval-related surfaces.
The QA department should integrate with them conceptually, even if V1 is mostly local office state.
Important distinction:
- QA approval = "this looks good from verification"
- release approval = "a human or higher authority allows the next action"
Those are related but not identical.
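One way to keep the two approvals separate in code is to model them as independent inputs to a gate. This is only a sketch: the `ReleaseGate` shape and both function names are assumptions, and the existing approval surfaces may represent release approval differently.

```typescript
type QaStatus =
  | "queued" | "in_review" | "changes_requested"
  | "blocked" | "approved" | "failed" | "verified";

// Hypothetical gate input: QA's verdict plus a separate release approval.
type ReleaseGate = {
  qaStatus: QaStatus;
  // Id of the human or authority who approved the next action, if any.
  releaseApprovedBy: string | null;
};

// QA approval: "this looks good from verification".
function qaLooksGood(status: QaStatus): boolean {
  return status === "approved" || status === "verified";
}

// Release approval is a second, independent condition.
function mayProceedToRelease(gate: ReleaseGate): boolean {
  return qaLooksGood(gate.qaStatus) && gate.releaseApprovedBy !== null;
}
```

With this split, QA can recommend approval without being able to finalize it.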
GitHub / Review Surfaces
Claw3D already has review-adjacent UI, including GitHub-oriented immersive screens.
The QA department should be able to:
- reflect review outcomes
- ingest review summaries
- show whether work is waiting for review or returned with changes requested
In-World UX
The QA department should feel like a place in the office.
Possible visual forms:
- QA lab
- testing bullpen
- release desk
- audit wall
Behavior:
- queue visible in-world
- blocked items stand out clearly
- verified items visibly clear from the queue
- readiness state visible at a glance
The room should communicate office health, not just hold another panel.
Secondary UI
Also provide a non-spatial UI surface.
Good options:
- HQ sidebar panel
- immersive QA screen
- release/readiness panel
Users should be able to inspect:
- queued reviews
- open issues
- who owns each item
- overall readiness state
V1 Automation
Useful automations:
- create a QA item when a task enters review-ready state
- create blocker cards for high-severity QA issues
- update readiness color based on unresolved critical/high issues
- generate a short QA summary when an item leaves review
Keep automation conservative.
Avoid flooding the system with low-value noise.
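As an illustration of conservative automation, the sketch below creates a QA item when a task becomes review-ready and does nothing if that task is already tracked. The `TaskSnapshot` shape, the `"review_ready"` status name, and the `qa-` id prefix are assumptions; the task board's real state names are not given in this document.

```typescript
type QaStatus =
  | "queued" | "in_review" | "changes_requested"
  | "blocked" | "approved" | "failed" | "verified";

// Trimmed copies of the spec types.
type QaReviewItem = {
  id: string;
  title: string;
  status: QaStatus;
  createdAt: string;
  updatedAt: string;
  linkedTaskId?: string | null;
  issues: unknown[];
};
type QaDepartmentState = {
  items: QaReviewItem[];
  readiness: "green" | "yellow" | "red";
  updatedAt?: string;
};

// Hypothetical view of a task board entry.
type TaskSnapshot = { id: string; title: string; status: string };

// Assumed name for the task board's review-ready state.
const REVIEW_READY_STATUS = "review_ready";

// Adds one queued QA item per review-ready task; repeated calls are no-ops.
// Readiness is left untouched here and would be recomputed separately.
function enqueueForQa(
  state: QaDepartmentState, task: TaskSnapshot, now: string,
): QaDepartmentState {
  if (task.status !== REVIEW_READY_STATUS) return state;
  const alreadyTracked = state.items.some((item) => item.linkedTaskId === task.id);
  if (alreadyTracked) return state;
  const item: QaReviewItem = {
    id: `qa-${task.id}`,
    title: task.title,
    status: "queued",
    createdAt: now,
    updatedAt: now,
    linkedTaskId: task.id,
    issues: [],
  };
  return { ...state, items: [...state.items, item], updatedAt: now };
}
```

The duplicate check is what keeps the automation quiet: a task that bounces in and out of review does not produce a new item each time.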
Storage Model
V1 can be stored in office preferences, similar to bulletin board and whiteboard systems.
Suggested shape:
type OfficePreference = {
  qaDepartment?: QaDepartmentState;
};
This keeps the feature:
- backend-neutral
- easy to persist
- easy to evolve later
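A minimal sketch of reading and writing this preference, assuming preferences are persisted as JSON. The helper names and the empty default state are assumptions; how Claw3D actually loads and saves office preferences is not described here.

```typescript
// Trimmed copy of the spec type; items are left opaque for this sketch.
type QaDepartmentState = {
  items: unknown[];
  readiness: "green" | "yellow" | "red";
  updatedAt?: string;
};
type OfficePreference = {
  qaDepartment?: QaDepartmentState;
};

// Assumed default when no QA state has been stored yet.
const EMPTY_QA_STATE: QaDepartmentState = { items: [], readiness: "green" };

// Reads QA state, falling back to the empty default.
function getQaDepartment(prefs: OfficePreference): QaDepartmentState {
  return prefs.qaDepartment ?? EMPTY_QA_STATE;
}

// Returns a copy of the preferences with QA state replaced.
function withQaDepartment(
  prefs: OfficePreference, next: QaDepartmentState,
): OfficePreference {
  return { ...prefs, qaDepartment: next };
}

// JSON round trip; any non-object payload is treated as empty preferences.
function serializePreferences(prefs: OfficePreference): string {
  return JSON.stringify(prefs);
}
function parsePreferences(raw: string): OfficePreference {
  const parsed: unknown = JSON.parse(raw);
  return typeof parsed === "object" && parsed !== null
    ? (parsed as OfficePreference)
    : {};
}
```

Because `qaDepartment` is optional, existing stored preferences stay valid without a migration.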
Human Interaction Model
The human should be able to:
- open the QA queue
- inspect a review item
- mark status changes
- add issues
- resolve issues
- promote or reject readiness
Humans should remain the final arbiter when needed, especially for ship/release-style outcomes.
Agent Interaction Model
QA agents should be able to:
- review work items
- generate findings
- summarize likely regressions
- mark items as changes requested or verified
- surface blockers
Longer term:
- specialized QA agents may exist by area:
  - adapter QA
  - UI QA
  - release QA
  - regression QA
Readiness Signal
The department should publish an office-level readiness state:
- green
- yellow
- red
Suggested meaning:
- green = no blocking QA issues
- yellow = warnings / pending review / moderate unresolved issues
- red = blocking failures or critical unresolved issues
This signal should be visible outside the QA room as well.
For example:
- bulletin board card
- office status banner
- release desk indicator
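The suggested meanings can be turned into a derivation rule. The sketch below is one possible reading: treating `high` severity as red alongside `critical`, and treating `blocked` and `failed` items as red, are assumptions that the spec leaves open.

```typescript
type QaSeverity = "low" | "medium" | "high" | "critical";
type QaStatus =
  | "queued" | "in_review" | "changes_requested"
  | "blocked" | "approved" | "failed" | "verified";
type Readiness = "green" | "yellow" | "red";

// Minimal views of the spec types, keeping only what the rule needs.
type IssueLike = { severity: QaSeverity; resolved: boolean };
type ItemLike = { status: QaStatus; issues: IssueLike[] };

// Derives the office-level signal from all review items.
// red:    any failed or blocked item, or an unresolved critical/high issue (assumption)
// yellow: any pending item or any other unresolved issue
// green:  nothing pending and nothing unresolved
function computeReadiness(items: ItemLike[]): Readiness {
  let result: Readiness = "green";
  for (const item of items) {
    if (item.status === "failed" || item.status === "blocked") return "red";
    for (const issue of item.issues) {
      if (issue.resolved) continue;
      if (issue.severity === "critical" || issue.severity === "high") return "red";
      result = "yellow";
    }
    const pending =
      item.status === "queued" ||
      item.status === "in_review" ||
      item.status === "changes_requested";
    if (pending) result = "yellow";
  }
  return result;
}
```

Deriving the color from the items, rather than storing it independently, keeps the banner, the bulletin board card, and the release desk indicator in agreement.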
Out of Scope For V1
Do not include these initially:
- full CI orchestration
- external test runner infrastructure
- rich flake analytics
- cross-repo release orchestration
- advanced approval hierarchies
- fully automated release pipelines
V1 should be office workflow first.
Implementation Strategy
Recommended order:
- Define QA review item and issue schema.
- Add local persisted QA department state.
- Build a simple QA queue panel.
- Add readiness signal.
- Connect task board / review-ready states to QA queue creation.
- Emit bulletin board blockers or announcements from QA outcomes.
Existing Code Seams
This work should likely align with:
- task board state and transitions
- approval/review UI surfaces
- GitHub immersive review screens
- office performance / approvals analytics
- bulletin board and meeting room outputs from the new docs
The key is to avoid building QA as an isolated toy feature.
It should be another operational loop in the same office system.
Success Criteria
V1 is successful if:
- the office can visibly route work into QA
- QA findings can block or clear work in a legible way
- users can inspect review items and issues
- readiness state is visible at the office level
- QA outcomes can feed the bulletin board
Future Extensions
Once V1 is stable, follow-up work can add:
- QA meeting rituals
- release room / release wall
- specialized QA subteams
- automated regression summaries
- richer review analytics
- policy-aware signoff chains
Summary
The QA department should make verification a first-class part of office life.
It closes the loop between planning, execution, and trustworthy completion.