Gemba Remote — design proposal
Status: draft for review (2026-05-11)
Author: mike
Supersedes / extends: none yet — this is a net-new epic family layered on top of gm-root and existing phases gm-e1..gm-e14
Outcome: laptop-closed operation. A user closes the lid and Gemba keeps planning, dispatching, observing, and reporting. They drive it from a phone — primarily through chat (Slack/Discord) — and occasionally from a browser at the cloud URL.
1. Why this exists
The v1 design (see README.md and gm-root) assumes Gemba runs as a single binary on the same workstation as gt, bd, the local Dolt server, and the tmux sessions hosting agents. Locked decision #8 is localhost-by-default; gm-e14 ships either a make install source build or a docker run image; “remote access” today means binding 0.0.0.0 with token+TLS on the same operator-owned box.
That model has a hard ceiling:
- The Dolt database is runtime state pinned to one machine. Branch switches, restarts, and “another project’s Dolt server is running on this port” already break daily flow (we hit this above when probing `bd`).
- Agents (tmux on Gas Town v1, k8s pods on Gas City) require the box to stay awake and on the network. Closing the laptop = halting the rig.
- Mobile is currently out-of-scope (`gm-root` non-goals: “Mobile native apps”).
- “Drive an agent” today means SSHing in or opening `localhost:7666` over a Tailscale link.
We want to remove these constraints without discarding the architecture we’ve spent gm-e2..e14 building. This document specifies the smallest set of changes (runtime topology, data path, agent control plane, and mobile control surface) that gets us to laptop-closed operation while preserving the locked decisions (or explicitly amending the ones that have to give).
2. Goals
- Managed data tier. Beads/Dolt state lives on Dolt Hub (or a self-hosted equivalent). Gemba and `bd` connect over the wire; no laptop-pinned Dolt server.
- Cloud-hosted Gemba server. A single canonical Gemba instance reachable at a stable URL (`https://gemba.<org>.<tld>/`). Browser SPA, REST/SSE, OIDC auth.
- Remote agent hosts. Agents run on machines that are not the user’s laptop: the same box as the Gemba server, a separate VM, a Mac mini at home, a k8s pod, or a Vercel Sandbox. Hosts opt in by running a `gemba-agent` daemon that registers with the server.
- Chat-first mobile control. A Slack/Discord bot is the primary mobile UX: review the board, dispatch convoys, peek at sessions, approve escalations, ack incidents, all from a phone. The web SPA is the secondary mobile UX (read-heavy, capped at what’s safe on a small screen).
- Laptop-closed. The laptop is one of N possible clients. The system runs without it. Nothing in the critical path requires anything on `~/`.
3. Non-goals (for this design)
- Multi-tenant SaaS. Single-org / single-team install. Anything resembling Linear’s tenant model is post-v1.1.
- Federation across multiple Gemba servers. Still `fed:safe` / `fed:bridge` / `fed:blocked` labels on items, no cross-server WorkItems.
- Replacing the locked WorkPlane/OrchestrationPlane abstraction. This rides on top of those interfaces.
- Native iOS/Android apps. Mobile = chat bots + responsive SPA. (See gm-root non-goals; we are not lifting that one.)
- Per-capability permission scoping (still all-or-nothing per gm-root #8 amendment scope).
4. Existing design we’re building on
A condensed picture of what’s already in beads / code, because every part of the remote design hangs off something here.
| Source | What it gives us | What changes for remote |
|---|---|---|
| gm-root locked decisions | Sidecar binary, Go single-binary, adaptor-agnostic UI, ZFC, never-write-private-storage, mutation nonce, localhost-default auth | #1 still holds (Gemba is still a single binary, just deployed remote). #8 needs an amendment: remote production deploys require OIDC + TLS, no local-only fallback. #9 still holds (we still go through bd / gt / gc, but bd itself reaches a remote Dolt). |
| gm-e3 Core contracts | Adaptor interfaces, capability manifests, conformance harness, three transports `api | jsonl |
| gm-e4 Transport | HTTP API, OpenAPI, SSE hub, mutation nonce | Hub must survive process restarts and reconnects (event replay window). Nonces must be issued by the server, not the laptop. |
| gm-e5 Auth (token, TLS, OIDC stub) | OIDC was stubbed for v1.1 | OIDC graduates to v1 critical path for the remote profile. Token auth stays for daemon-to-server (agent host registration). |
| gm-e6 Beads adaptor | bd --json shim | Adaptor configures bd with a remote Dolt connection string instead of relying on a local dolt sql-server. |
| gm-e7 Gas Town adaptor | tmux-based sessions on the local box | Generalize: the adaptor talks to a host identified by ID, not to localhost. The host’s gemba-agent proxies gt / tmux attach. |
| gm-e10 Gas City stub | k8s/subprocess/exec providers | Pluggable workspace kinds already anticipate non-local execution; we use them for real here. |
| gm-e11 Cross-cutting | EscalationRequest, CostMeter, Sprint+TokenBudget, evidence | These primitives are exactly the chat surface — every one becomes a Slack/Discord interaction. |
| gm-e12 SPA | Adaptor-agnostic SPA, capability gates | Add a mobile-responsive mode. No new vocabulary; same components rendered for a 390pt screen. |
| gm-e14 Release | make install and docker image | Adds two more install paths: a Helm chart (or Compose stack) for the server, and a gemba agent subcommand for hosts. |
The point is: none of the four planes (Work / Orchestration / Transport / Auth) need to be re-architected. They need a remote profile.
5. Target topology
               ┌──────────────────────────┐
               │   Dolt Hub (managed)     │
               │   beads schema           │
               └─────────┬────────────────┘
                         │ mysql wire
                         │
          ┌──────────────┴──────────────┐
          │                             │
          │   Gemba Server (cloud)      │  ← `gemba serve --profile remote`
          │   - HTTP API + SPA          │
          │   - SSE hub                 │
          │   - Beads adaptor (bd→Hub)  │
          │   - Orchestration adaptor   │
          │   - Chat bridge (Slack/DC)  │
          │   - OIDC auth               │
          │                             │
          └──┬───────┬──────────┬───────┘
             │       │          │
        wss/ │  HTTPS│        Slack /
        mTLS │       │        Discord
             │       │        (webhooks +
      ┌──────▼───┐   │         socket mode)
      │ Agent    │   │          │
      │ Host A   │   │          ▼
      │ (tmux)   │   │    ┌───────────┐
      │ gemba-   │   │    │  Mobile   │
      │ agent    │   │    │  (phone)  │
      └──────────┘   │    └───────────┘
                     │
            ┌────────▼─────────┐
            │ Browser (laptop  │
            │ closed → reopen) │
            └──────────────────┘

      ┌──────────┐  ┌──────────┐
      │ Agent    │  │ Agent    │
      │ Host B   │  │ Host C   │
      │ (k8s)    │  │ (sandbox)│
      └──────────┘  └──────────┘

5.1 Dolt Hub as the data tier
- Beads writes go through `bd` as today (gm-root #9 preserved); the only thing that changes is `bd`’s configured Dolt endpoint. `bd dolt push` becomes a no-op in the remote profile (the server is Dolt’s client).
- Branching model: one `main` branch carries production state. Per-rig or per-experiment branches still work (Dolt Hub already supports them), but Gemba’s UI defaults to `main`.
- Connection security: Dolt Hub user with scoped DB access, credential lives in the Gemba server’s secret store, never on agent hosts. Agent hosts never speak SQL.
- Backup: Dolt Hub handles snapshots. Gemba server runs a nightly `dolt clone` to S3 as belt-and-braces, surfaced under “Settings → Backups.”
- Schema migrations: `bd migrate` runs on the server during boot with a leader-elected lock so multiple server replicas (post-v1.1) don’t double-migrate.
5.2 Cloud-hosted Gemba server
- Single Go binary, same one we ship today, with a new `--profile remote` flag (or auto-detected from env: `GEMBA_PROFILE=remote`).
- Deploy targets: container (gm-e14 image, unchanged) on Fly.io / Render / Vercel-style platform / k3s / a Mac mini. We pick one as the reference deploy for docs; the binary doesn’t care.
- Stateless aside from the SSE hub’s in-memory event ring (10k events, ~5 min replay). All durable state is Dolt Hub plus a small operational store (sessions, nonces, agent-host registrations) — see §6.
- mTLS terminates at the platform’s load balancer; Gemba speaks HTTP internally.
- Outbound: Dolt Hub (mysql), Slack/Discord (websocket + REST), agent hosts (wss).
5.3 Remote agent hosts
A gemba-agent is a new subcommand of the same binary. It is not a separate package — it ships from cmd/gemba/agent.go. Boot sequence:
- `gemba agent register --server https://… --token <one-time-enrollment-token>`
  - exchanges the enrollment token for a long-lived host credential (JWT, rotating monthly)
  - prints a host ID; the user copies the ID and labels it (`mac-mini-1`, `prod-runner-east`, …) from the SPA or Slack
- `gemba agent run` opens a wss connection to the server. Reverse-proxy model: the server initiates requests over the websocket; the agent serves them on its local box (tmux, docker, k8s, etc).
What an agent host exposes to the server:
- Which `Workspace.kind` values it supports (tmux only, or tmux+container, etc; declared in the capability manifest, same shape as gm-e10).
- A health stream (CPU, RAM, agent slots free).
- A session-proxy: when the server says “attach this user to session `polecat-7`”, the agent multiplexes that user’s SSE/wss into a live `tmux pipe-pane` stream.
- A dispatch endpoint: “spawn convoy X with formula Y” → agent invokes `gt sling …` locally.
What the agent host does not see:
- Dolt Hub credentials.
- Other tenants’ work (single-tenant for v1.1).
- Anyone else’s session output.
Same-box and different-box agents are the same code path; “same box as server” is just a deployment choice that saves a network hop.
5.4 Browser SPA — unchanged semantics, new origin
- The SPA from gm-e12 ships from the Gemba server, exactly as it does today. The only change: the API base is the public URL, not `http://localhost:7666`.
- The SPA gets a “Host” picker (top-bar select) when more than one agent host is registered. Dispatching a convoy chooses a host (or “any with capacity”).
- Mobile-responsive pass on gm-e12 work: same components, narrower layout. Read-and-approve UX on mobile web; the bulk grid + dep graph editor remain desktop-only for v1.1.
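To make the “any with capacity” choice concrete, here is a small sketch of host selection. It assumes that “capacity” means the online host with the most free agent slots, with ties going to roster order; `Host`, `PickHost`, and `ErrNoEligibleHost` are illustrative names, not existing Gemba types.

```go
package main

import (
	"errors"
	"fmt"
)

// Host is a hypothetical, trimmed-down view of a registered agent host.
type Host struct {
	Label     string
	Online    bool
	FreeSlots int
}

// ErrNoEligibleHost is returned when no online host with a free slot matches.
var ErrNoEligibleHost = errors.New("no online host with free slots matches the request")

// PickHost returns the host named by label, or, when label is empty, the
// online host with the most free slots. Offline or full hosts never match.
func PickHost(hosts []Host, label string) (Host, error) {
	var best *Host
	for i := range hosts {
		h := &hosts[i]
		if !h.Online || h.FreeSlots <= 0 {
			continue
		}
		if label != "" {
			if h.Label == label {
				return *h, nil
			}
			continue
		}
		if best == nil || h.FreeSlots > best.FreeSlots {
			best = h
		}
	}
	if best == nil {
		return Host{}, ErrNoEligibleHost
	}
	return *best, nil
}

func main() {
	roster := []Host{
		{Label: "mac-mini-1", Online: true, FreeSlots: 1},
		{Label: "prod-runner-east", Online: true, FreeSlots: 3},
	}
	h, err := PickHost(roster, "")
	fmt.Println(h.Label, err)
}
```

Returning the chosen host (rather than only dispatching to it) makes it easy to show the selection back to the user in the picker or in a response card.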
6. New domain types & primitives
Add to internal/model/:
    type HostID string

    type AgentHost struct {
        ID           HostID
        Label        string             // user-supplied, e.g. "mac-mini-1"
        Kinds        []WorkspaceKind    // tmux | container | k8s_pod | vm | exec | subprocess
        Capabilities CapabilityManifest // same shape as adaptor manifests
        Status       HostStatus         // online | degraded | offline | enrolling | revoked
        LastSeen     time.Time
        EnrolledBy   UserID
        EnrolledAt   time.Time
    }

    type Subscription struct {
        ID       SubID
        UserID   UserID
        Channel  ChatChannel    // slack:#gemba-ops, discord:#town, email:user@…
        Filter   EventFilter    // labels, hosts, escalations-only, …
        Throttle ThrottlePolicy // collapse, dedupe, quiet hours
    }

Operational store (separate from Dolt Hub, lives on the Gemba server, can be SQLite-on-volume or Postgres):
- `agent_hosts` (above)
- `user_sessions` (browser/OIDC; chat-link tokens)
- `chat_links` (User ↔ Slack user ID ↔ Discord user ID)
- `subscriptions`
- `mutation_nonces` (server-issued, replacing today’s per-process map)
- `audit_log` (append-only mirror of mutations; the truth is still in Dolt)
This is operational state, not work-tracker state. It never goes in Dolt Hub. If the server dies and is rebuilt, hosts re-enroll and users re-link — annoying but recoverable, and the Dolt Hub data survives.
7. Chat surface — Slack and Discord
Same feature set, two adapters. Built on the same ChatPlane interface so adding Microsoft Teams later is purely adapter work.
7.1 Capabilities
Read commands (anyone in an authorized channel):
- `/gemba ready`: top N ready beads for the rig
- `/gemba show gm-123`: bead detail card (status, owner, labels, deps, recent comments)
- `/gemba sprint`: current sprint burn-up + token budget remaining
- `/gemba hosts`: agent-host roster (label, kind, free slots, last seen)
- `/gemba sessions`: live agents (which rig, which bead, runtime, last activity)
Action commands (authorized users only — OIDC identity bridged via chat-link):
- `/gemba claim gm-123`: claim a bead
- `/gemba dispatch <formula> [host:<label>]`: start a convoy; defaults to least-loaded host
- `/gemba peek <session>`: get a 60-line tail + a one-click link to a full session SSE in browser
- `/gemba pause <session>` / `/gemba resume <session>`
- `/gemba close gm-123 [--reason …]`
- `/gemba escalate <session> "<reason>"`
Approval interactions (push, not pull): every EscalationRequest (gm-e11) and every mutation issued from outside the SPA fans out as a chat card with Approve / Deny / Mute buttons. Approval is the nonce — the chat card carries the X-GEMBA-Confirm value; clicking Approve POSTs it back through the bot, server validates, mutation lands. This means mobile approval is a first-class flow, not a workaround. Locked decision #7 holds without change.
Push notifications:
- Escalations → DM the rig owner.
- Stuck-agent over threshold → channel post.
- Sprint at 80/95/100% of token budget → channel post (three-tier inform/warn/stop from gm-e11).
- Cost spike → DM the owner.
- New comment on a bead the user owns → DM (configurable).
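As an illustration of the 80/95/100% budget rule, the sketch below maps token usage to a tier and reports a tier only when usage crosses into it, so a channel is not re-notified on every update. `Tier`, `TierFor`, and `Crossed` are hypothetical names; only the three thresholds and the inform/warn/stop labels come from the text above.

```go
package main

import "fmt"

// Tier is the notification level for a sprint's token budget.
type Tier int

const (
	TierNone Tier = iota
	TierInform // 80% of budget used
	TierWarn   // 95% of budget used
	TierStop   // 100% of budget used
)

func (t Tier) String() string {
	return [...]string{"none", "inform", "warn", "stop"}[t]
}

// TierFor maps usage to a tier using integer math to avoid float rounding.
// A zero budget yields TierStop, since any usage is already at or over it.
func TierFor(used, budget int64) Tier {
	switch {
	case used*100 >= budget*100:
		return TierStop
	case used*100 >= budget*95:
		return TierWarn
	case used*100 >= budget*80:
		return TierInform
	default:
		return TierNone
	}
}

// Crossed returns the new tier and true only when usage moved into a higher
// tier between two readings.
func Crossed(prevUsed, used, budget int64) (Tier, bool) {
	before, after := TierFor(prevUsed, budget), TierFor(used, budget)
	return after, after > before
}

func main() {
	const budget = 1_000_000
	for _, step := range [][2]int64{{700_000, 810_000}, {810_000, 900_000}, {900_000, 1_000_000}} {
		tier, notify := Crossed(step[0], step[1], budget)
		fmt.Println(tier, notify)
	}
}
```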
7.2 Auth bridge
- Slack/Discord OAuth flow on first interaction: user runs `/gemba link`, gets a short-lived code, types it into the SPA while signed in via OIDC. After that, chat user ID ↔ Gemba user ID is durable in `chat_links`.
- Unlinked users get read-only results in public channels and nothing in DMs.
- Channel allow-lists: `gemba.toml` declares which channels can receive what (`#gemba-ops` gets everything, `#general` gets nothing).
7.3 Threading model
A bead, a session, and a convoy each map to a canonical thread. Posting a peek or an escalation against the same session always replies in the same thread, so a phone scroll reads as one conversation per workstream.
7.4 Discord parity
Slash commands and components in Discord. Same JSON contracts. We pick one as the reference adapter for the first deliverable (Slack, only because the API is more mature for our shape), but neither is preferred in the design.
8. Auth, in detail
This is where locked decision #8 has to flex. Proposed amendment to gm-root:
#8 (amended). Gemba runs in one of two profiles:
- `local` (default for `gemba serve`): unchanged from today. Localhost bind, optional token, optional TLS. Mutations gated by nonce.
- `remote`: non-loopback bind, OIDC required, mTLS required, agent hosts authenticate with rotating JWTs, chat bots authenticate with signing-secret-verified webhooks. `--dangerously-skip-permissions` is rejected at startup in this profile. The flag exists for `local` only.
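A startup check for the two profiles might look like the sketch below. It is illustrative only: the `Config` fields, the precedence of the flag over `GEMBA_PROFILE`, and the error messages are assumptions, and the mTLS and JWT requirements are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config is a hypothetical subset of the server's startup configuration.
type Config struct {
	Profile         string // "local" or "remote"
	OIDCIssuer      string // empty when OIDC is not configured
	SkipPermissions bool   // set by --dangerously-skip-permissions
}

// ResolveProfile picks the profile: explicit flag first, then the
// GEMBA_PROFILE value, then the local default. The precedence is an assumption.
func ResolveProfile(flagValue, envValue string) string {
	switch {
	case flagValue != "":
		return flagValue
	case envValue != "":
		return envValue
	default:
		return "local"
	}
}

// Validate rejects configurations the amended decision #8 does not allow.
func (c Config) Validate() error {
	switch c.Profile {
	case "local":
		return nil
	case "remote":
		if c.SkipPermissions {
			return errors.New("--dangerously-skip-permissions is not allowed in the remote profile")
		}
		if c.OIDCIssuer == "" {
			return errors.New("remote profile requires OIDC")
		}
		return nil
	default:
		return fmt.Errorf("unknown profile %q", c.Profile)
	}
}

func main() {
	cfg := Config{
		Profile:    ResolveProfile("", os.Getenv("GEMBA_PROFILE")),
		OIDCIssuer: os.Getenv("GEMBA_OIDC_ISSUER"), // hypothetical variable name
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("starting with profile", cfg.Profile)
}
```

Failing at startup, rather than at the first mutation, keeps a misconfigured remote deploy from ever accepting traffic.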
Concretely:
- Browser → server: OIDC (Google, GitHub OAuth, Okta — pick one for the reference deploy). Sessions issued as short-lived JWTs + refresh tokens stored httpOnly.
- Agent host → server: mTLS over wss, plus a bearer JWT scoped to that host ID. Rotation monthly, automatic.
- Chat bot → server: signing secret on inbound webhooks; outbound calls (bot → server) use a service token with no user permissions — every action carries the chat user’s identity so the server can authorize per-user.
- Server → Dolt Hub: username + service token from secret store. Single credential, never leaves the server process.
9. Reliability & operations
Things that must be true for laptop-closed:
- Server uptime: standard cloud SLOs; not exotic. A single-replica deployment is fine for v1.1; HA is post-v1.1.
- Crash recovery: the SSE hub loses in-flight events on restart; subscribers reconnect and pull a 5-minute replay from the ring buffer (already designed in gm-e4). Anything older comes from Dolt history.
- Agent reconnect: the agent’s wss reconnects with exponential backoff, capped at 60s. The server treats a host as `degraded` after 30s of silence, `offline` after 5 min, and pages chat at `offline`.
- Network partitions: an agent that loses the server keeps its local tmux/k8s sessions alive. When it reconnects, it re-announces what it found running. The server reconciles against its `agent_hosts.observed_sessions` mirror.
- Cost guardrails: Sprint/TokenBudget enforcement (gm-e11) gains a hard-stop tier that the server enforces in the dispatch path; agents can’t be told to spawn over budget. This is critical when the user can’t see the laptop.
10. Phasing — proposed new epics
These are draft titles for review. None are filed in beads yet (per the user’s “once approved we’ll review against existing beads”). They depend on the existing phases as noted.
| ID (proposed) | Title | Depends on | Scope sketch |
|---|---|---|---|
| gm-e15 | Remote profile foundation: `--profile remote`, OIDC graduation, mTLS, server-issued nonces | gm-e4, gm-e5 | Profile flag, OIDC adapter (one reference IdP), mTLS termination contract, ops store (sqlite/postgres), audit log |
| gm-e16 | Dolt Hub data tier: managed Dolt, schema migration on boot, S3 belt-and-braces backup | gm-e6 | bd adaptor wired to remote Dolt, migration leader-election, backup job, restore drill documented |
| gm-e17 | Agent host protocol: `gemba agent` subcommand, enrollment, reverse-proxy session bridge | gm-e7, gm-e10 | wss protocol spec, capability manifest extension for hosts, session multiplex, reconnect/replay semantics |
| gm-e18 | Chat plane: Slack reference adapter | gm-e11, gm-e15 | ChatPlane interface, Slack adapter (slash + interactive components), chat-link OAuth, approval-as-nonce flow, subscription model |
| gm-e19 | Chat plane: Discord adapter | gm-e18 | Discord adapter to the same ChatPlane interface; conformance against shared test suite |
| gm-e20 | Mobile-responsive SPA pass | gm-e12 | Breakpoints for ≤768px on read flows + approval flows; bulk-edit + dep-graph remain desktop |
| gm-e21 | Remote deploy reference: container image hardening, Helm chart / Compose stack, runbook | gm-e14, gm-e15, gm-e16, gm-e17 | One reference target chosen, docs site updated, restore/rotate/upgrade runbooks |
| gm-e22 | Notification rules & quiet hours | gm-e18 | Subscription throttling, dedupe, quiet hours, escalation routing rules |
A leaner v1.1-remote (call it “minimum laptop-closed”) is gm-e15 + gm-e16 + gm-e17 + gm-e18 + gm-e21. Discord (gm-e19), mobile web polish (gm-e20), and rules engine (gm-e22) can fast-follow.
11. Locked-decision deltas
For each, either no change, an amendment, or a deferred reconsideration.
| # | Decision | Verdict |
|---|---|---|
| 1 | Standalone sidecar binary | Unchanged. Still one binary; deployed remote. |
| 2 | Go single binary, embed SPA | Unchanged. |
| 3 | React + TS + Vite stack | Unchanged. Mobile pass is breakpoints, not a new stack. |
| 4 | Adaptor-agnostic UI | Unchanged. Slack/Discord live in internal/adapter/chat/<vendor>/, mirroring the WorkPlane/OrchPlane pattern. |
| 5 | Pluggable workspace kinds | Unchanged. This is what makes “agent on any box” cheap. |
| 6 | Multi-workspace, not federated | Unchanged. Single Gemba server, multiple agent hosts, one logical org. |
| 7 | Mutation nonce, --dangerously-skip-permissions | Unchanged. Chat approvals are nonces. Skip flag rejected in remote profile. |
| 8 | Localhost default, auth gate | Amended — see §8. Two profiles, remote mandates OIDC + mTLS + signed chat webhooks. |
| 9 | Never write any backend’s private storage | Unchanged. We still go through bd / gt / gc. bd reaches Dolt Hub instead of a local socket — that’s bd’s concern, not ours. |
| 10 | Declarative UX (desired vs observed) | Unchanged. Agent hosts contribute observed_state from wherever they live. |
| 11 | ZFC for the UI | Unchanged. Chat commands surface options; humans decide. No bot policy. |
| 12 | Distribution | Amended at gm-e14 follow-on. Add a hardened container image + Helm chart / Compose stack as a third install path. Source build and current docker image untouched. |
Out-of-scope items in gm-root that we are explicitly not lifting:
- Cross-workspace federation — still out.
- Mobile native apps — still out (chat + responsive web only).
- Multi-transport adaptors — still one transport per adaptor; the ChatPlane is its own adaptor pair, not a transport.
12. Open questions for review
- Reference deploy target. Fly.io vs. a single Hetzner VM with k3s vs. Vercel-style platform vs. Mac mini at home. Each has a different ops story. Pick one for `gm-e21`.
- OIDC IdP for the reference. Google Workspace is easiest for a solo-op deploy; GitHub OAuth fits the dev audience; Okta if we want to make enterprise-friendly noise. Recommendation: GitHub OAuth for v1.1, swap to a real IdP at v1.2.
- Dolt Hub tenancy. One database per Gemba install vs. one database per rig. Per-install is simpler and matches `gm-root` #6 (multi-workspace, not federated); per-rig would mirror today’s `~/gt/<rig>/.beads` layout. Recommendation: one database, schema-namespaced per rig, for simpler ops and easier cross-rig views.
- Self-hosted Dolt option. Do we support `docker run dolthub/dolt-sql-server` as a substitute for Dolt Hub? For air-gapped users, probably yes. Adds one config knob and a “self-hosted-dolt” runbook.
- Chat-first defaults. Should `/gemba dispatch` require explicit host selection, or default to “any with capacity”? Defaulting is mobile-friendly but hides resource choices. Recommendation: default to capacity-aware, surface the chosen host in the response card.
- Session peek vs full attach on mobile. A 60-line tail is fine on a phone; a full `tmux attach` over wss is not. Are there flows where partial peek is insufficient and we’d need a different mobile-attach UX? Likely no for v1.1; revisit if users push back.
- Quiet hours. Per-user (each user sets their own) or per-org (one schedule for everyone). Per-user is cleaner but per-org is what most orgs actually want for “don’t page anyone after midnight.” Recommendation: per-user with org override.
- The agent enrollment token UX. Today’s mental model: user generates a token in the SPA, pastes it into `gemba agent register` on the new box. Reasonable. Alternative: QR-code flow scanned from phone. Cute, probably unnecessary, easy to add later.
- Migration path from v1 local install. A user running `gemba serve` on their laptop today should be able to run `gemba migrate local-to-remote --server https://…` and have their `~/gt` Dolt push to Dolt Hub, after which their laptop becomes an agent host. Worth scoping under `gm-e16` or making its own ticket.
13. What this isn’t trying to solve
To prevent scope creep when this turns into beads:
- It is not redesigning Gas Town or Gas City. It uses them through the existing orchestration adaptor.
- It is not redesigning Beads. It changes `bd`’s storage endpoint, not its semantics.
- It is not building a SaaS product. One org, one server, one Dolt Hub DB.
- It is not adding native mobile apps. Chat + responsive web is the deal.
- It is not changing the conformance harness from gm-e3. Adaptors still pass groups A–F; this just adds a few capabilities to declare (`hosts.remote`, `chat.*`).
14. Next step
Review this document and flag amendments. Once approved:
- Reconcile §10’s proposed epics against the existing `~/gt/gemba/.beads/issues.jsonl`; some may already exist as children of `gm-e5` / `gm-e11` / `gm-e14` and need to be promoted to epics or reparented.
- File the new epics (`gm-e15`..`gm-e22`) with the deps in §10.
- File the locked-decision amendment to `gm-root` (§11) as an explicit `notes:` block, dated, with this document linked.
- Pick the reference deploy + IdP (§12 q1, q2) so `gm-e15` and `gm-e21` aren’t underspecified at kickoff.