Core Platform

  • Gateway WebSocket control plane with sessions, presence, config, cron, webhooks, Control UI, and Canvas host
  • CLI surface: gateway, agent, send, wizard, and doctor commands
  • Pi agent runtime in RPC mode with tool streaming and block streaming
  • Session model: a main session for direct chats, group isolation, activation modes, queue modes, and reply-back
  • Media pipeline: images/audio/video, transcription hooks, size caps, temp file lifecycle
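
As a rough illustration of the session model above (one main session for direct chats, isolated sessions for groups), the sketch below derives a session key from an inbound message. The `Inbound` type, the `session_key` function, and the `group:<channel>:<id>` key layout are all assumptions made for this example, not the Gateway's actual format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Inbound:
    """Hypothetical inbound message; field names are illustrative only."""
    channel: str                      # e.g. "telegram"
    sender: str
    group_id: Optional[str] = None    # None means a direct chat


def session_key(msg: Inbound) -> str:
    """Direct chats share the 'main' session; each group gets its own key."""
    if msg.group_id is None:
        return "main"
    # Assumed key layout, chosen only so that groups never collide with
    # each other or with the main session.
    return f"group:{msg.channel}:{msg.group_id}"
```

With this layout, two direct chats from different senders land in the same main session, while messages from two different groups are kept apart.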

Tools & Automation

Browser Control

Dedicated clawd Chrome/Chromium browser with full automation capabilities:

  • Web browsing and navigation
  • Form filling and submission
  • Data extraction from websites
  • Snapshot capture
  • File uploads
  • Profile management

Canvas

Agent-driven visual workspace with A2UI:

  • A2UI push/reset operations
  • JavaScript evaluation
  • Snapshot capture
  • Visual workspace for agent interactions

Nodes

Device-level capabilities:

  • Camera snap/clip
  • Screen recording
  • location.get for location services
  • System notifications
  • macOS-only: system.run and system.notify

Cron & Automation

  • Cron jobs: Schedule recurring tasks
  • Wakeups: Scheduled wake-up calls for the agent
  • Webhooks: External triggers via HTTP endpoints
  • Gmail Pub/Sub: Real-time email triggers
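
The difference between recurring cron jobs and one-off wakeups can be sketched with a small in-memory scheduler. This is a minimal sketch under stated assumptions: the `Scheduler` class, its method names, and the use of plain numeric timestamps are hypothetical and do not describe the Gateway's cron implementation.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class Entry:
    due: float                                    # next time the entry should fire
    name: str = field(compare=False)
    interval: Optional[float] = field(default=None, compare=False)  # None = one-off wakeup


class Scheduler:
    """Hypothetical scheduler: recurring jobs re-arm themselves, wakeups fire once."""

    def __init__(self) -> None:
        self._heap: List[Entry] = []

    def add_recurring(self, name: str, first_due: float, interval: float) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._heap, Entry(first_due, name, interval))

    def add_wakeup(self, name: str, due: float) -> None:
        heapq.heappush(self._heap, Entry(due, name))

    def run_due(self, now: float) -> List[str]:
        """Return the names of entries due at `now`, earliest first."""
        fired = []
        while self._heap and self._heap[0].due <= now:
            entry = heapq.heappop(self._heap)
            fired.append(entry.name)
            if entry.interval is not None:
                # Re-arm the recurring job. A job that missed several
                # intervals fires once per missed interval.
                entry.due += entry.interval
                heapq.heappush(self._heap, entry)
        return fired
```

A caller would invoke `run_due` on a timer and hand each returned name to the agent; webhooks and Gmail Pub/Sub differ in that they are pushed from outside rather than polled from a schedule.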

Skills Platform

  • Bundled, managed, and workspace skills
  • Install gating + UI
  • Community skill registry (ClawdHub)
  • Self-improving: can write its own skills

Voice & Speech

Voice Wake

Always-on speech recognition for macOS/iOS/Android:

  • Continuous listening
  • Wake word detection
  • Push-to-talk support
  • Multi-language support

Talk Mode

Continuous conversation with ElevenLabs text-to-speech:

  • Voice responses
  • Audio message support
  • Custom voice profiles
  • Natural conversation flow

Memory & Context

Persistent Memory

  • Remembers you and becomes uniquely yours
  • Your preferences, your context, your AI
  • Context persists 24/7 across sessions

Memory Files

  • Daily notes formatted in Markdown
  • Auto-generated each day
  • Plain text log of interactions
  • Searchable and editable
  • Integrate with Obsidian, Raycast, or Hazel
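
Because memory files are plain Markdown, they are easy to picture as one file per day that is appended to and searched like any other text. The sketch below assumes a hypothetical layout (`<memory_dir>/YYYY-MM-DD.md` with one bullet per entry); the real file names and entry format may differ.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List, Tuple


def note_path(memory_dir: Path, day: date) -> Path:
    # Assumed layout: one Markdown file per day, named by ISO date.
    return memory_dir / f"{day.isoformat()}.md"


def append_note(memory_dir: Path, when: datetime, text: str) -> Path:
    """Append a timestamped bullet to the day's note, creating the file if needed."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = note_path(memory_dir, when.date())
    is_new = not path.exists()
    with path.open("a", encoding="utf-8") as f:
        if is_new:
            f.write(f"# {when.date().isoformat()}\n\n")   # heading for a new day
        f.write(f"- {when:%H:%M} {text}\n")
    return path


def search_notes(memory_dir: Path, needle: str) -> List[Tuple[str, str]]:
    """Case-insensitive substring search; returns (file name, matching line) pairs."""
    hits = []
    for path in sorted(memory_dir.glob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if needle.lower() in line.lower():
                hits.append((path.name, line))
    return hits
```

Keeping the notes as ordinary files is what makes them editable by hand and usable from tools such as Obsidian, Raycast, or Hazel.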

Skills & Extensibility

Skills Platform

  • Extend with community skills or build your own
  • Self-improving: can write its own skills
  • Ask it to add functionality, and it creates the code
  • Skills stored in workspace directory

ClawdHub

  • Minimal skill registry
  • Agent can search for skills automatically
  • Pull in new skills as needed
  • Community-contributed skills

System Access

Full System Access

  • Read and write files
  • Run shell commands
  • Execute scripts
  • Set up cron jobs
  • Automate workflows
  • Full access or sandboxed—your choice
  • Can improve itself by modifying its own configuration

Browser Control

  • Browse the web
  • Fill forms
  • Extract data from any site
  • Full automation capabilities
  • Dedicated Chrome/Chromium instance

Apps & Nodes

macOS App

  • Menu bar control plane
  • Voice Wake/PTT
  • Talk Mode overlay
  • WebChat + debug tools
  • Remote gateway control over SSH

iOS Node

  • Pairs as a node via the Bridge
  • Canvas surface
  • Voice Wake
  • Talk Mode
  • Camera and screen recording
  • Bonjour pairing

Android Node

  • Pairs via the same Bridge + pairing flow as iOS
  • Exposes Canvas, Camera, and Screen capture commands
  • Talk Mode support
  • Optional SMS support

macOS Node Mode

  • Exposes system.run and system.notify
  • Canvas/camera access
  • System-level integrations

Multi-Agent & Routing

  • Multi-agent routing: Route inbound channels/accounts/peers to isolated agents
  • Workspaces: Per-agent sessions and isolation
  • Session tools: sessions_list, sessions_history, sessions_send for agent-to-agent communication
  • Group routing: Mention gating, reply tags, per-channel chunking
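
The routing ideas above can be sketched in a few lines: match an inbound message against channel/account/peer rules, prefer the most specific rule, gate group replies on a mention, and split long replies into chunks. Everything here (`Route`, `pick_agent`, `should_reply`, `chunk`, and the default agent name `main`) is a hypothetical illustration under those assumptions, not the actual routing configuration or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical routing rule; a field left as None acts as a wildcard."""
    agent: str
    channel: Optional[str] = None
    account: Optional[str] = None
    peer: Optional[str] = None

    def matches(self, channel: str, account: str, peer: str) -> bool:
        pairs = ((self.channel, channel), (self.account, account), (self.peer, peer))
        return all(want is None or want == got for want, got in pairs)

    def specificity(self) -> int:
        # Number of non-wildcard fields; more specific rules win.
        return sum(v is not None for v in (self.channel, self.account, self.peer))


def pick_agent(routes: List[Route], channel: str, account: str, peer: str,
               default: str = "main") -> str:
    """Return the agent of the most specific matching route, else `default`."""
    candidates = [r for r in routes if r.matches(channel, account, peer)]
    if not candidates:
        return default
    return max(candidates, key=Route.specificity).agent


def should_reply(text: str, is_group: bool, mention: str) -> bool:
    """Mention gating: always reply in direct chats, in groups only when mentioned."""
    return (not is_group) or (mention.lower() in text.lower())


def chunk(text: str, limit: int) -> List[str]:
    """Split a reply into pieces of at most `limit` characters (a per-channel cap)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i:i + limit] for i in range(0, len(text), limit)]
```

For example, with rules for a `slack` channel and for the peer `mom` on `whatsapp`, a message from `mom` goes to the peer-specific agent, while any other WhatsApp message falls through to the catch-all route.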

Next Steps