OpenClaw: Deploy Your Personal AI Assistant on WhatsApp & Telegram
Build and deploy your own AI assistant that lives in WhatsApp and Telegram. Voice interaction, browser automation, cron jobs, custom skills, and full personality design through AGENTS.md, SOUL.md, and MEMORY.md.
By Jose Nobile | Updated 2026-04-20 | 18 min read
Table of Contents
- What Is OpenClaw?
- Architecture: Gateway, Nodes, Channels
- Installation & Configuration
- AGENTS.md / SOUL.md / MEMORY.md
- WhatsApp & Telegram Setup
- Voice Interaction
- Skills Development
- Cron Jobs & Automation
- Browser Automation
- Canvas UI
- Hooks System
- Environment Variables & .env
- Node Pairing & Multi-Device
- Latest Updates (2026)
- Real Example: Jose's Setup
What Is OpenClaw?
OpenClaw is an open-source framework for deploying personal AI assistants across messaging platforms. Unlike generic chatbots, OpenClaw creates a persistent, personality-driven AI that knows you, remembers context across conversations, and can execute real-world actions -- from browsing the web to managing your calendar.
It is built on a distributed architecture with a central gateway, multiple nodes (each running on different machines), and channels (WhatsApp, Telegram, Slack, etc.). This design means your AI assistant can run on your laptop, a VPS, or a Raspberry Pi, and you interact with it from any messaging app.
What makes OpenClaw unique is its personality system. Through AGENTS.md, SOUL.md, and MEMORY.md files, you define not just what your assistant can do, but who it is -- its communication style, values, knowledge areas, and behavioral boundaries.
Architecture: Gateway, Nodes, Channels
Gateway
The central routing layer that connects channels to nodes. The gateway handles message routing, authentication, session management, and load balancing. It can run as a lightweight process on any server.
Nodes
Execution environments where your AI agent runs. Each node can host different capabilities (browser automation, file system access, specific tools). Multiple nodes can serve a single agent for redundancy.
Channels
Messaging platform integrations. WhatsApp (via WhatsApp Business API or web bridge), Telegram Bot API, Slack, and more. Each channel adapter translates platform-specific messages into OpenClaw's unified format.
Voice I/O (Whisper)
Full audio pipeline: incoming voice messages are transcribed via a local or remote Whisper server, and responses can be synthesized back to audio. Use cases include hands-free interaction while driving, dictating notes, and accessibility for users who prefer speech over typing.
Canvas UI
An optional web-based canvas interface for rich interactions that go beyond text. The canvas renders interactive cards, data tables, charts, and forms directly in the browser, complementing the messaging channels with a visual workspace. Use cases include dashboard views, drag-and-drop task management, and visual data exploration.
Node Pairing & Multi-Device
Pair multiple nodes to a single gateway by exchanging a pairing token. Each device (laptop, VPS, Raspberry Pi) registers as a separate node with its own capabilities. The gateway routes tasks to the best available node. Use cases include running heavy browser tasks on a cloud node while keeping file access on a local machine.
The architecture supports multiple concurrent users, each with their own agent personality and memory. A single gateway can route messages to different nodes based on user identity, message type, or load conditions.
Nodes communicate with the gateway via WebSocket connections, enabling real-time message delivery and bidirectional streaming for voice interactions. The protocol is designed for unreliable networks -- messages are queued and retried if a node disconnects temporarily.
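The queue-and-retry behavior can be sketched as follows. This is an illustrative model only; the class and method names are assumptions, not OpenClaw's actual internals.

```javascript
// Hypothetical sketch of a node's outbound queue: messages sent while the
// WebSocket is down are buffered and flushed on reconnect.
class OutboundQueue {
  constructor() {
    this.pending = [];     // messages waiting for an open connection
    this.delivered = [];   // stand-in for "written to the socket"
    this.connected = false;
  }

  send(message) {
    if (this.connected) {
      this.delivered.push(message);
      return true;
    }
    this.pending.push(message); // queue while the node is offline
    return false;
  }

  // Called when the WebSocket reconnects: flush everything queued, in order.
  onReconnect() {
    this.connected = true;
    while (this.pending.length > 0) {
      this.delivered.push(this.pending.shift());
    }
  }
}

const q = new OutboundQueue();
q.send('task-result-1'); // node offline: queued, not delivered
q.send('task-result-2');
q.onReconnect();         // both messages flushed in order
```

The key property is ordering: queued messages are delivered first-in, first-out once the connection returns, so a temporary disconnect is invisible to the user.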
Installation & Configuration
OpenClaw runs on Node.js 24 (recommended) or 22.16+ and requires a few environment variables for API keys and channel credentials. The installation process is streamlined for developers familiar with npm:
cd openclaw
npm install
cp .env.example .env # Configure your API keys
npm start
Configuration is split into two areas: infrastructure (gateway ports, node addresses, channel credentials) in .env and the environment config files, and personality/behavior in the Markdown files (AGENTS.md, SOUL.md, MEMORY.md).
Required API Keys
Anthropic API key (for Claude), WhatsApp Business API or web bridge token, Telegram Bot token from @BotFather. Optional: local Whisper + Qwen3-TTS for voice, Browserbase or local Puppeteer for browser automation.
Node Configuration
Each node defines its capabilities, available tools, and resource limits. Nodes can be configured to handle specific types of requests (text, voice, browser tasks) or serve as general-purpose executors.
Multi-Machine Setup
For advanced deployments, run the gateway on a VPS with static IP, and nodes on local machines or cloud instances. The gateway handles NAT traversal and keeps connections alive across network changes.
AGENTS.md / SOUL.md / MEMORY.md
The personality system is what transforms a generic AI into your personal assistant. Three Markdown files define the complete behavioral profile:
AGENTS.md
Defines the agent's capabilities, available tools, skills registry, and operational boundaries. This is the "what can it do" file -- listing every skill, its trigger conditions, and execution parameters.
SOUL.md
Defines the agent's personality, communication style, values, and behavioral principles. This is the "who is it" file -- tone of voice, language preferences, humor style, empathy level, and response patterns.
MEMORY.md
Persistent memory that the agent reads and writes. Contains user preferences, past interactions, learned facts, and contextual notes. The agent updates this file as it learns about you over time.
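As a hedged illustration of how these files fit together (the headings and fields below are one possible structure, not a schema mandated by OpenClaw), an AGENTS.md skill entry and a SOUL.md excerpt might look like:

```markdown
<!-- AGENTS.md (excerpt) -->
## Skills
### weather-lookup
- Trigger: user asks about weather or forecasts
- Tool: web search with location + "forecast"
- Boundaries: never store location history without asking

<!-- SOUL.md (excerpt) -->
## Communication Style
- Warm, concise, lightly humorous in casual chats
- Precise and structured in technical discussions
- Ask before taking any irreversible action
```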
A well-designed SOUL.md transforms the user experience dramatically. Jose's assistant speaks in natural Argentine Spanish, uses vos conjugation, references shared context, and adjusts formality based on the topic -- technical discussions are precise, casual conversations are warm and humorous.
WhatsApp & Telegram Setup
Channel setup involves configuring the messaging platform API and connecting it to your OpenClaw gateway. Each platform has different requirements and capabilities:
WhatsApp
Two options: WhatsApp Business API (official, requires Meta Business verification, supports rich messages) or a web bridge (unofficial, simpler setup, works with a personal account). The web bridge works by connecting to WhatsApp Web's protocol.

Telegram
Create a bot via @BotFather, get the token, and configure it in OpenClaw. Telegram supports inline keyboards, file uploads, voice messages, and group chat -- all natively handled by the Telegram channel adapter.
Both channels support text, voice messages (transcribed via Whisper or similar), images (analyzed via vision models), and file attachments. The channel adapter normalizes all message types into a unified format before passing them to the agent.
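A channel adapter's normalization step can be sketched like this. The unified message shape below is an assumption for illustration; the Telegram fields (`message.from.id`, `message.text`, `message.voice.file_id`) come from the Telegram Bot API.

```javascript
// Illustrative normalizer from a Telegram Bot API update into a unified
// message shape (the unified schema here is assumed, not OpenClaw's actual one).
function normalizeTelegram(update) {
  const msg = update.message;
  return {
    channel: 'telegram',
    userId: String(msg.from.id),
    type: msg.voice ? 'voice' : msg.photo ? 'image' : 'text',
    text: msg.text || null,
    attachments: msg.voice
      ? [{ kind: 'voice', fileId: msg.voice.file_id }]
      : [],
  };
}

const unified = normalizeTelegram({
  message: { from: { id: 42 }, text: 'hello' },
});
// unified.channel === 'telegram', unified.type === 'text'
```

A WhatsApp adapter would do the same mapping from its own payload format, so the agent only ever sees one message shape.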
For WhatsApp, voice messages are particularly powerful -- you can speak naturally to your AI assistant while driving or walking, and it responds with synthesized voice or text based on your preference.
Voice Interaction
OpenClaw supports full voice interaction: speech-to-text for incoming voice messages and text-to-speech for responses. The voice pipeline integrates with multiple providers:
- Speech-to-Text -- OpenAI Whisper (local or API), Google Speech-to-Text, or Deepgram for real-time transcription
- Text-to-Speech -- Qwen3-TTS (local, voice profiles), LuxTTS (voice cloning, 150x realtime), OpenAI TTS, or Google TTS for response synthesis
- Voice mode -- Configure the agent to always respond with voice, only when receiving voice, or based on message length and context
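The voice-mode policy in the last bullet reduces to a small decision function. The mode names (`always`, `mirror`, `auto`) and the length threshold are illustrative, not OpenClaw's actual configuration values.

```javascript
// Sketch of a voice-mode policy: decide whether a reply is synthesized
// to audio or sent as text. Option names are assumptions.
function shouldReplyWithVoice(mode, incomingWasVoice, replyLength) {
  switch (mode) {
    case 'always': return true;                  // every reply is synthesized
    case 'mirror': return incomingWasVoice;      // voice in, voice out
    case 'auto':   return incomingWasVoice && replyLength < 500; // short replies only
    default:       return false;                 // text-only fallback
  }
}

shouldReplyWithVoice('mirror', true, 1200); // voice reply, regardless of length
shouldReplyWithVoice('auto', true, 1200);   // text reply: too long to listen to
```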
Voice interactions feel natural because the agent maintains conversational context. You can start a voice thread about a topic, switch to text, and the agent seamlessly continues the conversation across modalities.
Skills Development
Skills are the building blocks of your agent's capabilities. Each skill is a self-contained module that the agent can invoke when needed. Skills can range from simple (weather lookup) to complex (multi-step browser automation with data extraction).
Web Search
Search the web using DuckDuckGo, Google, or Brave APIs. The agent summarizes results and can follow links to extract detailed information from specific pages.
Calendar Management
Integrate with Google Calendar to check availability, create events, send invitations, and provide daily briefings. The agent can negotiate meeting times across time zones.
File Operations
Read, write, and manage files on the node's filesystem. Generate reports, process CSVs, create documents, and send them back through the messaging channel.
System Monitoring
Check server health, disk usage, running processes, and Docker containers. Receive proactive alerts when resources exceed thresholds and get troubleshooting suggestions.
Creating a custom skill involves defining a tool schema (name, description, parameters) and an execution handler. Skills are registered in AGENTS.md and can declare dependencies on other skills or external APIs.
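A minimal sketch of that schema-plus-handler pattern is below. The object shape follows common tool-calling conventions (JSON-Schema-style parameters); the exact registration API depends on your OpenClaw version, and the handler here is a stub rather than a real weather lookup.

```javascript
// Hypothetical custom skill: a tool schema plus an execution handler.
const weatherSkill = {
  name: 'weather_lookup',
  description: 'Get the current weather for a city',
  parameters: {
    type: 'object',
    properties: {
      city: { type: 'string', description: 'City name, e.g. "Bogota"' },
    },
    required: ['city'],
  },
  // Receives validated arguments and returns a result string.
  // Real handlers are typically async and call an external API.
  handler({ city }) {
    return `Weather for ${city}: (stubbed result)`;
  },
};

weatherSkill.handler({ city: 'Lima' }); // 'Weather for Lima: (stubbed result)'
```

Once registered in AGENTS.md, the agent decides when to invoke the skill based on its `description` and the conversation context.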
Cron Jobs & Automation
OpenClaw's cron system lets your agent perform tasks on a schedule without any user interaction. Cron jobs are defined in the configuration and can trigger any skill or sequence of skills:
- Daily briefings -- Morning summary of weather, calendar, news, and pending tasks sent to your WhatsApp
- Health reminders -- Supplement reminders, water intake alerts, workout prompts at configured times
- Monitoring checks -- Hourly server health reports, SSL certificate expiry warnings, uptime monitoring
- Data collection -- Scrape prices, track deliveries, monitor stock levels, and report changes
Cron jobs use standard cron syntax and can be conditional -- for example, only send the morning briefing on weekdays, or only alert about server issues if CPU exceeds 80%.
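Both examples in the previous sentence can be expressed as job definitions like these. The field names (`schedule`, `skill`, `condition`) are illustrative; the cron expressions themselves are standard crontab syntax.

```javascript
// Illustrative conditional cron jobs: a schedule plus a predicate that can
// suppress the run. Field names are assumptions, not OpenClaw's exact config.
const morningBriefing = {
  schedule: '0 7 * * 1-5',             // 07:00, Monday through Friday only
  skill: 'daily_briefing',
  condition: (ctx) => !ctx.onVacation, // skip while away
};

const cpuAlert = {
  schedule: '*/5 * * * *',             // every 5 minutes
  skill: 'server_alert',
  condition: (ctx) => ctx.cpuPercent > 80, // only alert past the threshold
};

cpuAlert.condition({ cpuPercent: 91 }); // alert fires
cpuAlert.condition({ cpuPercent: 35 }); // run suppressed
```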
Browser Automation
Browser automation gives your agent the ability to interact with any website as a user would. Built on Puppeteer, the browser skill can navigate pages, fill forms, click buttons, extract data, and take screenshots.
Common use cases include:
- Filling out repetitive forms (government portals, registrations)
- Extracting data from websites that lack APIs
- Monitoring price changes on e-commerce sites
- Taking screenshots for visual verification or reporting
- Automating multi-step web workflows (booking, ordering, submissions)
The agent can combine browser automation with its language understanding -- you describe what you need in natural language, and it figures out which buttons to click, which forms to fill, and how to navigate complex multi-page workflows.
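The extraction half of a price-monitoring workflow often reduces to pulling structured values out of fetched markup. Below is a dependency-free sketch of that step; a real deployment would instead use Puppeteer's `page.$eval` with CSS selectors against the live DOM, and the `class="price"` markup is a hypothetical example.

```javascript
// Pull a numeric price out of an HTML fragment. Brittle by design (regex on
// markup) -- shown only to illustrate the extraction step of the workflow.
function extractPrice(html) {
  // Matches e.g. <span class="price">$1,299.99</span>
  const match = html.match(/class="price"[^>]*>\s*\$?([\d,]+\.?\d*)/);
  return match ? Number(match[1].replace(/,/g, '')) : null;
}

extractPrice('<span class="price">$1,299.99</span>'); // 1299.99
extractPrice('<div>no price here</div>');             // null
```

A cron job can then compare the extracted value against the last stored price and message you only when it changes.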
Canvas UI
The Canvas UI is an optional web interface that provides rich, visual interactions beyond what messaging apps can display. It connects to the same gateway and shares context with your messaging channels, so you can start a conversation on WhatsApp and continue exploring results on the canvas.
Key capabilities include:
- Interactive data tables and filterable lists for structured information
- Chart rendering for time-series data, health metrics, or financial tracking
- Form-based workflows for multi-step data entry that is awkward via chat
- Code editor panels with syntax highlighting for reviewing and editing files
- Screenshot previews and visual diff views from browser automation results
The canvas is served by the gateway on a configurable port and is protected by the same authentication layer. It is fully optional -- your agent works perfectly fine through messaging channels alone.
Hooks System
OpenClaw's hooks system provides event-driven extensibility. Hooks let you run custom logic before or after key events in the agent lifecycle, without modifying the core codebase. They are defined in the node configuration and can be written in JavaScript or as shell scripts.
Available hook events include:
- onMessageReceived -- Triggered before the agent processes a message. Use it for logging, content filtering, or routing overrides
- onResponseReady -- Triggered after the agent generates a response, before it is sent. Use it for output sanitization, translation, or analytics
- onSkillExecuted -- Triggered after a skill completes. Use it for audit trails, cost tracking, or chaining follow-up actions
- onNodeConnect / onNodeDisconnect -- Triggered when a node joins or leaves the gateway. Use it for alerting, failover logic, or resource rebalancing
- onCronTick -- Triggered on each cron evaluation cycle. Use it for dynamic scheduling or conditional job suppression
Hooks receive a context object with the event payload and can modify it in place (for pre-hooks) or trigger side effects (for post-hooks). They run synchronously in the order they are declared.
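An `onResponseReady` hook for output sanitization might look like the sketch below. The context shape (`ctx.response`, `ctx.metrics`) is an assumption, and the key pattern is illustrative.

```javascript
// Hypothetical pre-send hook: scrub anything resembling an API key from the
// outgoing response, then record a metric as a side effect.
function onResponseReady(ctx) {
  ctx.response = ctx.response.replace(/sk-[A-Za-z0-9-]{8,}/g, '[redacted]');
  ctx.metrics = { responseChars: ctx.response.length };
  return ctx;
}

const ctx = onResponseReady({ response: 'Your key is sk-ant-abc12345xyz' });
// ctx.response -> 'Your key is [redacted]'
```

Because pre-hooks mutate the context in place, the sanitized text is what the channel adapter ultimately sends.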
Environment Variables & .env
All infrastructure configuration in OpenClaw is managed through environment variables, typically loaded from a .env file at the project root. This keeps secrets out of version control and makes deployments portable across machines.
Key environment variables include:
ANTHROPIC_API_KEY=sk-ant-... # Claude API key
GATEWAY_PORT=3100 # Gateway HTTP/WS port
NODE_ID=msi-laptop-c381e7e7 # Unique node identifier
# Channels
WHATSAPP_TOKEN=... # WhatsApp web bridge token
TELEGRAM_BOT_TOKEN=... # Telegram bot token from @BotFather
# Voice
WHISPER_SERVER_URL=http://localhost:9000 # Local Whisper endpoint
TTS_PROVIDER=qwen3-tts # TTS provider (qwen3-tts, luxtts, openai, google)
TTS_SERVER_URL=http://localhost:8100 # Local TTS endpoint
# Browser
BROWSER_EXEC_PATH=/usr/bin/chromium # Headless Chrome path
# Integrations
ATLASSIAN_TOKEN=... # Jira/Confluence API token
SQLITE_MEMORY_PATH=./memory.db # SQLite memory database path
The .env.example file in the repository documents every variable with defaults and descriptions. Variables can also be set directly in the shell environment or through your deployment platform's secrets manager.
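For intuition, the `KEY=value` format above is simple enough to parse by hand. The sketch below illustrates it (real deployments typically load `.env` with the `dotenv` package, which also handles quoting and multiline values that this toy parser does not).

```javascript
// Minimal .env parsing sketch: skips blanks and comment lines, splits on the
// first '=', and strips trailing inline comments. Values containing '#'
// would break this simplification -- it is for illustration only.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // blank or comment
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    const value = trimmed.slice(eq + 1).split('#')[0].trim();
    vars[trimmed.slice(0, eq).trim()] = value;
  }
  return vars;
}

const env = parseEnv(
  'GATEWAY_PORT=3100 # Gateway HTTP/WS port\n# Channels\nTELEGRAM_BOT_TOKEN=abc'
);
// env.GATEWAY_PORT -> '3100'
```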
Node Pairing & Multi-Device
OpenClaw supports running multiple nodes connected to a single gateway. Each node is a separate machine (or process) that registers with the gateway using a pairing token. This lets you distribute workloads and capabilities across devices.
To pair a new node:
# On the gateway machine, generate a pairing token
openclaw gateway pair --generate
# Output: TOKEN=abc123...
# On the new node, register with the gateway
openclaw node pair --token abc123... --gateway wss://your-gateway:3100
# The node is now connected and ready to receive tasks
Each node declares its capabilities (e.g., browser automation, file system, audio processing) during registration. The gateway uses this information to route tasks intelligently -- a browser task goes to the node with Chromium installed, a file operation goes to the node with filesystem access.
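That routing decision can be modeled as a simple capability lookup. The field names below (`connected`, `capabilities`, `requires`) are illustrative; the real gateway presumably also weighs load and affinity.

```javascript
// Capability-based routing sketch: pick the first connected node that
// declares the capability a task needs, or null if none qualifies.
function routeTask(task, nodes) {
  return (
    nodes.find((n) => n.connected && n.capabilities.includes(task.requires)) ||
    null
  );
}

const nodes = [
  { id: 'laptop', connected: true, capabilities: ['filesystem', 'audio'] },
  { id: 'vps', connected: true, capabilities: ['browser'] },
];

routeTask({ requires: 'browser' }, nodes);    // routed to the vps node
routeTask({ requires: 'filesystem' }, nodes); // routed to the laptop node
```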
Multi-device setups are useful for separating concerns: run your personal node on your laptop for local file access, and a cloud node for always-on availability and heavy compute tasks. Both nodes share the same agent identity and memory.
Latest Updates (2026)
Security Hardening
Bootstrap tokens for secure node registration, timing-safe HMAC verification to prevent side-channel attacks, and disabled implicit plugin auto-load to eliminate untrusted code execution risks.
Docker Improvements
OPENCLAW_TZ environment variable for timezone support in containers, improved Windows compatibility for Docker Desktop, and compiled plugin support for faster cold starts.
Chrome DevTools MCP
Official attach mode for live Chrome browser sessions via the Model Context Protocol. Inspect, debug, and automate running browser tabs directly from your AI agent.
ContextEngine Plugins
Pluggable context management with lifecycle hooks. Custom ContextEngine plugins can intercept, transform, and enrich conversation context at every stage of the agent pipeline.
New AI Providers
Vertex AI, Mistral (embeddings + voice), MiniMax, and Tavily now bundled as first-class providers. Switch models and capabilities without changing your agent configuration.
Mobile UI Redesign
Android grouped settings for streamlined configuration, and a new iOS welcome pager for guided onboarding. Both platforms feature improved navigation and accessibility.
PDF & Attachments
First-class PDF support with parsing, summarization, and inline rendering. Inline attachments in messages, and a broader SecretRef API for secure credential injection into skills.
Performance
Model catalog caching for instant provider switching, lazy-loading of skills and plugins on first invocation, and dashboard optimizations reducing initial load time by up to 60%.
GPT-5.4 & Fast Mode
Full OpenAI GPT-5.4 support with a 1.05M-token context window and a 75% score on the OSWorld benchmark. Session fast-mode for GPT-5.4 and Claude reduces latency on time-sensitive operations.
Dashboard v2 Enhancements
A modular control UI with overview, chat, config, agent, and session views. Adds a command palette, search, export, and pinned messages, and cuts setup time by 40%.
Telegram Topic-Level Routing
Topic-level agent isolation in Telegram groups lets you run a different agent per topic. Persistent bindings auto-restore on Discord and Telegram after restarts.
Extended AI Provider Stack
xAI Grok integration, plus Ollama, vLLM, and SGLang on the plugin architecture. Provider-owned discovery enables seamless ecosystem expansion.
Dreaming: AI Memory via REM Backfilling
New "Dreaming" feature allows AI agents to process historical data through REM backfilling, replaying user notes to form persistent memories. Includes journal timeline UI, enhanced SSRF and node execution injection protections, role atmosphere QA evaluation, and optimized Android pairing.
Inference, Media, and TaskFlows
New inference capabilities, music and video editing tools, conversation branching and restoration, Webhook-based TaskFlows, bundled Codex support, optional Active Memory plugin, local MLX speech for Talk Mode, and richer Teams/WhatsApp handling.
Claude Opus 4.7, Gemini TTS & Active Memory
Claude Opus 4.7 set as default model with bundled image understanding. Gemini text-to-speech in the bundled Google plugin with voice selection and WAV/PCM output. Active Memory sub-agent for pre-reply context enrichment with /verbose inspection. GPT-5.4-pro forward compatibility. Model Auth status card showing OAuth health and rate-limit pressure. 346K+ GitHub stars.
Ecosystem & Agent Framework Integration
OpenClaw's skill ecosystem now counts 44K+ community skills on ClawHub, with 65%+ wrapping MCP servers. Standardized channel setup/secret contract for all bundled channels. For multi-agent orchestration beyond OpenClaw, see our guides on A2A Protocol, Google ADK, OpenAI Agents SDK, LangGraph, and n8n.
Supported Channels (20+)
OpenClaw now supports 20+ messaging channels, making it one of the most versatile AI agent frameworks for multi-platform deployment. Beyond the original WhatsApp and Telegram, the channel adapter system now includes: Slack, Discord, Google Chat, Signal, iMessage, BlueBubbles, IRC, Microsoft Teams, Matrix, LINE, Mattermost, Nextcloud Talk, Nostr, Twitch, and WebChat. Each channel adapter normalizes platform-specific features (reactions, threads, file types, rich messages) into OpenClaw's unified message format.
Docker Deployment
OpenClaw supports multi-stage Docker builds with minimal runtime images. The build stage compiles TypeScript and installs dependencies, while the runtime stage uses a slim Node.js base image with only production dependencies. This reduces the final image size significantly, speeds up container startup, and minimizes the attack surface for production deployments. Docker Compose configurations are provided for running the gateway, node agent, and voice services as a coordinated stack.
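A multi-stage Dockerfile in the shape described above might look like the following. This is an illustrative sketch, not OpenClaw's actual Dockerfile, and assumes a `build` script that compiles TypeScript to `dist/`.

```dockerfile
# Build stage: full toolchain, compiles TypeScript
FROM node:24 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build            # emit compiled output to dist/

# Runtime stage: slim base, production dependencies only
FROM node:24-slim AS runtime
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
```

The runtime image never contains the TypeScript compiler or dev dependencies, which is what shrinks the image and the attack surface.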
Multi-Agent Broadcast
OpenClaw supports configurable group broadcast dispatch with observer-session isolation and cross-account deduplication. When an agent is added to a group chat (Telegram, Discord, Slack), the broadcast system ensures that each message is processed exactly once, even when multiple agent instances observe the same group. Observer sessions are isolated -- each agent maintains its own context and memory for the group conversation without interfering with other agents.
Cross-account deduplication prevents duplicate responses when the same user interacts with the agent from different accounts or platforms. The broadcast dispatcher routes messages based on configurable rules: round-robin, capability-based, or affinity-based routing. This enables multi-agent setups where different agents specialize in different topics within the same group chat.
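Exactly-once processing across observers can be sketched with a shared claim set. The `(channel, messageId)` key format is an assumption; a production dispatcher would also need persistence and expiry.

```javascript
// Deduplication sketch: the first observer to claim a (channel, messageId)
// pair processes it; every later claim for the same pair is suppressed.
class Deduplicator {
  constructor() {
    this.seen = new Set();
  }

  claim(channel, messageId) {
    const key = `${channel}:${messageId}`;
    if (this.seen.has(key)) return false; // duplicate: suppress
    this.seen.add(key);
    return true; // first observer: process the message
  }
}

const dedup = new Deduplicator();
dedup.claim('telegram', 'msg-1'); // first observer processes it
dedup.claim('telegram', 'msg-1'); // duplicate suppressed
```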
Real Example: Jose's Setup
Jose runs OpenClaw 2026.4.14 on an MSI Raider 18 HX as a production AI assistant accessible via WhatsApp and Telegram. This is not a demo -- it is a fully autonomous system with 8 always-on services, 4 automated daemons, and 6 local AI models running on an RTX 5090. Here are the exact details:
MSI Raider 18 HX ($6,458)
RTX 5090 (24 GB GDDR7, 1824 AI TOPS), Intel Core Ultra 9 285HX, 64 GB DDR5, 6 TB NVMe. This machine runs 8 always-on services simultaneously.
8 systemd Services
Key services: openclaw-containers.service (Docker gateway on port 18789), openclaw-node.service (node agent "MSI-Laptop"), openclaw-fix-permissions.service (permission watcher), whisper-server.service (FastAPI Whisper on port 8787), voicebox.service (Qwen3-TTS on port 8100), and luxtts.service (LuxTTS on port 8101). Auto-update runs daily at 10 AM COT.
AI Models & Config
Primary: anthropic/claude-opus-4-6, fallback: openai/gpt-5.4. Channels: WhatsApp + Telegram. Memory: SQLite-backed. Browser: Chrome DevTools Protocol with CDP profiles. Agent timeout: 1800s (30 min). Sandbox: off (full system access).
Local Voice Pipeline (RTX 5090)
All voice processing runs locally on the RTX 5090 -- zero cloud API calls. STT: Whisper large-v3 via faster-whisper (CUDA float16). TTS: Qwen3-TTS 1.7B with Jose's voice profile. Voice cloning: LuxTTS at 150x realtime, 48 kHz output. Custom skills: voicebox-tts and luxtts-voice-clone.
6 Local Models on RTX 5090
Whisper large-v3 (audio transcription), Qwen3-TTS 1.7B (text-to-speech), LuxTTS (voice cloning, 150x realtime), Qwen2.5-VL-7B-AWQ (vision-language for hCaptcha solving), Wan 2.1 I2V 14B (video generation), and Nemotron-nano-8b (LoRA fine-tuning with Unsloth). All running via CUDA on 24 GB GDDR7.
4 Automated Daemons
Xiaomi scale tracker (354 measurements over 13 months of automated collection), Davivienda banking daemon (hCaptcha solver using local Qwen2.5-VL-7B via vLLM), currency exchange rate updater, and auto-deploy pipeline. All run autonomously on cron schedules via OpenClaw.
165-File Agent Workspace
The agent workspace contains 165 files including custom AGENTS.md, SOUL.md, IDENTITY.md, USER.md, TOOLS.md, MEMORY.md, and HEARTBEAT.md. The SOUL.md defines a bilingual (Spanish/English) assistant with Argentine communication style using vos conjugation and deep contextual awareness of Jose's projects.
Overnight 8-Agent Run
8 concurrent Claude Code instances ran overnight while OpenClaw monitored progress via cron every 15 min. The task: tax-exempt billing verification across Peru, Chile, and Colombia. Result: 3 bugs fixed, 189 test suites (1,740 tests), 11 commits across 7 microservice repos.
Monthly Investment: ~$526
AI subscriptions: ~$257/month (Claude Max $200, ChatGPT Pro $20, Gemini Pro $20, Perplexity Pro $204/year, about $17/month). Hardware amortized: ~$269/month ($6,458 over a 24-month GPU upgrade cycle). ROI: 8 concurrent agents working overnight is roughly equivalent to 8 junior devs for a night.
This production setup runs ~$526/month total (AI subscriptions + hardware amortization) with all voice and vision processing handled locally on the RTX 5090 -- zero cloud TTS/STT fees. The result is a fully autonomous AI assistant that runs 24/7, orchestrates overnight multi-agent coding sessions, and automates banking, health tracking, and deployments.