OpenClaw: Deploy Your Personal AI Assistant on WhatsApp & Telegram
Build and deploy your own AI assistant that lives in WhatsApp and Telegram. Voice interaction, browser automation, cron jobs, custom skills, and full personality design through AGENTS.md, SOUL.md, and MEMORY.md.
By Jose Nobile | Updated 2026-04-20 | 18 min read
Table of Contents
- What Is OpenClaw?
- Architecture: Gateway, Nodes, Channels
- Installation & Configuration
- AGENTS.md / SOUL.md / MEMORY.md
- WhatsApp & Telegram Setup
- Voice Interaction
- Skills Development
- Cron Jobs & Automation
- Browser Automation
- Canvas UI
- Hooks System
- Environment Variables & .env
- Node Pairing & Multi-Device
- Latest Updates (2026)
- Real Example: Jose's Setup
What Is OpenClaw?
OpenClaw is an open-source framework for deploying personal AI assistants across messaging platforms. Unlike generic chatbots, OpenClaw creates a persistent, personality-driven AI that knows you, remembers context across conversations, and can execute real-world actions -- from browsing the web to managing your calendar.
It is built on a distributed architecture with a central gateway, multiple nodes (each running on different machines), and channels (WhatsApp, Telegram, Slack, etc.). This design means your AI assistant can run on your laptop, a VPS, or a Raspberry Pi, and you interact with it from any messaging app.
What makes OpenClaw unique is its personality system. Through AGENTS.md, SOUL.md, and MEMORY.md files, you define not just what your assistant can do, but who it is -- its communication style, values, knowledge areas, and behavioral boundaries.
Architecture: Gateway, Nodes, Channels
Gateway
The central routing layer that connects channels to nodes. The gateway handles message routing, authentication, session management, and load balancing. It can run as a lightweight process on any server.
Nodes
Execution environments where your AI agent runs. Each node can host different capabilities (browser automation, file system access, specific tools). Multiple nodes can serve a single agent for redundancy.
Channels
Messaging platform integrations. WhatsApp (via WhatsApp Business API or web bridge), Telegram Bot API, Slack, and more. Each channel adapter translates platform-specific messages into OpenClaw's unified format.
Voice I/O (Whisper)
Full audio pipeline: incoming voice messages are transcribed via a local or remote Whisper server, and responses can be synthesized back to audio. Use cases include hands-free interaction while driving, dictating notes, and accessibility for users who prefer speech over typing.
Canvas UI
An optional web-based canvas interface for rich interactions that go beyond text. The canvas renders interactive cards, data tables, charts, and forms directly in the browser, complementing the messaging channels with a visual workspace. Use cases include dashboard views, drag-and-drop task management, and visual data exploration.
Node Pairing & Multi-Device
Pair multiple nodes to a single gateway by exchanging a pairing token. Each device (laptop, VPS, Raspberry Pi) registers as a separate node with its own capabilities. The gateway routes tasks to the best available node. Use cases include running heavy browser tasks on a cloud node while keeping file access on a local machine.
The architecture supports multiple concurrent users, each with their own agent personality and memory. A single gateway can route messages to different nodes based on user identity, message type, or load conditions.
Nodes communicate with the gateway via WebSocket connections, enabling real-time message delivery and bidirectional streaming for voice interactions. The protocol is designed for unreliable networks -- messages are queued and retried if a node disconnects temporarily.
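The queue-and-retry behavior can be sketched as follows. This is an illustrative model only; the class and method names are assumptions, not OpenClaw's actual internals.

```javascript
// Hypothetical sketch of a node's outbound queue: messages sent while the
// WebSocket is down are buffered and flushed on reconnect.
class OutboundQueue {
  constructor() {
    this.pending = [];     // messages waiting for an open connection
    this.delivered = [];   // stand-in for "written to the socket"
    this.connected = false;
  }

  send(message) {
    if (this.connected) {
      this.delivered.push(message);
      return true;
    }
    this.pending.push(message); // queue while the node is offline
    return false;
  }

  // Called when the WebSocket reconnects: flush everything queued, in order.
  onReconnect() {
    this.connected = true;
    while (this.pending.length > 0) {
      this.delivered.push(this.pending.shift());
    }
  }
}

const q = new OutboundQueue();
q.send('task-result-1'); // node offline: queued, not delivered
q.send('task-result-2');
q.onReconnect();         // both messages flushed in order
```

The key property is ordering: queued messages are delivered first-in, first-out once the connection returns, so a temporary disconnect is invisible to the user.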
Installation & Configuration
OpenClaw runs on Node.js 24 (recommended) or 22.16+ and requires a few environment variables for API keys and channel credentials. The installation process is streamlined for developers familiar with npm:
cd openclaw
npm install
cp .env.example .env # Configure your API keys
npm start
Configuration is split into two areas: infrastructure (gateway ports, node addresses, channel credentials) in .env and the environment config files, and personality/behavior in the Markdown files (AGENTS.md, SOUL.md, MEMORY.md).
Required API Keys
Anthropic API key (for Claude), WhatsApp Business API or web bridge token, Telegram Bot token from @BotFather. Optional: local Whisper + Qwen3-TTS for voice, Browserbase or local Puppeteer for browser automation.
Node Configuration
Each node defines its capabilities, available tools, and resource limits. Nodes can be configured to handle specific types of requests (text, voice, browser tasks) or serve as general-purpose executors.
Multi-Machine Setup
For advanced deployments, run the gateway on a VPS with static IP, and nodes on local machines or cloud instances. The gateway handles NAT traversal and keeps connections alive across network changes.
AGENTS.md / SOUL.md / MEMORY.md
The personality system is what transforms a generic AI into your personal assistant. Three Markdown files define the complete behavioral profile:
AGENTS.md
Defines the agent's capabilities, available tools, skills registry, and operational boundaries. This is the "what can it do" file -- listing every skill, its trigger conditions, and execution parameters.
SOUL.md
Defines the agent's personality, communication style, values, and behavioral principles. This is the "who is it" file -- tone of voice, language preferences, humor style, empathy level, and response patterns.
MEMORY.md
Persistent memory that the agent reads and writes. Contains user preferences, past interactions, learned facts, and contextual notes. The agent updates this file as it learns about you over time.
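As a hedged illustration of how these files fit together (the headings and fields below are one possible structure, not a schema mandated by OpenClaw), an AGENTS.md skill entry and a SOUL.md excerpt might look like:

```markdown
<!-- AGENTS.md (excerpt) -->
## Skills
### weather-lookup
- Trigger: user asks about weather or forecasts
- Tool: web search with location + "forecast"
- Boundaries: never store location history without asking

<!-- SOUL.md (excerpt) -->
## Communication Style
- Warm, concise, lightly humorous in casual chats
- Precise and structured in technical discussions
- Ask before taking any irreversible action
```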
A well-designed SOUL.md transforms the user experience dramatically. Jose's assistant speaks in natural Argentine Spanish, uses vos conjugation, references shared context, and adjusts formality based on the topic -- technical discussions are precise, casual conversations are warm and humorous.
WhatsApp & Telegram Setup
Channel setup involves configuring the messaging platform API and connecting it to your OpenClaw gateway. Each platform has different requirements and capabilities:
WhatsApp
Two options: WhatsApp Business API (official, requires Meta Business verification, supports rich messages) or a web bridge (unofficial, simpler setup, works with a personal account). The web bridge works by connecting to WhatsApp Web's protocol.

Telegram
Create a bot via @BotFather, get the token, and configure it in OpenClaw. Telegram supports inline keyboards, file uploads, voice messages, and group chat -- all natively handled by the Telegram channel adapter.
Both channels support text, voice messages (transcribed via Whisper or similar), images (analyzed via vision models), and file attachments. The channel adapter normalizes all message types into a unified format before passing them to the agent.
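A channel adapter's normalization step can be sketched like this. The unified message shape below is an assumption for illustration; the Telegram fields (`message.from.id`, `message.text`, `message.voice.file_id`) come from the Telegram Bot API.

```javascript
// Illustrative normalizer from a Telegram Bot API update into a unified
// message shape (the unified schema here is assumed, not OpenClaw's actual one).
function normalizeTelegram(update) {
  const msg = update.message;
  return {
    channel: 'telegram',
    userId: String(msg.from.id),
    type: msg.voice ? 'voice' : msg.photo ? 'image' : 'text',
    text: msg.text || null,
    attachments: msg.voice
      ? [{ kind: 'voice', fileId: msg.voice.file_id }]
      : [],
  };
}

const unified = normalizeTelegram({
  message: { from: { id: 42 }, text: 'hello' },
});
// unified.channel === 'telegram', unified.type === 'text'
```

A WhatsApp adapter would do the same mapping from its own payload format, so the agent only ever sees one message shape.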
For WhatsApp, voice messages are particularly powerful -- you can speak naturally to your AI assistant while driving or walking, and it responds with synthesized voice or text based on your preference.
Voice Interaction
OpenClaw supports full voice interaction: speech-to-text for incoming voice messages and text-to-speech for responses. The voice pipeline integrates with multiple providers:
- Speech-to-Text -- OpenAI Whisper (local or API), Google Speech-to-Text, or Deepgram for real-time transcription
- Text-to-Speech -- Qwen3-TTS (local, voice profiles), LuxTTS (voice cloning, 150x realtime), OpenAI TTS, or Google TTS for response synthesis
- Voice mode -- Configure the agent to always respond with voice, only when receiving voice, or based on message length and context
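The voice-mode policy in the last bullet reduces to a small decision function. The mode names (`always`, `mirror`, `auto`) and the length threshold are illustrative, not OpenClaw's actual configuration values.

```javascript
// Sketch of a voice-mode policy: decide whether a reply is synthesized
// to audio or sent as text. Option names are assumptions.
function shouldReplyWithVoice(mode, incomingWasVoice, replyLength) {
  switch (mode) {
    case 'always': return true;                  // every reply is synthesized
    case 'mirror': return incomingWasVoice;      // voice in, voice out
    case 'auto':   return incomingWasVoice && replyLength < 500; // short replies only
    default:       return false;                 // text-only fallback
  }
}

shouldReplyWithVoice('mirror', true, 1200); // voice reply, regardless of length
shouldReplyWithVoice('auto', true, 1200);   // text reply: too long to listen to
```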
Voice interactions feel natural because the agent maintains conversational context. You can start a voice thread about a topic, switch to text, and the agent seamlessly continues the conversation across modalities.
Skills Development
Skills are the building blocks of your agent's capabilities. Each skill is a self-contained module that the agent can invoke when needed. Skills can range from simple (weather lookup) to complex (multi-step browser automation with data extraction).
Web Search
Search the web using DuckDuckGo, Google, or Brave APIs. The agent summarizes results and can follow links to extract detailed information from specific pages.
Calendar Management
Integrate with Google Calendar to check availability, create events, send invitations, and provide daily briefings. The agent can negotiate meeting times across time zones.
File Operations
Read, write, and manage files on the node's filesystem. Generate reports, process CSVs, create documents, and send them back through the messaging channel.
System Monitoring
Check server health, disk usage, running processes, and Docker containers. Receive proactive alerts when resources exceed thresholds and get troubleshooting suggestions.
Creating a custom skill involves defining a tool schema (name, description, parameters) and an execution handler. Skills are registered in AGENTS.md and can declare dependencies on other skills or external APIs.
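A minimal sketch of that schema-plus-handler pattern is below. The object shape follows common tool-calling conventions (JSON-Schema-style parameters); the exact registration API depends on your OpenClaw version, and the handler here is a stub rather than a real weather lookup.

```javascript
// Hypothetical custom skill: a tool schema plus an execution handler.
const weatherSkill = {
  name: 'weather_lookup',
  description: 'Get the current weather for a city',
  parameters: {
    type: 'object',
    properties: {
      city: { type: 'string', description: 'City name, e.g. "Bogota"' },
    },
    required: ['city'],
  },
  // Receives validated arguments and returns a result string.
  // Real handlers are typically async and call an external API.
  handler({ city }) {
    return `Weather for ${city}: (stubbed result)`;
  },
};

weatherSkill.handler({ city: 'Lima' }); // 'Weather for Lima: (stubbed result)'
```

Once registered in AGENTS.md, the agent decides when to invoke the skill based on its `description` and the conversation context.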
Cron Jobs & Automation
OpenClaw's cron system lets your agent perform tasks on a schedule without any user interaction. Cron jobs are defined in the configuration and can trigger any skill or sequence of skills:
- Daily briefings -- Morning summary of weather, calendar, news, and pending tasks sent to your WhatsApp
- Health reminders -- Supplement reminders, water intake alerts, workout prompts at configured times
- Monitoring checks -- Hourly server health reports, SSL certificate expiry warnings, uptime monitoring
- Data collection -- Scrape prices, track deliveries, monitor stock levels, and report changes
Cron jobs use standard cron syntax and can be conditional -- for example, only send the morning briefing on weekdays, or only alert about server issues if CPU exceeds 80%.
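Both examples in the previous sentence can be expressed as job definitions like these. The field names (`schedule`, `skill`, `condition`) are illustrative; the cron expressions themselves are standard crontab syntax.

```javascript
// Illustrative conditional cron jobs: a schedule plus a predicate that can
// suppress the run. Field names are assumptions, not OpenClaw's exact config.
const morningBriefing = {
  schedule: '0 7 * * 1-5',             // 07:00, Monday through Friday only
  skill: 'daily_briefing',
  condition: (ctx) => !ctx.onVacation, // skip while away
};

const cpuAlert = {
  schedule: '*/5 * * * *',             // every 5 minutes
  skill: 'server_alert',
  condition: (ctx) => ctx.cpuPercent > 80, // only alert past the threshold
};

cpuAlert.condition({ cpuPercent: 91 }); // alert fires
cpuAlert.condition({ cpuPercent: 35 }); // run suppressed
```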
Browser Automation
Browser automation gives your agent the ability to interact with any website as a user would. Built on Puppeteer, the browser skill can navigate pages, fill forms, click buttons, extract data, and take screenshots.
Common use cases include:
- Filling out repetitive forms (government portals, registrations)
- Extracting data from websites that lack APIs
- Monitoring price changes on e-commerce sites
- Taking screenshots for visual verification or reporting
- Automating multi-step web workflows (booking, ordering, submissions)
The agent can combine browser automation with its language understanding -- you describe what you need in natural language, and it figures out which buttons to click, which forms to fill, and how to navigate complex multi-page workflows.
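The extraction half of a price-monitoring workflow often reduces to pulling structured values out of fetched markup. Below is a dependency-free sketch of that step; a real deployment would instead use Puppeteer's `page.$eval` with CSS selectors against the live DOM, and the `class="price"` markup is a hypothetical example.

```javascript
// Pull a numeric price out of an HTML fragment. Brittle by design (regex on
// markup) -- shown only to illustrate the extraction step of the workflow.
function extractPrice(html) {
  // Matches e.g. <span class="price">$1,299.99</span>
  const match = html.match(/class="price"[^>]*>\s*\$?([\d,]+\.?\d*)/);
  return match ? Number(match[1].replace(/,/g, '')) : null;
}

extractPrice('<span class="price">$1,299.99</span>'); // 1299.99
extractPrice('<div>no price here</div>');             // null
```

A cron job can then compare the extracted value against the last stored price and message you only when it changes.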
Canvas UI
The Canvas UI is an optional web interface that provides rich, visual interactions beyond what messaging apps can display. It connects to the same gateway and shares context with your messaging channels, so you can start a conversation on WhatsApp and continue exploring results on the canvas.
Key capabilities include:
- Interactive data tables and filterable lists for structured information
- Chart rendering for time-series data, health metrics, or financial tracking
- Form-based workflows for multi-step data entry that is awkward via chat
- Code editor panels with syntax highlighting for reviewing and editing files
- Screenshot previews and visual diff views from browser automation results
The canvas is served by the gateway on a configurable port and is protected by the same authentication layer. It is fully optional -- your agent works perfectly fine through messaging channels alone.
Hooks System
OpenClaw's hooks system provides event-driven extensibility. Hooks let you run custom logic before or after key events in the agent lifecycle, without modifying the core codebase. They are defined in the node configuration and can be written in JavaScript or as shell scripts.
Available hook events include:
- onMessageReceived -- Triggered before the agent processes a message. Use it for logging, content filtering, or routing overrides
- onResponseReady -- Triggered after the agent generates a response, before it is sent. Use it for output sanitization, translation, or analytics
- onSkillExecuted -- Triggered after a skill completes. Use it for audit trails, cost tracking, or chaining follow-up actions
- onNodeConnect / onNodeDisconnect -- Triggered when a node joins or leaves the gateway. Use it for alerting, failover logic, or resource rebalancing
- onCronTick -- Triggered on each cron evaluation cycle. Use it for dynamic scheduling or conditional job suppression
Hooks receive a context object with the event payload and can modify it in place (for pre-hooks) or trigger side effects (for post-hooks). They run synchronously in the order they are declared.
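An `onResponseReady` hook for output sanitization might look like the sketch below. The context shape (`ctx.response`, `ctx.metrics`) is an assumption, and the key pattern is illustrative.

```javascript
// Hypothetical pre-send hook: scrub anything resembling an API key from the
// outgoing response, then record a metric as a side effect.
function onResponseReady(ctx) {
  ctx.response = ctx.response.replace(/sk-[A-Za-z0-9-]{8,}/g, '[redacted]');
  ctx.metrics = { responseChars: ctx.response.length };
  return ctx;
}

const ctx = onResponseReady({ response: 'Your key is sk-ant-abc12345xyz' });
// ctx.response -> 'Your key is [redacted]'
```

Because pre-hooks mutate the context in place, the sanitized text is what the channel adapter ultimately sends.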
Environment Variables & .env
All infrastructure configuration in OpenClaw is managed through environment variables, typically loaded from a .env file at the project root. This keeps secrets out of version control and makes deployments portable across machines.
Key environment variables include:
ANTHROPIC_API_KEY=sk-ant-... # Claude API key
GATEWAY_PORT=3100 # Gateway HTTP/WS port
NODE_ID=msi-laptop-c381e7e7 # Unique node identifier
# Channels
WHATSAPP_TOKEN=... # WhatsApp web bridge token
TELEGRAM_BOT_TOKEN=... # Telegram bot token from @BotFather
# Voice
WHISPER_SERVER_URL=http://localhost:9000 # Local Whisper endpoint
TTS_PROVIDER=qwen3-tts # TTS provider (qwen3-tts, luxtts, openai, google)
TTS_SERVER_URL=http://localhost:8100 # Local TTS endpoint
# Browser
BROWSER_EXEC_PATH=/usr/bin/chromium # Headless Chrome path
# Integrations
ATLASSIAN_TOKEN=... # Jira/Confluence API token
SQLITE_MEMORY_PATH=./memory.db # SQLite memory database path
The .env.example file in the repository documents every variable with defaults and descriptions. Variables can also be set directly in the shell environment or through your deployment platform's secrets manager.
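For intuition, the `KEY=value` format above is simple enough to parse by hand. The sketch below illustrates it (real deployments typically load `.env` with the `dotenv` package, which also handles quoting and multiline values that this toy parser does not).

```javascript
// Minimal .env parsing sketch: skips blanks and comment lines, splits on the
// first '=', and strips trailing inline comments. Values containing '#'
// would break this simplification -- it is for illustration only.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // blank or comment
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    const value = trimmed.slice(eq + 1).split('#')[0].trim();
    vars[trimmed.slice(0, eq).trim()] = value;
  }
  return vars;
}

const env = parseEnv(
  'GATEWAY_PORT=3100 # Gateway HTTP/WS port\n# Channels\nTELEGRAM_BOT_TOKEN=abc'
);
// env.GATEWAY_PORT -> '3100'
```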
Node Pairing & Multi-Device
OpenClaw supports running multiple nodes connected to a single gateway. Each node is a separate machine (or process) that registers with the gateway using a pairing token. This lets you distribute workloads and capabilities across devices.
To pair a new node:
# On the gateway machine, generate a pairing token
openclaw gateway pair --generate
# Output: TOKEN=abc123...
# On the new node, register with the gateway
openclaw node pair --token abc123... --gateway wss://your-gateway:3100
# The node is now connected and ready to receive tasks
Each node declares its capabilities (e.g., browser automation, file system, audio processing) during registration. The gateway uses this information to route tasks intelligently -- a browser task goes to the node with Chromium installed, a file operation goes to the node with filesystem access.
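That routing decision can be modeled as a simple capability lookup. The field names below (`connected`, `capabilities`, `requires`) are illustrative; the real gateway presumably also weighs load and affinity.

```javascript
// Capability-based routing sketch: pick the first connected node that
// declares the capability a task needs, or null if none qualifies.
function routeTask(task, nodes) {
  return (
    nodes.find((n) => n.connected && n.capabilities.includes(task.requires)) ||
    null
  );
}

const nodes = [
  { id: 'laptop', connected: true, capabilities: ['filesystem', 'audio'] },
  { id: 'vps', connected: true, capabilities: ['browser'] },
];

routeTask({ requires: 'browser' }, nodes);    // routed to the vps node
routeTask({ requires: 'filesystem' }, nodes); // routed to the laptop node
```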
Multi-device setups are useful for separating concerns: run your personal node on your laptop for local file access, and a cloud node for always-on availability and heavy compute tasks. Both nodes share the same agent identity and memory.
Latest Updates (2026)
Security Hardening
Bootstrap tokens for secure node registration, timing-safe HMAC verification to prevent side-channel attacks, and disabled implicit plugin auto-load to eliminate untrusted code execution risks.
Docker Improvements
OPENCLAW_TZ environment variable for timezone support in containers, improved Windows compatibility for Docker Desktop, and compiled plugin support for faster cold starts.
Chrome DevTools MCP
Official attach mode for live Chrome browser sessions via the Model Context Protocol. Inspect, debug, and automate running browser tabs directly from your AI agent.
ContextEngine Plugins
Pluggable context management with lifecycle hooks. Custom ContextEngine plugins can intercept, transform, and enrich conversation context at every stage of the agent pipeline.
New AI Providers
Vertex AI, Mistral (embeddings + voice), MiniMax, and Tavily now bundled as first-class providers. Switch models and capabilities without changing your agent configuration.
Mobile UI Redesign
Android grouped settings for streamlined configuration, and a new iOS welcome pager for guided onboarding. Both platforms feature improved navigation and accessibility.
PDF & Attachments
First-class PDF support with parsing, summarization, and inline rendering. Inline attachments in messages, and a broader SecretRef API for secure credential injection into skills.
Performance
Model catalog caching for instant provider switching, lazy-loading of skills and plugins on first invocation, and dashboard optimizations reducing initial load time by up to 60%.
GPT-5.4 & Fast Mode
Full OpenAI GPT-5.4 support with a 1.05M-token context window and a 75% score on the OSWorld benchmark. Session fast-mode for GPT-5.4 and Claude reduces latency on time-sensitive operations.
Dashboard v2 Enhancements
A modular control UI with overview, chat, config, agent, and session views. Adds a command palette, search, export, and pinned messages, and cuts setup time by 40%.
Telegram Topic-Level Routing
Topic-level agent isolation in Telegram groups lets you run a different agent per topic. Persistent bindings auto-restore on Discord and Telegram after restarts.
Extended AI Provider Stack
xAI Grok integration, plus Ollama, vLLM, and SGLang on the plugin architecture. Provider-owned discovery enables seamless ecosystem expansion.
Dreaming: AI Memory via REM Backfilling
New "Dreaming" feature allows AI agents to process historical data through REM backfilling, replaying user notes to form persistent memories. Includes journal timeline UI, enhanced SSRF and node execution injection protections, role atmosphere QA evaluation, and optimized Android pairing.
Inference, Media, and TaskFlows
New inference capabilities, music and video editing tools, conversation branching and restoration, Webhook-based TaskFlows, bundled Codex support, optional Active Memory plugin, local MLX speech for Talk Mode, and richer Teams/WhatsApp handling.
Claude Opus 4.7, Gemini TTS & Active Memory
Claude Opus 4.7 set as default model with bundled image understanding. Gemini text-to-speech in the bundled Google plugin with voice selection and WAV/PCM output. Active Memory sub-agent for pre-reply context enrichment with /verbose inspection. GPT-5.4-pro forward compatibility. Model Auth status card showing OAuth health and rate-limit pressure. 346K+ GitHub stars.
Ecosystem & Agent Framework Integration
OpenClaw's skill ecosystem now counts 44K+ community skills on ClawHub, with 65%+ wrapping MCP servers. Standardized channel setup/secret contract for all bundled channels. For multi-agent orchestration beyond OpenClaw, see our guides on A2A Protocol, Google ADK, OpenAI Agents SDK, LangGraph, and n8n.
Supported Channels (20+)
OpenClaw now supports 20+ messaging channels, making it one of the most versatile AI agent frameworks for multi-platform deployment. Beyond the original WhatsApp and Telegram, the channel adapter system now includes: Slack, Discord, Google Chat, Signal, iMessage, BlueBubbles, IRC, Microsoft Teams, Matrix, LINE, Mattermost, Nextcloud Talk, Nostr, Twitch, and WebChat. Each channel adapter normalizes platform-specific features (reactions, threads, file types, rich messages) into OpenClaw's unified message format.
Docker Deployment
OpenClaw supports multi-stage Docker builds with minimal runtime images. The build stage compiles TypeScript and installs dependencies, while the runtime stage uses a slim Node.js base image with only production dependencies. This reduces the final image size significantly, speeds up container startup, and minimizes the attack surface for production deployments. Docker Compose configurations are provided for running the gateway, node agent, and voice services as a coordinated stack.
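A multi-stage Dockerfile in the shape described above might look like the following. This is an illustrative sketch, not OpenClaw's actual Dockerfile, and assumes a `build` script that compiles TypeScript to `dist/`.

```dockerfile
# Build stage: full toolchain, compiles TypeScript
FROM node:24 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build            # emit compiled output to dist/

# Runtime stage: slim base, production dependencies only
FROM node:24-slim AS runtime
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
```

The runtime image never contains the TypeScript compiler or dev dependencies, which is what shrinks the image and the attack surface.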
Multi-Agent Broadcast
OpenClaw supports configurable group broadcast dispatch with observer-session isolation and cross-account deduplication. When an agent is added to a group chat (Telegram, Discord, Slack), the broadcast system ensures that each message is processed exactly once, even when multiple agent instances observe the same group. Observer sessions are isolated -- each agent maintains its own context and memory for the group conversation without interfering with other agents.
Cross-account deduplication prevents duplicate responses when the same user interacts with the agent from different accounts or platforms. The broadcast dispatcher routes messages based on configurable rules: round-robin, capability-based, or affinity-based routing. This enables multi-agent setups where different agents specialize in different topics within the same group chat.
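Exactly-once processing across observers can be sketched with a shared claim set. The `(channel, messageId)` key format is an assumption; a production dispatcher would also need persistence and expiry.

```javascript
// Deduplication sketch: the first observer to claim a (channel, messageId)
// pair processes it; every later claim for the same pair is suppressed.
class Deduplicator {
  constructor() {
    this.seen = new Set();
  }

  claim(channel, messageId) {
    const key = `${channel}:${messageId}`;
    if (this.seen.has(key)) return false; // duplicate: suppress
    this.seen.add(key);
    return true; // first observer: process the message
  }
}

const dedup = new Deduplicator();
dedup.claim('telegram', 'msg-1'); // first observer processes it
dedup.claim('telegram', 'msg-1'); // duplicate suppressed
```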
Real Example: Jose's Setup
Jose runs OpenClaw 2026.4.14 on an MSI Raider 18 HX as a production AI assistant accessible via WhatsApp and Telegram. This is not a demo -- it is a fully autonomous system with 8 always-on services, 4 automated daemons, and 6 local AI models running on an RTX 5090. Here are the exact details:
MSI Raider 18 HX ($6,458)
RTX 5090 (24 GB GDDR7, 1824 AI TOPS), Intel Core Ultra 9 285HX, 64 GB DDR5, 6 TB NVMe. This machine runs 8 always-on services simultaneously.
8 systemd Services
Key services: openclaw-containers.service (Docker gateway on port 18789), openclaw-node.service (node agent "MSI-Laptop"), openclaw-fix-permissions.service (permission watcher), whisper-server.service (FastAPI Whisper on port 8787), voicebox.service (Qwen3-TTS on port 8100), and luxtts.service (LuxTTS on port 8101). Auto-update runs daily at 10 AM COT.
AI Models & Config
Primary: anthropic/claude-opus-4-6, fallback: openai/gpt-5.4. Channels: WhatsApp + Telegram. Memory: SQLite-backed. Browser: Chrome DevTools Protocol with CDP profiles. Agent timeout: 1800s (30 min). Sandbox: off (full system access).
Local Voice Pipeline (RTX 5090)
All voice processing runs locally on the RTX 5090 -- zero cloud API calls. STT: Whisper large-v3 via faster-whisper (CUDA float16). TTS: Qwen3-TTS 1.7B with Jose's voice profile. Voice cloning: LuxTTS at 150x realtime, 48 kHz output. Custom skills: voicebox-tts and luxtts-voice-clone.
6 Local Models on RTX 5090
Whisper large-v3 (audio transcription), Qwen3-TTS 1.7B (text-to-speech), LuxTTS (voice cloning, 150x realtime), Qwen2.5-VL-7B-AWQ (vision-language for hCaptcha solving), Wan 2.1 I2V 14B (video generation), and Nemotron-nano-8b (LoRA fine-tuning with Unsloth). All running via CUDA on 24 GB GDDR7.
4 Automated Daemons
Xiaomi scale tracker (354 measurements over 13 months of automated collection), Davivienda banking daemon (hCaptcha solver using local Qwen2.5-VL-7B via vLLM), currency exchange rate updater, and auto-deploy pipeline. All run autonomously on cron schedules via OpenClaw.
165-File Agent Workspace
The agent workspace contains 165 files including custom AGENTS.md, SOUL.md, IDENTITY.md, USER.md, TOOLS.md, MEMORY.md, and HEARTBEAT.md. The SOUL.md defines a bilingual (Spanish/English) assistant with Argentine communication style using vos conjugation and deep contextual awareness of Jose's projects.
Overnight 8-Agent Run
8 concurrent Claude Code instances ran overnight while OpenClaw monitored progress via cron every 15 min. The task: tax-exempt billing verification across Peru, Chile, and Colombia. Result: 3 bugs fixed, 189 test suites (1,740 tests), 11 commits across 7 microservice repos.
Monthly Investment: ~$526
AI subscriptions: ~$257/month (Claude Max $200, ChatGPT Pro $20, Gemini Pro $20, Perplexity Pro $204/year, about $17/month). Hardware amortized: ~$269/month ($6,458 over a 24-month GPU upgrade cycle). ROI: 8 concurrent agents working overnight is roughly equivalent to 8 junior devs for a night.
This production setup runs ~$526/month total (AI subscriptions + hardware amortization) with all voice and vision processing handled locally on the RTX 5090 -- zero cloud TTS/STT fees. The result is a fully autonomous AI assistant that runs 24/7, orchestrates overnight multi-agent coding sessions, and automates banking, health tracking, and deployments.