Tagged: AI OpenClaw Vonage Voice SMS Conversational-AI JavaScript
What if your OpenClaw bot was as reachable as a friend? No app, no browser, just pick up the phone and call, or send a quick text and get a reply.
With the Vonage Unofficial skill for OpenClaw, any phone becomes an interface to your bot. Call a number and have a spoken conversation, or send an SMS and get a thoughtful reply within seconds. It works on every phone, everywhere, and it’s all handled by a single server.
In a world of apps and chat platforms, phone calls and texts might seem old-school. But they have unique advantages.
Voice is great for longer interactions, dictating notes, or when you simply can’t type. SMS is perfect for quick questions, reminders, or getting a brief answer when you don’t want to open a screen. Together, they cover almost every situation where a traditional app falls short.
A single Node.js webhook server handles both voice and SMS. Vonage handles the telephony and messaging infrastructure, the server sits in the middle, and OpenClaw provides the AI.
Voice flow:
SMS flow:
Response time for both is typically 3–5 seconds, fast enough that it feels natural.
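To make the voice flow a little more concrete, here is a rough sketch of what an answer webhook could return: an NCCO (Vonage's call-control JSON) that greets the caller and then listens for speech. The helper name `buildAnswerNcco`, the greeting text, and the `/webhooks/speech` callback path are assumptions for illustration, and the skill's generated server.js may be organized differently. The speech defaults mirror the ones this post lists for server.js further down.

```javascript
// Sketch only: builds the NCCO an answer webhook could return.
// `buildAnswerNcco` and the `/webhooks/speech` path are hypothetical names.
function buildAnswerNcco(baseUrl, speech = {}) {
  const speechOptions = {
    endOnSilence: 2,   // seconds of silence that end the utterance
    startTimeout: 20,  // seconds to wait for the caller to start speaking
    maxDuration: 60,   // maximum utterance length in seconds
    language: 'en-GB', // speech recognition locale
    ...speech,         // caller-supplied overrides win
  };
  return [
    { action: 'talk', text: 'Hi, what can I do for you?' },
    {
      action: 'input',
      type: ['speech'],
      speech: speechOptions,
      eventUrl: [`${baseUrl}/webhooks/speech`],
    },
  ];
}

// Example: same defaults, but with US English recognition.
const ncco = buildAnswerNcco('http://203.0.113.10:62529', { language: 'en-US' });
console.log(JSON.stringify(ncco, null, 2));
```

Vonage would then post the recognized speech to the `eventUrl`, where the server can hand the transcript to OpenClaw and answer with another `talk` action.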
Before you can use the Vonage skill, you need OpenClaw running on a server with a public IP that Vonage can reach. A VPS (Virtual Private Server) is the most common choice; providers like Hetzner, DigitalOcean, or Oracle Cloud all work well. Oracle Cloud’s Always Free tier is a solid option if you want to experiment at zero cost.
SSH into your server and run:
curl -fsSL https://openclaw.ai/install.sh | bash
Then run the onboarding wizard:
openclaw onboard --install-daemon
This walks you through configuring your AI provider (e.g. Anthropic, OpenAI), authentication, and gateway settings. Once complete, verify the gateway is running:
openclaw gateway status
The skill communicates with OpenClaw through its chat completions HTTP endpoint. Enable it:
openclaw config set gateway.http.endpoints.chatCompletions.enabled true
Note the gateway URL (http://127.0.0.1:18789 by default) and your gateway token; you’ll need both when configuring the webhook server.
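If you want to check the endpoint from Node.js before wiring anything else up, a request might look roughly like the sketch below. It assumes the gateway follows the usual OpenAI-compatible conventions: a `/v1/chat/completions` path, a Bearer token, and a `choices[0].message.content` reply. The model name `openclaw` and the function names are placeholders too, so treat all of these as assumptions and check your gateway's documentation.

```javascript
// Sketch only: the path, payload shape, and model name are assumptions
// based on an OpenAI-compatible endpoint, not confirmed by the skill.
function buildChatRequest(gatewayUrl, token, messages) {
  return {
    // Trim trailing slashes so the path is joined cleanly.
    url: `${gatewayUrl.replace(/\/+$/, '')}/v1/chat/completions`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ model: 'openclaw', messages }),
    },
  };
}

// Sends the request and returns the assistant's reply text.
// Requires Node.js 18+ for the built-in fetch.
async function askOpenClaw(gatewayUrl, token, messages) {
  const { url, options } = buildChatRequest(gatewayUrl, token, messages);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Calling `askOpenClaw('http://127.0.0.1:18789', token, [{ role: 'user', content: 'Hello' }])` should then resolve to the agent's reply if the endpoint is enabled and the token is valid.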
Open the ports you’ll need. Port 62529 is for the webhook server (if you’re wondering where that number came from, it’s “oclaw” on a phone keypad):
sudo ufw allow OpenSSH
sudo ufw allow 62529/tcp
sudo ufw enable
Your OpenClaw instance is now ready to serve as the backend for the skill.
Head to the Vonage Dashboard:
http://<your-server-ip>:62529/webhooks/answer (POST)
http://<your-server-ip>:62529/webhooks/event (POST)
http://<your-server-ip>:62529/webhooks/inbound
http://<your-server-ip>:62529/webhooks/status
That last step trips people up; if you skip it, inbound SMS messages won’t reach your webhook.
Save the Application ID and the private key, and note your phone number; you’ll need all three during setup.
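The four webhook URLs above correspond to four routes on the webhook server. Here is a minimal sketch of that routing using Node's built-in http module. The handler bodies are placeholders, and treating every route as POST-only is an assumption (the list above only marks the two voice webhooks as POST); the server generated by the skill will do considerably more.

```javascript
const http = require('node:http');

// Placeholder handlers; a real server would parse the request body and act on it.
const routes = new Map([
  ['/webhooks/answer', () => ({ status: 200, body: [] })],  // voice: return an NCCO
  ['/webhooks/event', () => ({ status: 200, body: {} })],   // voice: call events
  ['/webhooks/inbound', () => ({ status: 200, body: {} })], // inbound SMS
  ['/webhooks/status', () => ({ status: 200, body: {} })],  // SMS delivery status
]);

// Resolves a request to a response object; unknown paths or methods get a 404.
function dispatch(method, path) {
  const handler = routes.get(path);
  if (method !== 'POST' || !handler) {
    return { status: 404, body: { error: 'not found' } };
  }
  return handler();
}

// Wraps dispatch in an HTTP server that always answers with JSON.
function createServer() {
  return http.createServer((req, res) => {
    const { pathname } = new URL(req.url, 'http://localhost');
    const { status, body } = dispatch(req.method, pathname);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}

// To run it: createServer().listen(62529);
```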
Clone the skill into your OpenClaw skills directory:
git clone https://github.com/pardel/vonage-unofficial-skill ./skills/vonage-unofficial
Then restart or reload the OpenClaw runtime so it picks up the new skill.
The skill includes a setup script that generates a complete Node.js project. It will prompt you for your Vonage credentials and private key. The OpenClaw gateway URL and token are detected automatically, and your server’s public IP is picked up too, so most prompts will already have the right defaults:
./skills/vonage-unofficial/scripts/setup.sh ~/code/vonage
cd ~/code/vonage && node server.js
Call your number. Text your number. Both should work.
The first time you call and hear your agent respond to your voice, it clicks: this is how AI should work sometimes. Not everything needs a screen.
Some things that work well over voice:
The server has a few knobs you can adjust in server.js:
endOnSilence (default: 2s): How long to wait after you stop speaking. Lower means faster responses but might cut you off mid-pause.
startTimeout (default: 20s): How long to wait for you to start speaking before giving up.
maxDuration (default: 60s): Maximum length of a single utterance.
language (default: en-GB): Change to match your accent for better recognition.
The server maintains conversation history per phone number. When you text back and forth, the agent remembers the context, just like a real text conversation.
History is kept until a conversation has been inactive for 2 hours, then cleared. This keeps things lightweight while allowing natural multi-turn conversations.
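One way to get this behavior is an in-memory map keyed by phone number, where each entry remembers when it was last touched. The sketch below is an illustration under that assumption, not the skill's actual code; `createHistoryStore` and its injectable clock are hypothetical names chosen to keep the example testable.

```javascript
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// Sketch only: an in-memory store keyed by phone number.
// `createHistoryStore` is a hypothetical name, not taken from the skill.
function createHistoryStore(ttlMs = TWO_HOURS_MS, now = () => Date.now()) {
  const conversations = new Map(); // phone number -> { messages, lastActive }

  // Returns the messages for a number, discarding them if the entry has expired.
  function get(number) {
    const entry = conversations.get(number);
    if (!entry) return [];
    if (now() - entry.lastActive >= ttlMs) {
      conversations.delete(number);
      return [];
    }
    return entry.messages;
  }

  // Appends a message and refreshes the inactivity timer.
  function append(number, role, content) {
    const messages = [...get(number), { role, content }];
    conversations.set(number, { messages, lastActive: now() });
    return messages;
  }

  return { get, append };
}

const store = createHistoryStore();
store.append('447700900001', 'user', 'Are you getting my texts?');
store.append('447700900001', 'assistant', 'Yep, loud and clear');
console.log(store.get('447700900001').length);
```

Expiry here is lazy: an entry is only dropped the next time its number is looked up, so a long-running server would probably also want a periodic sweep.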
The system prompt instructs the agent to keep replies concise (aiming for ~160 characters per segment), because nobody wants to read an essay over SMS.
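For a sense of why 160 characters matters: a plain-text SMS fits 160 characters in a single segment, and longer messages are split into parts of 153 characters each. The helper below is a rough estimate under the assumption that the text uses only basic GSM-7 characters; emoji and many non-Latin characters switch the message to a different encoding with much smaller segments, which this sketch ignores. The function name is made up for illustration.

```javascript
// Sketch only: estimates SMS segments for plain GSM-7 text.
// Does not account for extended or non-GSM characters.
function estimateSmsSegments(text) {
  if (text.length === 0) return 0;
  // One segment holds 160 characters; concatenated parts hold 153 each.
  return text.length <= 160 ? 1 : Math.ceil(text.length / 153);
}

console.log(estimateSmsSegments('Hey! Just a reminder about your meeting at 3pm.'));
```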
The server isn’t just reactive; it can send messages too. There’s a built-in /send endpoint:
curl -X POST http://localhost:62529/send \
-H 'Content-Type: application/json' \
-d '{"to": "447700900001", "text": "Hey! Just a reminder about your meeting at 3pm."}'
This is useful for building reminder systems, alerts, or having your agent reach out when something important happens.
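If the thing triggering the message is itself a Node.js script, the same call can be made with fetch. This is a small sketch that mirrors the curl request above; the helper names are hypothetical, and since the response format of /send isn't described here, the function only checks the HTTP status.

```javascript
// Sketch only: builds the same request as the curl example.
function buildSendRequest(to, text, baseUrl = 'http://localhost:62529') {
  return {
    url: `${baseUrl}/send`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ to, text }),
    },
  };
}

// Posts the message and throws if the server reports a failure.
// Requires Node.js 18+ for the built-in fetch.
async function sendSms(to, text, baseUrl) {
  const { url, options } = buildSendRequest(to, text, baseUrl);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`/send returned HTTP ${res.status}`);
  return res.status;
}
```

A cron job or a monitoring script could then call `sendSms('447700900001', 'Backup finished.')` whenever it has something worth a text.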
The server logs everything to stdout and vonage.log with clear tags, making debugging straightforward:
Voice:
[2026-02-10T11:25:02Z] [ANSWER] from=447700900000 to=447700900001 conv=CON-abc123
[2026-02-10T11:25:08Z] [TRANSCRIPT] conv=CON-abc123 "What's the weather like today?"
[2026-02-10T11:25:11Z] [CLAW-REPLY] conv=CON-abc123 elapsed=2834ms reply="It's about 8 degrees and cloudy..."
SMS:
[2026-02-10T13:55:51Z] [INBOUND] from=447700900001 text="Are you getting my texts?"
[2026-02-10T13:55:54Z] [CLAW-REPLY] from=447700900001 elapsed=2875ms reply="Yep, loud and clear"
[2026-02-10T13:55:55Z] [SMS-OK] to=447700900001 messageId=d03b41e0-...
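Lines in this shape are easy to produce with a small formatter. The sketch below approximates the format shown above (timestamp, tag, then key=value pairs) and is not the server's actual logging code. `formatLogLine` is a hypothetical name, and quoting any value that contains whitespace is an assumption based on the samples.

```javascript
// Sketch only: approximates the log format shown above.
function formatLogLine(date, tag, fields) {
  // Drop milliseconds so the timestamp looks like 2026-02-10T11:25:02Z.
  const timestamp = date.toISOString().replace(/\.\d{3}Z$/, 'Z');
  const details = Object.entries(fields)
    .map(([key, value]) => {
      // Quote free-text values; leave identifiers and numbers bare.
      const text = String(value);
      return /\s/.test(text) ? `${key}=${JSON.stringify(text)}` : `${key}=${text}`;
    })
    .join(' ');
  return `[${timestamp}] [${tag}] ${details}`;
}

console.log(
  formatLogLine(new Date('2026-02-10T13:55:51Z'), 'INBOUND', {
    from: '447700900001',
    text: 'Are you getting my texts?',
  })
);
```

Keeping one tag per event type means you can follow a single call or conversation with a plain grep on the tag or the conversation ID.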
Vonage pricing varies by country, but it’s affordable for personal use. In the UK, for example:
For typical usage, a handful of calls and texts per day, you’re looking at well under €5/month total.
The phone is the most universal computing device on the planet. Connecting your AI agent to it, by voice and text, opens up possibilities that app-based interfaces simply can’t match.
The vonage-unofficial skill is available on GitHub. Clone it into your OpenClaw skills directory to get started.