OpenClaw (MoltBot) Setup Guide: Ubuntu + NVIDIA + Ollama + GLM-4.7 + Telegram
This guide explains how to deploy OpenClaw (formerly MoltBot) on bare-metal Ubuntu with an NVIDIA GPU, using Ollama as the local LLM server and GLM-4.7 as the language model, connected to Telegram for chat access. The result is a fully private, zero-cost AI assistant that runs on your own hardware.
Key Takeaways
- OpenClaw was renamed from MoltBot in January 2026. The CLI command is openclaw; the moltbot alias still works.
- The full setup requires Ubuntu 22.04/24.04, an NVIDIA GPU (8 GB+ VRAM), Node.js 20+, and a Telegram account.
- Ollama manages model downloads, serving, and the OpenAI-compatible REST API that OpenClaw uses.
- GLM-4.7-Flash (quantised) fits on 8–12 GB VRAM and generates 120–220 tokens/second on an RTX 4090.
- A 64k context window is recommended for OpenClaw’s agent layer — configure this via a custom Ollama model.
- The openclaw onboard wizard handles all configuration: LLM provider, Telegram bot token, and gateway startup.
- Once running, all inference is local — no data leaves your machine, no API billing applies.
What Is OpenClaw?
OpenClaw is an open-source personal AI gateway that connects large language models to messaging platforms — Telegram, WhatsApp, Slack, Discord, Signal, iMessage, and others. It manages the gateway process, routes messages through a configurable agent layer, maintains conversation context, and handles channel authentication.
OpenClaw differs from a standard chatbot in three ways. First, it is a persistent gateway — it runs as a background service, not a one-off script. Second, it is channel-agnostic — the same local LLM backend serves multiple messaging platforms simultaneously. Third, it is tool-aware — the agent layer can call external tools and APIs on behalf of the user.
OpenClaw was previously known as MoltBot (and before that, ClawdBot). The January 2026 rename reflects the project’s maturity and expanded feature set. The software is identical; only the name and primary CLI command changed.
Full blog post: https://musketeerstech.com/blogs/openclaw-setup-ubuntu-nvidia-ollama-telegram/
System Requirements
Hardware minimums:
- NVIDIA GPU with 8 GB VRAM minimum (RTX 3060 or better). 16–24 GB VRAM for the full GLM-4.7 model.
- 16 GB system RAM (32 GB recommended).
- 50 GB free disk space for OS, Ollama installation, and model weights.
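A quick way to confirm the disk requirement is met before starting (a minimal sketch using GNU coreutils; the 50 GB figure is the guide's combined estimate for OS, Ollama, and weights):

```shell
# Report available space on the root filesystem; the stack needs
# roughly 50 GB free in total for Ollama and model weights.
avail=$(df -h --output=avail / | tail -1 | tr -d ' ')
echo "available on /: ${avail}"
```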
Software requirements:
- Ubuntu 22.04 LTS or 24.04 LTS.
- Node.js 20 or later (required by the OpenClaw CLI).
- NVIDIA drivers (version 550 or later recommended).
- CUDA Toolkit (optional — Ollama ships its own CUDA libraries).
Accounts required:
- Telegram account (for bot creation via BotFather).
GLM-4.7 vs GLM-4.7-Flash
GLM-4.7 is the full reasoning model. It requires 16–24 GB VRAM and produces the highest benchmark scores for coding and reasoning tasks. GLM-4.7-Flash is a quantised variant that fits on 8–12 GB VRAM with 4-bit quantisation. On an RTX 4090, Flash generates 120–220 tokens per second after warmup, with first-token latency of 250–400 ms.
For Telegram assistant workloads — short prompts, conversational replies, code snippets — GLM-4.7-Flash is the appropriate choice. The full GLM-4.7 model is better for long documents, complex reasoning chains, and batch tasks where quality matters more than speed.
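The VRAM figures above follow from simple arithmetic: quantised weight size is roughly parameters × bits ÷ 8. A back-of-envelope sketch (the parameter count below is a placeholder for illustration, not the real GLM-4.7-Flash size; KV cache and runtime overhead add several GB on top of the weights):

```shell
# Rough weight-size estimate for a 4-bit quantised model.
params_b=9   # billions of parameters (assumed, for illustration only)
bits=4       # 4-bit quantisation
weights_gb=$(awk -v p="$params_b" -v b="$bits" 'BEGIN { printf "%.1f", p * b / 8 }')
echo "approx weight size: ${weights_gb} GB"
```

This is why a 4-bit quantisation of a model in this class lands in the 8–12 GB VRAM range once cache and overhead are included.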
Step 1: Install Ubuntu and NVIDIA Drivers
Install Ubuntu 22.04 LTS or 24.04 LTS on the target machine. After installation:
sudo apt update && sudo apt upgrade -y
sudo reboot
sudo ubuntu-drivers autoinstall
sudo reboot
nvidia-smi
The nvidia-smi command confirms the GPU is recognised. Output includes GPU model, driver version, and current VRAM usage. If nvidia-smi fails, run sudo ubuntu-drivers devices to list available drivers and install the recommended version manually (e.g., sudo apt install nvidia-driver-550).
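For scripts that depend on the GPU, the same check can be wrapped in a guard so a missing driver fails loudly rather than mid-run (a sketch):

```shell
# Guard against a missing driver install before relying on nvidia-smi.
if command -v nvidia-smi >/dev/null 2>&1; then
  status="driver tools present"
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  status="nvidia-smi not found; install the driver first"
fi
echo "$status"
```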
CUDA Toolkit installation is optional for the OpenClaw + Ollama stack. Ollama ships its own CUDA libraries. Install the Toolkit only if you plan to run additional CUDA workloads alongside Ollama.
Step 2: Install Ollama and Pull GLM-4.7
Ollama is a local model server that handles model downloads, quantisation management, context window configuration, and an OpenAI-compatible REST API.
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
systemctl status ollama
Pull the GLM-4.7-Flash model:
ollama pull glm-4.7-flash
This downloads approximately 5–6 GB of model weights. Confirm the model is working:
ollama run glm-4.7-flash "Say hello in one sentence."
Exit with /bye. If the model responds correctly, Ollama is functioning. Verify GPU acceleration by opening a second terminal and checking nvidia-smi while the model is running — VRAM usage should increase from near-zero to several gigabytes.
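Ollama also exposes an OpenAI-compatible endpoint at /v1/chat/completions, which is the API gateway tools like OpenClaw talk to. A sketch of exercising it from the shell (the request body is validated locally first; the commented curl line assumes the Ollama service above is running):

```shell
# Build a minimal chat-completion request body and validate the JSON locally.
body='{"model": "glm-4.7-flash", "messages": [{"role": "user", "content": "Say hello."}]}'
valid=$(echo "$body" | python3 -m json.tool >/dev/null && echo ok)
echo "request body: ${valid}"
# Send it to the local server (requires Ollama running):
# curl -s http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$body"
```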
Configure a 64k Context Window
OpenClaw’s agent layer benefits from a large context window. Create a custom Ollama model with extended context:
ollama create glm-4.7-flash-ctx -f - <<EOF
FROM glm-4.7-flash
PARAMETER num_ctx 65536
EOF
Use glm-4.7-flash-ctx as the model identifier in the OpenClaw configuration. This allows OpenClaw to maintain longer conversation histories and handle multi-turn tool calls reliably.
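The same Modelfile can be kept on disk instead of piped through stdin, which makes the context setting easy to version-control (the filename here is arbitrary):

```shell
# Persist the Modelfile so the num_ctx override survives shell history.
cat > Modelfile.glm-ctx <<'EOF'
FROM glm-4.7-flash
PARAMETER num_ctx 65536
EOF
grep -c "num_ctx 65536" Modelfile.glm-ctx
# Then create the model from the file:
# ollama create glm-4.7-flash-ctx -f Modelfile.glm-ctx
```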
Step 3: Install OpenClaw
The OpenClaw CLI is distributed as an npm package:
npm install -g openclaw
openclaw --version
The moltbot command alias is preserved for backward compatibility. Both openclaw and moltbot point to the same binary after installation.
If migrating from an existing MoltBot installation, run:
openclaw migrate
This copies the configuration from ~/.moltbot/config.yaml to ~/.openclaw/config.yaml while preserving all settings.
Step 4: Run the Onboarding Wizard
openclaw onboard
The wizard completes five configuration stages:
- Gateway initialisation — creates ~/.openclaw/config.yaml and registers the local device key.
- LLM provider selection — select “Ollama” from the provider list.
- Ollama endpoint — default is http://localhost:11434. Change this if Ollama runs on a remote host.
- Model selection — enter glm-4.7-flash-ctx (the custom context-extended model created in Step 2).
- Channel setup — select Telegram. The wizard prompts for a bot token.
Do not manually edit ~/.openclaw/config.yaml before running openclaw onboard at least once. The wizard registers the device key during initialisation; skipping it leaves the gateway unregistered.
Step 5: Create a Telegram Bot
Telegram bots are managed through the BotFather bot.
- Open Telegram and search for @BotFather.
- Send /newbot.
- Provide a display name (e.g., “My OpenClaw Assistant”) and a username ending in bot (e.g., myopenclaw_bot).
- Copy the API token returned by BotFather (format: 123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh).
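Before pasting the token into the wizard, a quick shape check catches copy-paste truncation (a sketch; the token below is this guide's placeholder, and the exact secret length Telegram issues is not guaranteed here, so the pattern is deliberately loose):

```shell
# Sanity-check the rough BotFather token shape: numeric bot ID, colon, secret.
token="123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
if printf '%s' "$token" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]{30,}$'; then
  result="token format looks valid"
else
  result="token format invalid"
fi
echo "$result"
```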
Return to the openclaw onboard wizard and paste this token when prompted for the Telegram bot token.
Step 6: Connect Telegram and Start the Gateway
The wizard writes the Telegram configuration to ~/.openclaw/config.yaml:
channels:
telegram:
enabled: true
botToken: "123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
dmPolicy: whitelist
whitelist:
- YOUR_TELEGRAM_USER_ID
Find your Telegram user ID by messaging @userinfobot — it returns your numeric ID. Add this ID to the whitelist to restrict the bot to your account only.
Restart the gateway and verify status:
openclaw gateway restart
openclaw status
Successful output shows: gateway: running, provider: ollama, channels.telegram: connected.
Verification
Send a message to your Telegram bot. The first response may take 5–15 seconds (model warmup). Subsequent responses arrive within 1–3 seconds on typical hardware.
Test cases to verify the full pipeline:
- Basic chat: Confirm the model responds coherently and does not reference any cloud API.
- GPU utilisation: Check nvidia-smi during inference — VRAM usage should spike, confirming local GPU execution.
- Context retention: Send follow-up questions referencing earlier messages — the 64k context window should maintain the conversation correctly.
Troubleshooting Reference
Gateway fails to start: Verify Node.js 20+ is installed (node --version), confirm Ollama is running (systemctl status ollama), and check logs (openclaw logs --tail 50).
Telegram bot receives no response: Confirm the bot token in ~/.openclaw/config.yaml is correct. Verify your Telegram user ID is in the whitelist if dmPolicy: whitelist is set. Test Ollama directly: curl http://localhost:11434/api/tags.
Ollama runs on CPU instead of GPU: Run nvidia-smi to confirm the GPU is visible. Check ollama logs for CUDA initialisation errors. Ensure Ollama is not running inside a container without GPU passthrough.
Slow responses: First responses after model load are always slow (warmup). If all responses are slow, check nvidia-smi dmon — if GPU utilisation stays below 30%, the model may be exceeding VRAM and spilling to system RAM. Switch to glm-4.7-flash if using the full model on a card with under 16 GB VRAM.
Cloud AI vs Local OpenClaw: Key Differences
| Dimension | Cloud AI (ChatGPT/Claude) | OpenClaw + Ollama |
|---|---|---|
| Cost | $20–$200+/month at scale | Zero ongoing cost |
| Rate limits | Yes (API quotas) | None (GPU is the ceiling) |
| Data privacy | Prompts sent to vendor | Never leaves your machine |
| Offline capability | No | Yes (after model pull) |
| Model flexibility | Vendor-controlled | Any Ollama-compatible model |
| Setup complexity | Low | Medium (one-time) |
Extending OpenClaw
OpenClaw supports additional channels beyond Telegram. WhatsApp, Slack, Discord, Google Chat, Signal, and iMessage are all configurable through the same ~/.openclaw/config.yaml file or via openclaw onboard. Each channel requires its own credentials.
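Each additional channel block mirrors the Telegram example from Step 6. As an illustration only — the slack field names below are assumptions extrapolated from the Telegram block, not a confirmed OpenClaw schema; check the project's channel documentation for the real keys:

```yaml
channels:
  telegram:
    enabled: true
    botToken: "123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
  slack:                # hypothetical block for illustration
    enabled: true
    botToken: "xoxb-your-slack-bot-token"
```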
OpenClaw integrates with the Model Context Protocol (MCP) for tool connectivity — external APIs, file systems, databases, and web search can all be connected as tools available to the agent layer. See the MCP documentation for configuration details.
Custom skills extend the agent’s capabilities and are distributed as npm packages. Install a skill with openclaw skill install <package-name> and it becomes available as a tool in the conversation.
About Musketeers Tech
Musketeers Tech is a software development company specialising in AI agent systems, generative AI applications, and digital transformation. We build production-grade AI assistants, RAG pipelines, and multi-agent workflows for businesses across the UK, US, and Canada.
Services: https://musketeerstech.com/services/ai-agent-development/
Generative AI: https://musketeerstech.com/services/generative-ai-application-services/
Portfolio: https://musketeerstech.com/portfolio/
Contact: https://musketeerstech.com/contact/