OpenClaw (MoltBot) Setup Guide: Ubuntu + NVIDIA + Ollama + GLM-4.7 + Telegram
This guide explains how to deploy OpenClaw (formerly MoltBot) on bare-metal Ubuntu with an NVIDIA GPU, using Ollama as the local LLM server and GLM-4.7 as the language model, connected to Telegram for chat access. The result is a fully private, zero-cost AI assistant that runs on your own hardware.
Key Takeaways
- OpenClaw was renamed from MoltBot in January 2026. The CLI command is openclaw; the moltbot alias still works.
- The full setup requires Ubuntu 22.04/24.04, an NVIDIA GPU (8 GB+ VRAM), Node.js 20+, and a Telegram account.
- Ollama manages model downloads, serving, and the OpenAI-compatible REST API that OpenClaw uses.
- GLM-4.7-Flash (quantised) fits on 8–12 GB VRAM and generates 120–220 tokens/second on an RTX 4090.
- A 64k context window is recommended for OpenClaw’s agent layer — configure this via a custom Ollama model.
- The openclaw onboard wizard handles all configuration: LLM provider, Telegram bot token, and gateway startup.
- Once running, all inference is local — no data leaves your machine, no API billing applies.
What Is OpenClaw?
OpenClaw is an open-source personal AI gateway that connects large language models to messaging platforms — Telegram, WhatsApp, Slack, Discord, Signal, iMessage, and others. It manages the gateway process, routes messages through a configurable agent layer, maintains conversation context, and handles channel authentication.
OpenClaw differs from a standard chatbot in three ways. First, it is a persistent gateway — it runs as a background service, not a one-off script. Second, it is channel-agnostic — the same local LLM backend serves multiple messaging platforms simultaneously. Third, it is tool-aware — the agent layer can call external tools and APIs on behalf of the user.
OpenClaw was previously known as MoltBot (and before that, ClawdBot). The January 2026 rename reflects the project’s maturity and expanded feature set. The software is identical; only the name and primary CLI command changed.
Full blog post: https://musketeerstech.com/blogs/openclaw-setup-ubuntu-nvidia-ollama-telegram/
System Requirements
Hardware minimums:
- NVIDIA GPU with 8 GB VRAM minimum (RTX 3060 or better). 16–24 GB VRAM for the full GLM-4.7 model.
- 16 GB system RAM (32 GB recommended).
- 50 GB free disk space for OS, Ollama installation, and model weights.
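A quick way to confirm the disk requirement is met before starting (a minimal sketch using GNU coreutils; the 50 GB figure is the guide's combined estimate for OS, Ollama, and weights):

```shell
# Report available space on the root filesystem; the stack needs
# roughly 50 GB free in total for Ollama and model weights.
avail=$(df -h --output=avail / | tail -1 | tr -d ' ')
echo "available on /: ${avail}"
```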
Software requirements:
- Ubuntu 22.04 LTS or 24.04 LTS.
- Node.js 20 or later (required by the OpenClaw CLI).
- NVIDIA drivers (version 550 or later recommended).
- CUDA Toolkit (optional — Ollama ships its own CUDA libraries).
Accounts required:
- Telegram account (for bot creation via BotFather).
GLM-4.7 vs GLM-4.7-Flash
GLM-4.7 is the full reasoning model. It requires 16–24 GB VRAM and produces the highest benchmark scores for coding and reasoning tasks. GLM-4.7-Flash is a quantised variant that fits on 8–12 GB VRAM with 4-bit quantisation. On an RTX 4090, Flash generates 120–220 tokens per second after warmup, with first-token latency of 250–400 ms.
For Telegram assistant workloads — short prompts, conversational replies, code snippets — GLM-4.7-Flash is the appropriate choice. The full GLM-4.7 model is better for long documents, complex reasoning chains, and batch tasks where quality matters more than speed.
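The VRAM figures above follow from simple arithmetic: quantised weight size is roughly parameters × bits ÷ 8. A back-of-envelope sketch (the parameter count below is a placeholder for illustration, not the real GLM-4.7-Flash size; KV cache and runtime overhead add several GB on top of the weights):

```shell
# Rough weight-size estimate for a 4-bit quantised model.
params_b=9   # billions of parameters (assumed, for illustration only)
bits=4       # 4-bit quantisation
weights_gb=$(awk -v p="$params_b" -v b="$bits" 'BEGIN { printf "%.1f", p * b / 8 }')
echo "approx weight size: ${weights_gb} GB"
```

This is why a 4-bit quantisation of a model in this class lands in the 8–12 GB VRAM range once cache and overhead are included.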
Step 1: Install Ubuntu and NVIDIA Drivers
Install Ubuntu 22.04 LTS or 24.04 LTS on the target machine. After installation:
sudo apt update && sudo apt upgrade -y
sudo reboot
sudo ubuntu-drivers autoinstall
sudo reboot
nvidia-smi
The nvidia-smi command confirms the GPU is recognised. Output includes GPU model, driver version, and current VRAM usage. If nvidia-smi fails, run sudo ubuntu-drivers devices to list available drivers and install the recommended version manually (e.g., sudo apt install nvidia-driver-550).
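For scripts that depend on the GPU, the same check can be wrapped in a guard so a missing driver fails loudly rather than mid-run (a sketch):

```shell
# Guard against a missing driver install before relying on nvidia-smi.
if command -v nvidia-smi >/dev/null 2>&1; then
  status="driver tools present"
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  status="nvidia-smi not found; install the driver first"
fi
echo "$status"
```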
CUDA Toolkit installation is optional for the OpenClaw + Ollama stack. Ollama ships its own CUDA libraries. Install the Toolkit only if you plan to run additional CUDA workloads alongside Ollama.
Step 2: Install Ollama and Pull GLM-4.7
Ollama is a local model server that handles model downloads, quantisation management, context window configuration, and an OpenAI-compatible REST API.
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
systemctl status ollama
Pull the GLM-4.7-Flash model:
ollama pull glm-4.7-flash
This downloads approximately 5–6 GB of model weights. Confirm the model is working:
ollama run glm-4.7-flash "Say hello in one sentence."
Exit with /bye. If the model responds correctly, Ollama is functioning. Verify GPU acceleration by opening a second terminal and checking nvidia-smi while the model is running — VRAM usage should increase from near-zero to several gigabytes.
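Ollama also exposes an OpenAI-compatible endpoint at /v1/chat/completions, which is the API gateway tools like OpenClaw talk to. A sketch of exercising it from the shell (the request body is validated locally first; the commented curl line assumes the Ollama service above is running):

```shell
# Build a minimal chat-completion request body and validate the JSON locally.
body='{"model": "glm-4.7-flash", "messages": [{"role": "user", "content": "Say hello."}]}'
valid=$(echo "$body" | python3 -m json.tool >/dev/null && echo ok)
echo "request body: ${valid}"
# Send it to the local server (requires Ollama running):
# curl -s http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$body"
```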
Configure a 64k Context Window
OpenClaw’s agent layer benefits from a large context window. Create a custom Ollama model with extended context:
ollama create glm-4.7-flash-ctx -f - <<EOF
FROM glm-4.7-flash
PARAMETER num_ctx 65536
EOF
Use glm-4.7-flash-ctx as the model identifier in the OpenClaw configuration. This allows OpenClaw to maintain longer conversation histories and handle multi-turn tool calls reliably.
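The same Modelfile can be kept on disk instead of piped through stdin, which makes the context setting easy to version-control (the filename here is arbitrary):

```shell
# Persist the Modelfile so the num_ctx override survives shell history.
cat > Modelfile.glm-ctx <<'EOF'
FROM glm-4.7-flash
PARAMETER num_ctx 65536
EOF
grep -c "num_ctx 65536" Modelfile.glm-ctx
# Then create the model from the file:
# ollama create glm-4.7-flash-ctx -f Modelfile.glm-ctx
```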
Step 3: Install OpenClaw
The OpenClaw CLI is distributed as an npm package:
npm install -g openclaw
openclaw --version
The moltbot command alias is preserved for backward compatibility. Both openclaw and moltbot point to the same binary after installation.
If migrating from an existing MoltBot installation, run:
openclaw migrate
This copies the configuration from ~/.moltbot/config.yaml to ~/.openclaw/config.yaml while preserving all settings.
Step 4: Run the Onboarding Wizard
openclaw onboard
The wizard completes five configuration stages:
- Gateway initialisation — creates ~/.openclaw/config.yaml and registers the local device key.
- LLM provider selection — select “Ollama” from the provider list.
- Ollama endpoint — default is http://localhost:11434. Change this if Ollama runs on a remote host.
- Model selection — enter glm-4.7-flash-ctx (the custom context-extended model created in Step 2).
- Channel setup — select Telegram. The wizard prompts for a bot token.
Do not manually edit ~/.openclaw/config.yaml before running openclaw onboard at least once. The wizard registers the device key during initialisation; skipping it leaves the gateway unregistered.
Step 5: Create a Telegram Bot
Telegram bots are managed through the BotFather bot.
- Open Telegram and search for @BotFather.
- Send /newbot.
- Provide a display name (e.g., “My OpenClaw Assistant”) and a username ending in bot (e.g., myopenclaw_bot).
- Copy the API token returned by BotFather (format: 123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh).
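Before pasting the token into the wizard, a quick shape check catches copy-paste truncation (a sketch; the token below is this guide's placeholder, and the exact secret length Telegram issues is not guaranteed here, so the pattern is deliberately loose):

```shell
# Sanity-check the rough BotFather token shape: numeric bot ID, colon, secret.
token="123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
if printf '%s' "$token" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]{30,}$'; then
  result="token format looks valid"
else
  result="token format invalid"
fi
echo "$result"
```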
Return to the openclaw onboard wizard and paste this token when prompted for the Telegram bot token.
Step 6: Connect Telegram and Start the Gateway
The wizard writes the Telegram configuration to ~/.openclaw/config.yaml:
channels:
telegram:
enabled: true
botToken: "123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
dmPolicy: whitelist
whitelist:
- YOUR_TELEGRAM_USER_ID
Find your Telegram user ID by messaging @userinfobot — it returns your numeric ID. Add this ID to the whitelist to restrict the bot to your account only.
Restart the gateway and verify status:
openclaw gateway restart
openclaw status
Successful output shows: gateway: running, provider: ollama, channels.telegram: connected.
Verification
Send a message to your Telegram bot. The first response may take 5–15 seconds (model warmup). Subsequent responses arrive within 1–3 seconds on typical hardware.
Test cases to verify the full pipeline:
- Basic chat: Confirm the model responds coherently and does not reference any cloud API.
- GPU utilisation: Check nvidia-smi during inference — VRAM usage should spike, confirming local GPU execution.
- Context retention: Send follow-up questions referencing earlier messages — the 64k context window should maintain the conversation correctly.
Troubleshooting Reference
Gateway fails to start: Verify Node.js 20+ is installed (node --version), confirm Ollama is running (systemctl status ollama), and check logs (openclaw logs --tail 50).
Telegram bot receives no response: Confirm the bot token in ~/.openclaw/config.yaml is correct. Verify your Telegram user ID is in the whitelist if dmPolicy: whitelist is set. Test Ollama directly: curl http://localhost:11434/api/tags.
Ollama runs on CPU instead of GPU: Run nvidia-smi to confirm the GPU is visible. Check ollama logs for CUDA initialisation errors. Ensure Ollama is not running inside a container without GPU passthrough.
Slow responses: First responses after model load are always slow (warmup). If all responses are slow, check nvidia-smi dmon — if GPU utilisation stays below 30%, the model may be exceeding VRAM and spilling to system RAM. Switch to glm-4.7-flash if using the full model on a card with under 16 GB VRAM.
Cloud AI vs Local OpenClaw: Key Differences
| Dimension | Cloud AI (ChatGPT/Claude) | OpenClaw + Ollama |
|---|---|---|
| Cost | $20–$200+/month at scale | Zero ongoing cost |
| Rate limits | Yes (API quotas) | None (GPU is the ceiling) |
| Data privacy | Prompts sent to vendor | Never leaves your machine |
| Offline capability | No | Yes (after model pull) |
| Model flexibility | Vendor-controlled | Any Ollama-compatible model |
| Setup complexity | Low | Medium (one-time) |
Extending OpenClaw
OpenClaw supports additional channels beyond Telegram. WhatsApp, Slack, Discord, Google Chat, Signal, and iMessage are all configurable through the same ~/.openclaw/config.yaml file or via openclaw onboard. Each channel requires its own credentials.
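Each additional channel block mirrors the Telegram example from Step 6. As an illustration only — the slack field names below are assumptions extrapolated from the Telegram block, not a confirmed OpenClaw schema; check the project's channel documentation for the real keys:

```yaml
channels:
  telegram:
    enabled: true
    botToken: "123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh"
  slack:                # hypothetical block for illustration
    enabled: true
    botToken: "xoxb-your-slack-bot-token"
```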
OpenClaw integrates with the Model Context Protocol (MCP) for tool connectivity — external APIs, file systems, databases, and web search can all be connected as tools available to the agent layer. See the MCP documentation for configuration details.
Custom skills extend the agent’s capabilities and are distributed as npm packages. Install a skill with openclaw skill install <package-name> and it becomes available as a tool in the conversation.
About Musketeers Tech
Musketeers Tech is a software development company specialising in AI agent systems, generative AI applications, and digital transformation. We build production-grade AI assistants, RAG pipelines, and multi-agent workflows for businesses across the UK, US, and Canada.
Services: https://musketeerstech.com/services/ai-agent-development/
Generative AI: https://musketeerstech.com/services/generative-ai-application-services/
Portfolio: https://musketeerstech.com/portfolio/
Contact: https://musketeerstech.com/contact/