Inside OpenClaw #5: Production-Ready AI on WSL2

WSL2 is a development tool — not a production platform. How we hardened it for 24/7 operation of a local AI agent: vmIdleTimeout, systemd, keepalive, and the pitfalls no documentation mentions.

Your inference server runs fine on Friday afternoon. You set the right vLLM flags, tool calling works, the agent responds correctly. Monday morning, the server is dead. No crash log. No error message. WSL2 simply stopped running.

This is the gap between “works on my machine” and “runs in production.” And it has nothing to do with code. Production stability of AI systems is one of the core topics in my AI and automation consulting.

WSL2 Was Never Meant for This

Microsoft built WSL2 as a development environment — a way for programmers to run Linux tools on Windows during the workday. It was not designed for an inference server that needs to be reachable around the clock.

Out of the box, three problems make WSL2 unsuitable for production AI workloads:

  • Idle shutdown: The VM shuts down after perceived inactivity, killing all services
  • Uncontrolled memory: Without explicit limits, WSL2 consumes all available RAM or gets killed by the host
  • Network instability: NAT networking causes DNS failures and dropped connections

In the OpenClaw series, we described how our AI agent runs locally on a single GPU. This article covers what it took to keep that system running continuously without manual intervention.

vmIdleTimeout: The Silent Killer

WSL2 monitors process activity. When the hypervisor decides nothing meaningful is happening — for example, because your inference server is idle, waiting for the next request — it can shut down the VM entirely. No warning.

For a service that must stay available, this is the first thing to fix:

# %USERPROFILE%\.wslconfig
[wsl2]
vmIdleTimeout=-1

Setting vmIdleTimeout=-1 disables the idle timeout completely. The VM stays running until you explicitly stop it or shut down the host machine. Without this single line, your inference server dies the moment it has a few quiet minutes. One caveat: edits to .wslconfig take effect only after a full restart of the VM. Run wsl --shutdown on the Windows host, then start the distribution again.

Controlling Memory Explicitly

By default, WSL2 can claim up to 50% of the host’s RAM. On a 64 GB system running an RTX 4090 and a 24B model, that means up to 32 GB can vanish into the VM, leaving Windows and the GPU driver to compete for the rest. At the same time, the AI stack has real memory needs of its own:

  • vLLM needs system RAM for tokenizer operations, scheduling, and KV-cache overflow
  • The Gateway process manages sessions, tool calls, and conversation context
  • Linux itself needs memory for systemd, networking, and monitoring

Our production .wslconfig:

[wsl2]
vmIdleTimeout=-1
memory=24GB
swap=8GB
processors=8

  • memory=24GB — Hard cap. Prevents WSL2 from starving Windows and avoids OOM kills from the host.
  • swap=8GB — Absorbs load spikes without process termination.
  • processors=8 — On a 16-core machine, leaves the other half for Windows.
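
After a restart, it is worth verifying inside the distribution that the caps actually took effect. With the config above, you should see roughly these numbers:

```shell
# Run inside the WSL2 distribution after wsl --shutdown and a fresh start.
grep MemTotal /proc/meminfo    # with memory=24GB: roughly 24 GB (reported in kB)
grep SwapTotal /proc/meminfo   # with swap=8GB: roughly 8 GB
nproc                          # with processors=8: prints 8
```

If MemTotal still shows half the host’s RAM, the .wslconfig was not picked up, usually because the VM was only restarted with exit/enter instead of wsl --shutdown.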

systemd: Real Service Management

WSL2 has supported systemd natively since 2022. This means you can run vLLM and the OpenClaw Gateway as proper Linux services with automatic restart on failure:

# /etc/systemd/system/vllm.service
[Unit]
Description=vLLM Inference Server
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/opt/vllm/start.sh
Restart=always
RestartSec=10
Environment=CUDA_VISIBLE_DEVICES=0

[Install]
WantedBy=multi-user.target

Key details:

  • Restart=always brings the service back after any failure, including OOM kills
  • After=network-online.target waits for a working network connection before starting
  • RestartSec=10 gives the GPU ten seconds to release VRAM before the restart
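
The Gateway gets the same treatment. A sketch of a companion unit; the unit name, path, and start script here are illustrative, not taken from the repository:

```ini
# /etc/systemd/system/openclaw-gateway.service  (illustrative name and paths)
[Unit]
Description=OpenClaw Gateway
After=vllm.service network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/opt/openclaw/gateway/start.sh
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Ordering the Gateway After=vllm.service means it only starts once the inference server’s unit is up. Activate both with systemctl daemon-reload followed by systemctl enable --now vllm openclaw-gateway.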

Enable systemd in wsl.conf:

# /etc/wsl.conf
[boot]
systemd=true

Fixing the Network

WSL2 defaults to NAT networking. The Linux VM gets an internal IP, and Windows handles port forwarding. In practice, this causes two problems:

  • DNS failures: The auto-generated /etc/resolv.conf points to a Windows DNS proxy that intermittently stops responding
  • Dropped connections: Long-lived HTTP connections to the inference server get killed by NAT timeouts

Our DNS fix:

# /etc/wsl.conf
[network]
generateResolvConf=false

# /etc/resolv.conf
nameserver 8.8.8.8
nameserver 1.1.1.1
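
One more pitfall: /etc/resolv.conf is normally a symlink that WSL recreates at startup, so the static file above can vanish after a restart even with generateResolvConf=false. A common workaround is to replace the symlink with a regular file and mark it immutable. Note that chattr +i is deliberately heavy-handed; remove the flag with chattr -i before editing the file later.

```shell
# Inside WSL2, as root. Replace the auto-generated symlink with a real file.
rm -f /etc/resolv.conf
printf 'nameserver 8.8.8.8\nnameserver 1.1.1.1\n' > /etc/resolv.conf
chattr +i /etc/resolv.conf   # immutable: nothing can silently overwrite it
```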

For reliable external access, we configure explicit port forwarding via netsh rather than relying on WSL2’s automatic forwarding.
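
The exact commands depend on the setup, but a sketch of that forwarding looks like this, using this article’s port 8000. It needs Windows admin rights; WSL interop lets you call netsh.exe from inside the VM. Because the VM’s NAT address changes on every restart, this belongs in a startup script rather than a one-off command:

```shell
#!/bin/bash
# Re-create host->VM port forwarding for the inference server (port 8000).
# Requires an elevated context on the Windows side.
WSL_IP=$(hostname -I | awk '{print $1}')   # current NAT address of the VM

# Remove any stale rule, then forward host:8000 -> VM:8000
netsh.exe interface portproxy delete v4tov4 listenport=8000 listenaddress=0.0.0.0 >/dev/null 2>&1 || true
netsh.exe interface portproxy add v4tov4 listenport=8000 listenaddress=0.0.0.0 \
  connectport=8000 connectaddress="$WSL_IP"
```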

Keepalive: Active Monitoring

Even with vmIdleTimeout=-1, we run an active keepalive. A systemd timer checks every 60 seconds whether the inference server responds — and restarts it if it doesn’t:

#!/bin/bash
# /opt/openclaw/keepalive.sh
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
  --max-time 10 http://localhost:8000/health)

if [ "$RESPONSE" != "200" ]; then
  systemctl restart vllm
  logger "OpenClaw Keepalive: vLLM restarted (HTTP $RESPONSE)"
fi
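
The timer half is not shown above. A minimal pair of units that would drive this script every 60 seconds; the unit names are illustrative:

```ini
# /etc/systemd/system/openclaw-keepalive.service  (illustrative name)
[Unit]
Description=OpenClaw health check

[Service]
Type=oneshot
ExecStart=/opt/openclaw/keepalive.sh

# /etc/systemd/system/openclaw-keepalive.timer
[Unit]
Description=Run the OpenClaw health check every 60 seconds

[Timer]
OnBootSec=60s
OnUnitActiveSec=60s
AccuracySec=1s

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now openclaw-keepalive.timer; systemctl list-timers then shows the next scheduled run.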

In three months of operation, this script caught seven silent failures that would have gone unnoticed until someone tried to use the system. Without monitoring, there is no production stability.

Backup and Recovery

WSL2 distributions can be exported as tar archives:

wsl --export Ubuntu-OpenClaw D:\Backups\openclaw-backup.tar

After a total failure, a working copy can be restored in under five minutes. Configuration files — .wslconfig, systemd units, scripts — are additionally tracked in Git.
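
Restoring is the mirror command: wsl --import Ubuntu-OpenClaw <install-dir> D:\Backups\openclaw-backup.tar. To keep scheduled exports from filling the backup drive, a pruning step along these lines can run after each export. The directory layout and file naming here are illustrative:

```shell
#!/bin/bash
# Prune old WSL export archives, keeping only the newest ones.
# Sketch of a rotation step to pair with a scheduled `wsl --export`;
# the openclaw-backup-*.tar naming convention is assumed, not prescribed.
prune_backups() {
  local dir="$1" keep="${2:-4}"           # keep the 4 newest by default
  ls -1t "$dir"/openclaw-backup-*.tar 2>/dev/null \
    | tail -n +"$((keep + 1))" \
    | xargs -r rm -f --                   # delete everything older
}
```

Called as prune_backups /mnt/d/Backups 4, it deletes all but the four most recent archives and does nothing if fewer exist.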

What It Comes Down To

The difference between a demo and a production system on WSL2 is not code. It is configuration. A handful of files determine whether your local AI agent survives the weekend:

  • .wslconfig — Idle timeout, RAM, swap, CPU limits
  • /etc/wsl.conf — systemd activation, DNS control
  • systemd units — Service management with auto-restart
  • Keepalive script — Active health monitoring
  • Backup routine — Recovery without rebuilding

WSL2 makes local AI on Windows possible without dual-boot setups or dedicated server hardware. But production stability requires deliberate hardening — and the documentation won’t tell you that.


Next Step

Running local AI infrastructure and need it to stay up? I’ve solved these problems in practice and will help you move from prototype to production.

Book a free consultation

→ Or read more first: Local LLM as AI Agent — Without Cloud

About the Author René Pfisterer

10+ years in ERP integration, data migration, and process automation for mid-sized companies. Specialized in DATEV, SAP, and AI implementation.

Full profile →
