Skip to Content
Referencevmkit-agent Reference

vmkit-agent Reference

vmkit-agent is a Go binary that runs as a systemd daemon on every VM provisioned by VMKit. It maintains a persistent outbound WebSocket connection to the VMKit control plane and executes RPC commands sent down that connection.

The agent is the exclusive channel between the VMKit backend and your VM. The backend never opens inbound connections to the VM’s IP.


Installation

Automatic (standard path). Cloud-init runs on first boot and handles the full installation: Docker CE, the vmkit-agent binary, and the systemd unit. No manual step is required.

Manual (advanced). See Manual Installation below.


WebSocket Protocol

PropertyValue
Endpointwss://gateway.vmkit.dev/api/daemon
AuthAuthorization: Bearer <agent-token> header on the initial upgrade request
Token sourceWritten to /etc/vmkit/agent.env by cloud-init at provision time
DirectionPull model — agent calls out; backend never initiates inbound connections
ReconnectExponential backoff with jitter; indefinite retry
HeartbeatAgent sends a ping frame every 30 seconds; backend expects a pong within 10 seconds

The agent token is a workspace-scoped secret generated by the backend at provision time. It is distinct from user vmk_ API keys and cannot be used against the REST API.


Daemon Handlers

These are the RPC commands the backend can send over the WebSocket connection.

HandlerTriggerInputsOutputs
registerAgent startupversion: string, instance_id: string, ip: string, os: string, arch: string{ "ok": true }
logs_fetchDashboard log view, MCP get_logs, GET /api/deploys/{id}/logscontainer: string, lines: integer (default 100, max 5000){ "lines": [{ "ts": "<RFC3339>", "text": "<string>" }] }
metrics_collectBackend polling (every 60 s)(none)See Health Reporting
agent_upgradeBackend-initiated upgradeversion: string, download_url: string, checksum_sha256: string{ "status": "upgrading" } — agent restarts; backend detects reconnect

Handler details

register — sent automatically by the agent on every WebSocket connection (initial and reconnects). The backend uses this to mark the instance as agent_connected: true and record the agent version.

logs_fetch — the agent calls docker logs --tail <lines> <container> against the Docker daemon on the VM and streams the output back as newline-delimited JSON. Requires the Docker socket to be accessible to the vmkit-agent process (configured during hardening).

metrics_collect — the agent reads /proc/meminfo, /proc/stat, df, and docker ps to collect the metrics payload. See Health Reporting.

agent_upgrade — the agent downloads the binary from download_url, verifies SHA-256 against checksum_sha256, atomically replaces /usr/local/bin/vmkit-agent, and issues systemctl restart vmkit-agent. The backend waits for the reconnect register call to confirm the upgrade succeeded.


Health Reporting

The agent pushes a health snapshot to the backend after every metrics_collect call. The backend stores the latest snapshot per instance and exposes it at GET /api/instances/{id}.

MetricSourceUnits
cpu_pct/proc/stat (1-second sample)Percentage (0–100)
mem_pct/proc/meminfo (MemUsed / MemTotal)Percentage (0–100)
disk_pctdf / (used / total)Percentage (0–100)
container_countdocker ps -q | wc -lInteger
container_healthdocker inspect --format '{{.State.Health.Status}}' for each containerMap of container_name → "healthy" | "unhealthy" | "starting" | "none"

A missing heartbeat (no pong response within 10 seconds) marks the instance as agent_connected: false in the backend and triggers a dashboard warning. The next successful reconnect clears the warning.


Self-Upgrade

  1. Backend calls agent_upgrade with version, download_url, and checksum_sha256.
  2. Agent downloads the binary to /tmp/vmkit-agent-<version>.
  3. Agent verifies SHA-256. Aborts and returns an error if the checksum does not match.
  4. Agent copies the new binary to /usr/local/bin/vmkit-agent (atomic rename).
  5. Agent calls systemctl restart vmkit-agent (systemd restarts it as a new process).
  6. The new process starts, connects to the WebSocket, and sends register with the new version.
  7. Backend records the updated agent_version on the instance.

The old process is terminated by systemd as part of the restart. There is a brief (typically < 5 s) window where the agent is not connected.


Manual Installation

Use this if you are provisioning a VM outside the normal VMKit flow, or recovering a VM whose cloud-init failed.

cloud-init snippet

#cloud-config packages: - docker.io - curl runcmd: - mkdir -p /etc/vmkit - echo "VMKIT_AGENT_TOKEN=<your-agent-token>" > /etc/vmkit/agent.env - echo "VMKIT_GATEWAY_URL=wss://gateway.vmkit.dev/api/daemon" >> /etc/vmkit/agent.env - curl -fsSL https://github.com/vmkit/vmkit-agent/releases/latest/download/vmkit-agent-linux-amd64 \ -o /usr/local/bin/vmkit-agent - chmod +x /usr/local/bin/vmkit-agent - systemctl enable vmkit-agent - systemctl start vmkit-agent

Replace <your-agent-token> with the token from your environment’s compute target configuration. Contact support if you need to retrieve this token for an existing environment.

systemd service unit

/etc/systemd/system/vmkit-agent.service:

[Unit] Description=vmkit-agent daemon After=network-online.target docker.service Wants=network-online.target Requires=docker.service [Service] Type=simple EnvironmentFile=/etc/vmkit/agent.env ExecStart=/usr/local/bin/vmkit-agent Restart=always RestartSec=5s StandardOutput=journal StandardError=journal SyslogIdentifier=vmkit-agent [Install] WantedBy=multi-user.target

After placing this file, run:

systemctl daemon-reload systemctl enable vmkit-agent systemctl start vmkit-agent

Logs

Agent logs are written to the systemd journal under the vmkit-agent identifier.

# Follow live journalctl -u vmkit-agent -f # Last 200 lines journalctl -u vmkit-agent -n 200 # Since last boot journalctl -u vmkit-agent -b

Log level is INFO by default. Set VMKIT_LOG_LEVEL=debug in /etc/vmkit/agent.env and restart the service for verbose output.


Troubleshooting

Agent not connecting

Symptom: Dashboard shows agent_connected: false for the instance. GET /api/instances/{id} returns "agent_connected": false.

Checklist:

  1. Verify outbound port 443 is allowed. The agent only needs outbound TCP 443 — no inbound ports.
  2. Check the agent token is correct: grep VMKIT_AGENT_TOKEN /etc/vmkit/agent.env.
  3. Check agent logs: journalctl -u vmkit-agent -n 50.
  4. Confirm the service is running: systemctl status vmkit-agent.
  5. Test DNS resolution from the VM: curl -I https://gateway.vmkit.dev/health.

The VMKit control plane does not need to reach your VM’s IP. If you can confirm outbound 443 is open from the VM, connectivity issues are almost always a bad agent token or a misconfigured VMKIT_GATEWAY_URL.

Agent not reporting metrics

Symptom: metrics in GET /api/instances/{id} is null or stale (> 5 minutes old).

Checklist:

  1. Verify the vmkit-agent process user has access to the Docker socket:
    ls -la /var/run/docker.sock # Should show docker group ownership; vmkit-agent must be in the docker group
  2. If the socket permissions are wrong, re-run hardening by calling POST /api/instances/{id}/harden from the dashboard, or calling the harden_vm MCP tool with the instance ID.
  3. Check agent logs for permission denied on /var/run/docker.sock.

Agent upgrade stuck

Symptom: agent_version in GET /api/instances/{id} shows the old version after an upgrade was issued.

Checklist:

  1. Check agent logs around the time of the upgrade: journalctl -u vmkit-agent --since "10 minutes ago".
  2. Look for checksum mismatch errors — this aborts the upgrade and leaves the old binary in place.
  3. If systemd failed to restart: systemctl status vmkit-agent and journalctl -u vmkit-agent -n 20.
  4. To force-upgrade manually: download the binary, verify the checksum, and restart the service.
Last updated on