KIND Deployment Guide

Overview

KIND (Kubernetes IN Docker) is used for local development and CI integration testing. The project provides a full bootstrap flow: create a KIND cluster, install Calico CNI for NetworkPolicy enforcement, build the container image, load it into KIND nodes, and deploy via Helm.

Bootstrap Flow

The bootstrap process follows five sequential steps from cluster creation to Helm deployment.

Bootstrap Flow

Five steps from cluster creation to deployment.

Calico CNI

KIND ships with a basic CNI (kindnet) that does not enforce NetworkPolicy resources. Calico replaces it with a full CNI that provides NetworkPolicy enforcement, ensuring the chart's NetworkPolicy templates actually take effect during integration tests.

The install-calico.sh script handles the full installation:

Installs tigera-operator -- Deploys the Calico operator. Version configurable via CALICO_VERSION env (default: 3.31.4).
Waits for CRDs -- The operator creates CRDs asynchronously. The script polls for installations.operator.tigera.io before applying custom resources.
Fixes Reverse Path Filtering -- KIND nodes use loose RPF (rp_filter=2). Calico's Felix rejects this by default. Setting FELIX_IGNORELOOSERPF=true tells Felix to accept the loose setting.
Restarts CoreDNS -- CoreDNS pods scheduled before Calico have stale network config from kindnet. Restarting ensures fresh Calico network interfaces for correct DNS with NetworkPolicy enforcement.

Container Startup Flow

The container entrypoint (entrypoint.sh) is invoked by tini (PID 1) and handles mode validation, authentication, skill staging, and mode dispatch.

Container Startup Flow

Entrypoint decision tree from tini through mode dispatch.

Health Probes

The chart configures two Kubernetes probes with distinct purposes and performance characteristics:

Liveness Probe (healthcheck.sh)

Check: pgrep -f "claude" -- Is the Claude Code process running?
Weight: Lightweight -- single pgrep call
Failure action: Container restarted by kubelet
CLAUDE_TEST_MODE bypass: Returns 0 immediately (CI pods run sleep infinity, not Claude)

Readiness Probe (readiness.sh)

Check: claude auth status -- Is Claude authenticated and ready?
Weight: Heavier -- runs claude auth status subprocess
Failure action: Pod removed from Service endpoints (not restarted)
CLAUDE_TEST_MODE bypass: Returns 0 immediately
Why periodSeconds=30: Shorter intervals would overlap auth-check subprocesses and spike resource usage

Integration Testing

The project uses BATS (Bash Automated Testing System) for integration testing in KIND:

Test framework: BATS -- setup-bats.sh for local install, apt-get install bats in CI
Test suites: RBAC permissions (01-rbac.bats), networking (02-networking.bats), tool verification (03-tools.bats), persistence (04-persistence.bats)
Auth bypass: CLAUDE_TEST_MODE=true -- entrypoint runs sleep infinity instead of Claude, probes return 0, enabling auth-less testing of infrastructure

Quick Start

Bootstrap the cluster, then attach and run /login to authenticate via browser OAuth. Credentials are saved to the PVC and persist across pod restarts — only needed once.

bootstrap

$ git clone https://github.com/PatrykQuantumNomad/claude-in-a-box.git

$ cd claude-in-a-box

$ make bootstrap

$ kubectl attach claude-agent-0 -it

$ # Inside Claude Code: /login