KIND Deployment Guide
Overview
KIND (Kubernetes IN Docker) is used for local development and CI integration testing. The project provides a full bootstrap flow: create a KIND cluster, install Calico CNI for NetworkPolicy enforcement, build the container image, load it into KIND nodes, and deploy via Helm.
Bootstrap Flow
The bootstrap process follows five sequential steps from cluster creation to Helm deployment.
Bootstrap Flow
Five steps from cluster creation to deployment.
Calico CNI
KIND ships with a basic CNI (kindnet) that does not enforce NetworkPolicy resources. Calico replaces it with a full CNI that provides NetworkPolicy enforcement, ensuring the chart's NetworkPolicy templates actually take effect during integration tests.
The install-calico.sh script handles the full installation:
- Installs tigera-operator -- Deploys the Calico operator. Version configurable via
CALICO_VERSIONenv (default: 3.31.4). - Waits for CRDs -- The operator creates CRDs asynchronously. The script polls for
installations.operator.tigera.iobefore applying custom resources. - Fixes Reverse Path Filtering -- KIND nodes use loose RPF (
rp_filter=2). Calico's Felix rejects this by default. SettingFELIX_IGNORELOOSERPF=truetells Felix to accept the loose setting. - Restarts CoreDNS -- CoreDNS pods scheduled before Calico have stale network config from kindnet. Restarting ensures fresh Calico network interfaces for correct DNS with NetworkPolicy enforcement.
Container Startup Flow
The container entrypoint (entrypoint.sh) is invoked by tini (PID 1) and handles mode validation, authentication, skill staging, and mode dispatch.
Container Startup Flow
Entrypoint decision tree from tini through mode dispatch.
Health Probes
The chart configures two Kubernetes probes with distinct purposes and performance characteristics:
Liveness Probe (healthcheck.sh)
- Check:
pgrep -f "claude"-- Is the Claude Code process running? - Weight: Lightweight -- single pgrep call
- Failure action: Container restarted by kubelet
- CLAUDE_TEST_MODE bypass: Returns 0 immediately (CI pods run
sleep infinity, not Claude)
Readiness Probe (readiness.sh)
- Check:
claude auth status-- Is Claude authenticated and ready? - Weight: Heavier -- runs
claude auth statussubprocess - Failure action: Pod removed from Service endpoints (not restarted)
- CLAUDE_TEST_MODE bypass: Returns 0 immediately
- Why periodSeconds=30: Shorter intervals would overlap auth-check subprocesses and spike resource usage
Integration Testing
The project uses BATS (Bash Automated Testing System) for integration testing in KIND:
- Test framework: BATS --
setup-bats.shfor local install,apt-get install batsin CI - Test suites: RBAC permissions (01-rbac.bats), networking (02-networking.bats), tool verification (03-tools.bats), persistence (04-persistence.bats)
- Auth bypass:
CLAUDE_TEST_MODE=true-- entrypoint runssleep infinityinstead of Claude, probes return 0, enabling auth-less testing of infrastructure
Quick Start
Bootstrap the cluster, then attach and run /login to authenticate via browser OAuth. Credentials are saved to the PVC and persist across pod restarts — only needed once.