Getting Started with kubeqa: A Step-by-Step Tutorial for Kubernetes QA
A hands-on tutorial for getting started with kubeqa - install the CLI, run your first cluster health scan, set up chaos experiments, configure compliance auditing, and add deployment gates to your CI/CD pipeline.
This tutorial walks you through installing kubeqa and using all four of its core modules: cluster health scanning, chaos engineering, compliance auditing, and deployment gates. By the end, you will have a complete Kubernetes QA workflow running against your cluster.
The entire tutorial takes about 20 minutes. You need a running Kubernetes cluster (minikube, kind, or any cloud provider) and kubectl configured to access it.
Step 1: Install kubeqa
kubeqa is distributed as a single Go binary. No cluster-side agents, no Helm charts, no CRDs to install. It talks to the Kubernetes API using your existing kubeconfig.
macOS (Homebrew):
brew install nomadx-ae/tap/kubeqa
Linux:
curl -sSL https://get.kubernetes.qa | bash
From source:
go install github.com/nomadx-ae/kubeqa@latest
Verify the installation:
$ kubeqa version
kubeqa v0.9.0 (go1.22, linux/amd64)
kubeqa uses your default kubeconfig (~/.kube/config) automatically. If you use a non-default path, set the KUBECONFIG environment variable or pass --kubeconfig to any command.
$ kubeqa cluster info
Cluster: minikube
Server: https://192.168.49.2:8443
Version: v1.30.0
Nodes: 1
Pods: 22 (across 4 namespaces)
Access: read-only verified ✓
kubeqa only needs read access to your cluster. It never creates, modifies, or deletes resources during health scans or compliance audits. The chaos module does require write access to the target namespace, but only when you explicitly run chaos experiments.
Step 2: Run your first health scan
The cluster health scanner evaluates your cluster across eight dimensions and produces an overall health score.
$ kubeqa health scan
That is the entire command. kubeqa scans every namespace (excluding kube-system by default) and generates a report:
Cluster Health Score: 72/100
Dimension         Score    Findings
──────────────────────────────────────────────────────────
Resources         65/100   8 pods without resource limits
Security          78/100   2 privileged containers, 3 root users
Networking        85/100   1 service without network policy
Storage           92/100   1 PVC near capacity (87%)
Availability      58/100   4 single-replica deployments
Observability     70/100   6 pods without readiness probes
Configuration     80/100   3 configmaps with large payloads
Cost Efficiency   88/100   2 over-provisioned deployments
Top 5 Recommendations:
1. Add replicas to single-replica deployments in production
2. Define resource limits for all workloads
3. Add readiness probes to pods missing them
4. Remove privileged security contexts
5. Add network policies to exposed services
Understanding the health score
Each dimension is scored from 0 to 100. The overall score is a weighted average, with security and availability weighted higher than cost efficiency and configuration. You can customize the weights in your kubeqa config file.
The scoring is deterministic - the same cluster state always produces the same score. This makes the health score useful as a tracked metric over time.
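As a rough illustration, a weighted average over per-dimension scores could look like the sketch below. The weights here are taken from the example config later in this tutorial, not from kubeqa's actual defaults, and `overall_score` is a hypothetical helper, not part of kubeqa:

```python
# Illustrative sketch of a weighted health score. The weights below
# mirror the example config in Step 6; kubeqa's real defaults and
# rounding behavior may differ.
def overall_score(scores: dict[str, int], weights: dict[str, float]) -> int:
    weighted = sum(scores[d] * weights.get(d, 1.0) for d in scores)
    total_weight = sum(weights.get(d, 1.0) for d in scores)
    return round(weighted / total_weight)

scores = {"security": 78, "availability": 58, "resources": 65}
weights = {"security": 2.0, "availability": 1.5, "resources": 1.0}
print(overall_score(scores, weights))  # 68 -- availability drags the score down
```

Because the calculation is a pure function of cluster state, two scans of the same cluster produce the same number, which is what makes the score trackable over time.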
Targeting specific namespaces
For large clusters, you may want to scan specific namespaces:
# Scan only production
$ kubeqa health scan --namespace production
# Scan multiple namespaces
$ kubeqa health scan --namespace production,staging
# Include system namespaces
$ kubeqa health scan --include-system
Exporting results
kubeqa supports multiple output formats for integration with dashboards and alerting:
# JSON output for programmatic consumption
$ kubeqa health scan --output json > health-report.json
# Markdown for documentation
$ kubeqa health scan --output markdown > health-report.md
# JUnit XML for CI/CD integration
$ kubeqa health scan --output junit > health-report.xml
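The JSON output is the natural entry point for scripting. The exact schema is not documented here, so the shape below is an assumption for illustration; adjust the field names to whatever `--output json` actually emits:

```python
import json

# Assumed report shape -- the real JSON schema emitted by
# `kubeqa health scan --output json` may use different field names.
report = json.loads("""
{
  "score": 72,
  "dimensions": [
    {"name": "Availability", "score": 58, "findings": 4},
    {"name": "Security", "score": 78, "findings": 5}
  ]
}
""")

# Flag any dimension below a threshold, e.g. to drive a CI alert.
weak = [d["name"] for d in report["dimensions"] if d["score"] < 70]
print(report["score"], weak)
```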
Step 3: Set up a chaos experiment
Now that you know your cluster’s health baseline, test its resilience by injecting controlled failures.
Create a test workload
If you do not have a production workload to test against, deploy a simple test application:
$ kubectl create namespace chaos-test
$ kubectl apply -n chaos-test -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: app
        image: nginx:1.27
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
EOF
Run a pod failure experiment
$ kubeqa chaos run pod-failure \
--namespace chaos-test \
--deployment demo-app \
--count 1 \
--duration 30s
[1/5] Validating steady state...
✓ demo-app: 3/3 pods ready
[2/5] Injecting failure: killing 1 pod...
✓ Pod demo-app-6d8b9 terminated
[3/5] Observing impact (30s window)...
- Available replicas: 2/3 → 3/3 (recovered)
[4/5] Verifying recovery...
✓ Replacement pod ready in 7.3s
✓ Steady state restored
[5/5] Resilience score: 5/5
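The five phases above follow the classic chaos-engineering loop: verify steady state, inject a failure, observe, and confirm recovery within a time budget. A minimal sketch of that control flow, with `ready()` and `kill_pod()` standing in for real Kubernetes API calls (both are hypothetical stand-ins, not kubeqa APIs):

```python
import time

# Sketch of the pod-failure experiment lifecycle. ready() returns the
# number of ready replicas; kill_pod() deletes one pod. Both are
# placeholders for real Kubernetes API calls.
def run_pod_failure(ready, kill_pod, replicas: int, timeout_s: float = 30.0) -> bool:
    if ready() != replicas:                  # 1. validate steady state
        raise RuntimeError("steady state not met; aborting experiment")
    kill_pod()                               # 2. inject failure
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:       # 3-4. observe and verify recovery
        if ready() == replicas:
            return True
        time.sleep(0.01)
    return False                             # recovery did not happen in time

# Simulated cluster: one pod dies, the controller brings a replacement up.
state = {"ready": 3, "checks": 0}
def ready():
    state["checks"] += 1
    if state["checks"] > 2:                  # replacement pod becomes ready
        state["ready"] = 3
    return state["ready"]
def kill_pod():
    state["ready"] = 2

print(run_pod_failure(ready, kill_pod, replicas=3))  # True
```

The important design point is the up-front steady-state check: if the system is already degraded, the experiment aborts rather than compounding an outage.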
Experiment configuration files
For repeatable experiments, define them in YAML:
# .kubeqa/chaos-experiments.yaml
experiments:
  - name: api-pod-failure
    type: pod-failure
    target:
      namespace: production
      deployment: api-gateway
    params:
      count: 1
      duration: 60s
    safety:
      abort-on: error-rate>25%
      max-duration: 120s
  - name: db-network-partition
    type: network-partition
    target:
      namespace: production
      source: api-gateway
      destination: postgres
    params:
      duration: 30s
    safety:
      abort-on: error-rate>10%
Run all experiments in the suite:
$ kubeqa chaos suite run --config .kubeqa/chaos-experiments.yaml
Step 4: Configure compliance auditing
kubeqa ships with built-in compliance profiles for the most common frameworks. No external policy files to download or maintain.
Run a CIS Benchmark audit
$ kubeqa compliance audit --framework cis-1.8
CIS Kubernetes Benchmark v1.8
─────────────────────────────
Pass: 118/142 controls (83.1%)
Fail: 14 controls
Skip: 10 controls (not applicable)
Critical Failures:
✗ 1.2.5 Ensure --kubelet-certificate-authority is set
✗ 4.2.1 Minimize access to secrets
✗ 5.2.2 Minimize admission of privileged containers
High Failures:
✗ 5.2.6 Minimize admission of root containers
✗ 5.7.4 Default namespace should not be used
...
Available frameworks
$ kubeqa compliance list-frameworks
Framework     Controls   Description
─────────────────────────────────────────────────────────
cis-1.8       142        CIS Kubernetes Benchmark v1.8
cis-1.7       138        CIS Kubernetes Benchmark v1.7
nsa-cisa      68         NSA/CISA Kubernetes Hardening Guide
soc2          45         SOC 2 Type II (mapped to K8s controls)
hipaa         38         HIPAA Security Rule (mapped to K8s)
pci-dss-4.0   52         PCI DSS v4.0 (mapped to K8s)
nesa          34         UAE NESA Information Assurance
nca           29         Saudi NCA Essential Controls
Continuous compliance monitoring
Set up a recurring compliance check that alerts on drift:
# Run audit and compare against last baseline
$ kubeqa compliance audit --framework cis-1.8 --diff-baseline
Compared against baseline from 2026-02-28:
✓ 2 previously failing controls now pass
✗ 1 previously passing control now fails:
5.2.2 - New privileged container detected in namespace staging
Compliance trend: improving (83.1% → 84.5%)
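Drift detection reduces to a set difference over failing control IDs: anything that left the failing set is fixed, anything that entered it is a regression. A minimal sketch (the `diff_baseline` helper is hypothetical; control IDs are taken from the audit output above):

```python
# Sketch of baseline drift detection: compare the failing control IDs
# from the stored baseline against the current audit run.
def diff_baseline(old_failing: set[str], new_failing: set[str]) -> dict[str, list[str]]:
    return {
        "fixed": sorted(old_failing - new_failing),      # failed before, pass now
        "regressed": sorted(new_failing - old_failing),  # passed before, fail now
    }

baseline = {"1.2.5", "4.2.1", "5.2.6"}   # failing as of 2026-02-28
current = {"1.2.5", "5.2.2"}             # failing today
print(diff_baseline(baseline, current))
```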
Export compliance evidence for auditors:
$ kubeqa compliance report --framework soc2 --output pdf --evidence
Report saved: soc2-evidence-2026-03-15.pdf
Step 5: Add deployment gates to CI/CD
The final piece is preventing new problems from entering the cluster. Deployment gates validate your manifests before they are applied.
Local validation
Start by running gates locally against your manifest files:
$ kubeqa gate run k8s/manifests/ --fail-on high
Scanning 8 manifests...
CRITICAL   deployment/api:      no resource limits
HIGH       deployment/worker:   image tag is 'latest'
WARNING    deployment/frontend: no PDB defined
INFO       service/api:         annotation missing
Gate: FAILED (1 critical, 1 high)
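The `--fail-on` flag is a severity threshold: the gate fails if any finding is at or above it. Assuming the ordering info < warning < high < critical (an assumption; the scale is not spelled out above), the logic is a one-liner:

```python
# Sketch of --fail-on semantics. The severity ordering is an assumption
# based on the levels shown in the gate output.
SEVERITIES = ["info", "warning", "high", "critical"]

def gate_passes(finding_severities: list[str], fail_on: str) -> bool:
    threshold = SEVERITIES.index(fail_on)
    return all(SEVERITIES.index(s) < threshold for s in finding_severities)

# The scan above found one critical and one high finding:
print(gate_passes(["critical", "high", "warning", "info"], "high"))  # False
print(gate_passes(["warning", "info"], "high"))                      # True
```

This is why the run above reports FAILED: with `--fail-on high`, one critical and one high finding each independently trip the gate.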
Create a gate configuration
Define your organization’s standards in a gate config file:
# .kubeqa/gate-config.yaml
gate:
  fail-on: high
  scan-images: true
  compliance: cis-1.8
policies:
  - name: require-limits
    severity: critical
    match:
      kind: Deployment
      namespace: production
    check:
      containers:
        - resources.limits.cpu: required
        - resources.limits.memory: required
  - name: pin-image-tags
    severity: high
    match:
      kind: [Deployment, StatefulSet]
    check:
      containers:
        - image: "!*:latest"
  - name: require-probes
    severity: warning
    match:
      kind: Deployment
    check:
      containers:
        - readinessProbe: required
        - livenessProbe: required
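Reading the pin-image-tags pattern `"!*:latest"` as "reject any image matching `*:latest`" (the negation semantics are an assumption about kubeqa's pattern syntax, not documented behavior), the check could be sketched with glob matching:

```python
import fnmatch

# Sketch of the pin-image-tags check. A leading "!" is interpreted here
# as negation: the image is allowed only if it does NOT match the rest
# of the pattern. This interpretation is an assumption.
def image_allowed(image: str, pattern: str = "!*:latest") -> bool:
    if pattern.startswith("!"):
        return not fnmatch.fnmatch(image, pattern[1:])
    return fnmatch.fnmatch(image, pattern)

print(image_allowed("nginx:1.27"))    # True  -- pinned tag
print(image_allowed("nginx:latest"))  # False -- floating tag, gate finding
```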
GitHub Actions integration
# .github/workflows/deploy.yaml
name: Deploy
on:
  push:
    branches: [main]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: kubeqa gate
        uses: nomadx-ae/kubeqa-action@v1
        with:
          command: gate run
          manifests: k8s/
          config: .kubeqa/gate-config.yaml
  deploy:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        run: kubectl apply -f k8s/ --namespace production
The gate runs in under 30 seconds for most repositories. Image scanning adds 1-2 minutes depending on the number of unique images.
Step 6: Create a unified kubeqa configuration
Tie everything together with a single configuration file that defines your complete Kubernetes QA strategy:
# .kubeqa/config.yaml
cluster:
  name: production
  namespaces:
    include: [production, staging]
    exclude: [kube-system, kube-public]
health:
  schedule: "0 */6 * * *"  # Every 6 hours
  alert-on: score<70
  weights:
    security: 2.0
    availability: 1.5
    resources: 1.0
chaos:
  experiments: .kubeqa/chaos-experiments.yaml
  schedule: weekly
  safety:
    max-concurrent: 1
    production-approval: true
compliance:
  frameworks: [cis-1.8, soc2]
  schedule: daily
  baseline-drift: true
  alert-on: regression
gate:
  fail-on: high
  scan-images: true
  compliance: cis-1.8
  policies: .kubeqa/policies/
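The `alert-on` fields above use small comparison expressions like `score<70` and `error-rate>25%`. One way such expressions could be evaluated (a sketch; the actual grammar kubeqa accepts is not documented here):

```python
import re

# Sketch of parsing alert-on / abort-on expressions such as "score<70"
# or "error-rate>25%". The grammar is an assumption for illustration.
def triggered(expr: str, value: float) -> bool:
    m = re.fullmatch(r"([\w-]+)\s*([<>])\s*(\d+(?:\.\d+)?)%?", expr)
    if not m:
        raise ValueError(f"bad expression: {expr}")
    _, op, threshold = m.groups()
    return value < float(threshold) if op == "<" else value > float(threshold)

print(triggered("score<70", 65))        # True  -- health alert fires
print(triggered("error-rate>25%", 12))  # False -- chaos abort does not
```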
Run everything at once:
$ kubeqa run --config .kubeqa/config.yaml
── Health Scan ──────────────────
Cluster score: 82/100 (↑ from 72)
── Compliance Audit ─────────────
CIS 1.8: 126/142 pass (88.7%)
SOC 2: 42/45 pass (93.3%)
── Deployment Gate ──────────────
8 manifests validated, 0 failures
Overall: PASS
What to do next
You now have a complete Kubernetes QA workflow. Here are the recommended next steps:
- Run kubeqa health scan on your production cluster and fix the top five findings
- Add kubeqa gate run to your CI/CD pipeline in audit mode first, then enable blocking
- Schedule a monthly game day with kubeqa chaos suite run to test resilience
- Set up daily compliance audits with drift detection for your target framework
- Track your scores over time to measure improvement
kubeqa is free and open source under the Apache 2.0 license. The CLI will always be free. For multi-cluster dashboards, team collaboration, and enterprise features, check out kubeqa Cloud.
Install kubeqa now with brew install nomadx-ae/tap/kubeqa and run your first scan in under five minutes. Star the project on GitHub, join the kubeqa Discord, and help us build the standard for Kubernetes quality assurance.