AI-Powered Kubernetes Troubleshooting with K8sGPT

Kubernetes clusters are complex, generating vast amounts of logs and events that are difficult to parse manually. K8sGPT empowers SREs and DevOps engineers by automating root-cause analysis with AI, considerably reducing mean time to resolution (MTTR).

Prerequisites:

  • Kubernetes cluster
  • Helm 3.x or 4.x
  • kubectl configured

Deploy Ollama LLM in Kubernetes

Create Namespace and Storage

kubectl create namespace ollama

Apply:

kubectl apply -f ollama/ollama-pvc.yaml
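
The ollama/ollama-pvc.yaml manifest is not reproduced in this post. A minimal sketch of such a claim, assuming the name ollama-pvc and 10Gi on the cluster's default StorageClass, might look like:

# ollama-pvc.yaml (sketch; name and size are assumptions)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc
  namespace: ollama
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi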

Deploy Ollama

kubectl apply -f ollama/ollama-deployment.yaml

# Wait for pod to be ready
kubectl wait --for=condition=ready pod -l app=ollama -n ollama --timeout=300s
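
Likewise, ollama/ollama-deployment.yaml is not shown here. A sketch of a matching Deployment and Service, assuming the app=ollama label the wait command relies on, the ollama-pvc claim from the previous step, and the upstream ollama/ollama image serving on its default port 11434:

# ollama-deployment.yaml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          volumeMounts:
            # Persist downloaded models across pod restarts
            - name: models
              mountPath: /root/.ollama
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: ollama-pvc
---
# Service named "ollama" so it resolves as ollama.ollama.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434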

Load Model into Ollama

# Get pod name
POD_NAME=$(kubectl get pod -n ollama -l app=ollama -o jsonpath='{.items[0].metadata.name}')

# Pull the gemma3:4b model into Ollama
kubectl exec -n ollama $POD_NAME -- ollama pull gemma3:4b

# Verify model is loaded
kubectl exec -n ollama $POD_NAME -- ollama list

Test Ollama Service

kubectl port-forward -n ollama svc/ollama 11434:11434

In another terminal, test the API:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:4b",
  "prompt": "Explain crewai in one sentence",
  "stream": false
}'
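
With "stream": false, Ollama returns a single JSON object whose response field holds the generated text. If you have jq installed, you can extract just that field:

curl -s http://localhost:11434/api/generate -d '{
  "model": "gemma3:4b",
  "prompt": "Explain crewai in one sentence",
  "stream": false
}' | jq -r .response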

Install K8sGPT CLI

# Using curl
curl -sSfL https://raw.githubusercontent.com/k8sgpt-ai/k8sgpt/main/install.sh | sh

# Confirm the CLI is available
k8sgpt --help

Configure K8sGPT to Use the LocalAI Backend with Ollama

# Register the LocalAI backend, pointing at the port-forwarded Ollama API
k8sgpt auth add --backend localai --model gemma3:4b --baseurl http://localhost:11434/v1

# Make localai the default backend
k8sgpt auth default -p localai

# Confirm the configuration
k8sgpt auth list

Deploy a Broken Pod

# The image name "ngix" is intentionally misspelled so the pod fails with ImagePullBackOff
kubectl run webapp --image=ngix:latest
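
Before running the analysis, confirm the pod is actually failing; the STATUS column should show ErrImagePull or ImagePullBackOff:

kubectl get pod webapp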

Analyze Your Cluster Using K8sGPT

k8sgpt analyze --explain --backend localai --namespace default --filter Pod
k8sgpt analyze --explain --backend localai --namespace default --filter Service

Deploy K8sGPT Operator

Install the kube-prometheus-stack Helm Chart to use Grafana and Prometheus.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace --version 79.8.2

# Wait for all monitoring pods to be ready
kubectl get pods -n monitoring -w

Install K8sGPT Operator via Helm

# Add Helm repository
helm repo add k8sgpt https://charts.k8sgpt.ai/
helm repo update
helm install release k8sgpt/k8sgpt-operator -n k8sgpt-operator-system --create-namespace -f k8sgpt-values.yaml

# Verify installation
kubectl get pods -n k8sgpt-operator-system
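
The k8sgpt-values.yaml passed to helm install above is not shown in the post. Assuming the operator chart's documented Prometheus integration switches, a minimal sketch that exposes K8sGPT metrics and the bundled dashboard to the kube-prometheus-stack deployed earlier might be:

# k8sgpt-values.yaml (sketch; keys per the k8sgpt-operator chart's documented options)
serviceMonitor:
  enabled: true
grafanaDashboard:
  enabled: true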

Configure K8sGPT to Use Local Ollama

# Write the K8sGPT custom resource to k8sgpt-config.yaml
cat <<EOF > k8sgpt-config.yaml
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-local
  namespace: k8sgpt-operator-system
spec:
  ai:
    enabled: true
    backend: localai
    model: gemma3:4b
    baseUrl: http://ollama.ollama.svc.cluster.local:11434/v1
  noCache: false
  repository: ghcr.io/k8sgpt-ai/k8sgpt
  version: v0.4.1
EOF

Apply the configuration:

kubectl apply -f k8sgpt-config.yaml

Check K8sGPT status

kubectl get k8sgpt -n k8sgpt-operator-system
kubectl describe k8sgpt k8sgpt-local -n k8sgpt-operator-system
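
Once the resource is reconciled, the operator also starts a k8sgpt workload in the same namespace to run the periodic scans; you can confirm it is running alongside the controller:

kubectl get deployments,pods -n k8sgpt-operator-system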

View K8sGPT Results

# Get results as custom resources
kubectl get results -n k8sgpt-operator-system

# View detailed analysis
kubectl get results -n k8sgpt-operator-system -o yaml

# Watch for new results
kubectl get results -n k8sgpt-operator-system -w
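
To read only the AI explanations, you can select a single field with jsonpath. This assumes the operator stores the explanation under spec.details of each Result, as recent operator versions do:

# Print each result name with its AI explanation (assumes a spec.details field)
kubectl get results -n k8sgpt-operator-system -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.details}{"\n"}{end}'
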
# Access Prometheus
kubectl port-forward service/prometheus-kube-prometheus-prometheus -n monitoring 9090:9090

# Access Grafana
kubectl port-forward service/prometheus-grafana -n monitoring 3000:80
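
Grafana is then reachable at http://localhost:3000. With this chart and release name, the default user is admin and the password is stored in the prometheus-grafana secret:

kubectl get secret prometheus-grafana -n monitoring -o jsonpath="{.data.admin-password}" | base64 -d
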
This post is licensed under CC BY 4.0 by the author.