# Kubernetes Infrastructure Monitoring
This directory contains configurations for deploying Kubernetes infrastructure monitoring components that integrate with SigNoz.
## Components
| Component | Purpose | Metrics Endpoint |
|-----------|---------|------------------|
| **kube-state-metrics** | Kubernetes object metrics (pods, deployments, nodes, etc.) | `:8080/metrics` |
| **node-exporter** | Host-level metrics (CPU, memory, disk, network) | `:9100/metrics` |
## Quick Start (MicroK8s Production)
```bash
# 1. Deploy infrastructure monitoring components
./deploy-k8s-infra-monitoring.sh --microk8s install
# 2. Upgrade SigNoz to scrape the new metrics
microk8s helm3 upgrade signoz signoz/signoz \
  -n bakery-ia \
  -f ../signoz/signoz-values-prod.yaml
```
## Usage
### Install
```bash
# Standard Kubernetes
./deploy-k8s-infra-monitoring.sh install
# MicroK8s
./deploy-k8s-infra-monitoring.sh --microk8s install
```
### Upgrade
```bash
./deploy-k8s-infra-monitoring.sh --microk8s upgrade
```
### Uninstall
```bash
./deploy-k8s-infra-monitoring.sh --microk8s uninstall
```
### Check Status
```bash
./deploy-k8s-infra-monitoring.sh --microk8s status
```
### Dry Run
```bash
./deploy-k8s-infra-monitoring.sh --microk8s --dry-run install
```
## Files
- `kube-state-metrics-values.yaml` - Helm values for kube-state-metrics
- `node-exporter-values.yaml` - Helm values for node-exporter
- `deploy-k8s-infra-monitoring.sh` - Deployment automation script
## SigNoz Integration
The SigNoz OTel Collector is configured (in `signoz-values-prod.yaml`) to scrape metrics from:
- `kube-state-metrics.bakery-ia.svc.cluster.local:8080`
- `node-exporter-prometheus-node-exporter.bakery-ia.svc.cluster.local:9100`
After deploying these components and upgrading SigNoz with the updated values file, metrics should appear in SigNoz under:
- **Infrastructure** > **Kubernetes** (for K8s object metrics)
- **Infrastructure** > **Hosts** (for node metrics)
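
The two scrape targets above follow the usual `<service>.<namespace>.svc.cluster.local:<port>` pattern. As a minimal sketch, the helper below rebuilds those URLs so they can be probed from inside the cluster. The function names `scrape_url` and `list_scrape_urls` are hypothetical and not part of this repository, and the `/metrics` path is taken from the Components table.

```bash
# Build the in-cluster metrics URL for a Service.
# Arguments: $1 = service name, $2 = namespace, $3 = port.
# (Hypothetical helper, not part of deploy-k8s-infra-monitoring.sh.)
scrape_url() {
  printf 'http://%s.%s.svc.cluster.local:%s/metrics\n' "$1" "$2" "$3"
}

# Print the two targets listed in this section, one per line.
list_scrape_urls() {
  scrape_url kube-state-metrics bakery-ia 8080
  scrape_url node-exporter-prometheus-node-exporter bakery-ia 9100
}
```

One possible way to use it, assuming a throwaway pod named `scrape-check` and the public `curlimages/curl` image are acceptable in your cluster: `microk8s kubectl run scrape-check --rm -i --restart=Never --image=curlimages/curl -n bakery-ia -- curl -s "$(scrape_url kube-state-metrics bakery-ia 8080)"`.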
## Metrics Available
### From kube-state-metrics
- Pod status, phase, restarts
- Deployment replicas (desired vs available)
- Node conditions and capacity
- PVC status and capacity
- Resource requests and limits
- Job/CronJob status
### From node-exporter
- CPU usage per core
- Memory usage (total, free, cached)
- Disk I/O and space
- Network traffic (bytes in/out)
- System load average
- Filesystem usage
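
To see which of the metric families listed above an endpoint actually exposes, the sketch below counts samples per metric name in Prometheus text-format output. The function name `count_metric_samples` is hypothetical, and the metric names mentioned in the usage note (`kube_pod_status_phase`, `node_load1`) are examples only; the exact set depends on the exporter versions deployed.

```bash
# Count samples per metric name in Prometheus text-format output read from stdin.
# Comment lines (# HELP, # TYPE) and blank lines are skipped; label sets are stripped.
# (Hypothetical helper, not part of this repository.)
count_metric_samples() {
  awk '
    /^#/ || NF == 0 { next }
    {
      name = $1
      sub(/[{].*/, "", name)   # drop the label set, keeping only the metric name
      count[name]++
    }
    END { for (n in count) print n, count[n] }
  ' | sort
}
```

With a port-forward to one of the exporters running, something like `curl -s localhost:8080/metrics | count_metric_samples | grep '^kube_pod_'` narrows the output to pod-related series.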
## Troubleshooting
### Check if metrics are being scraped
```bash
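# Port-forward to kube-state-metrics (runs in the background)
microk8s kubectl port-forward svc/kube-state-metrics 8080:8080 -n bakery-ia &
sleep 2    # give the port-forward a moment to establish
curl -s localhost:8080/metrics | head -50
kill $!    # stop the background port-forward

# Port-forward to node-exporter (runs in the background)
microk8s kubectl port-forward svc/node-exporter-prometheus-node-exporter 9100:9100 -n bakery-ia &
sleep 2
curl -s localhost:9100/metrics | head -50
kill $!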
```
### Check OTel Collector logs
```bash
microk8s kubectl logs -l app.kubernetes.io/name=signoz-otel-collector -n bakery-ia --tail=100
```
### Verify pods are running
```bash
microk8s kubectl get pods -n bakery-ia | grep -E "(kube-state|node-exporter)"
```
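
The `grep` above lists every matching pod regardless of state. As a rough sketch, the filter below prints only the monitoring pods whose STATUS column is not `Running`. It assumes the default `kubectl get pods` table layout (NAME, READY, STATUS, RESTARTS, AGE), and the function name `unhealthy_monitoring_pods` is hypothetical.

```bash
# Print monitoring pods whose STATUS column is not "Running".
# Reads `kubectl get pods` table output from stdin; the header row is ignored
# because its first column ("NAME") does not match the pod-name pattern.
# (Hypothetical helper, not part of this repository.)
unhealthy_monitoring_pods() {
  awk '$1 ~ /kube-state|node-exporter/ && $3 != "Running" { print $1, $3 }'
}
```

Usage: `microk8s kubectl get pods -n bakery-ia | unhealthy_monitoring_pods`. Empty output means all matching pods report `Running`. A pod can be `Running` without being ready, so the READY column is still worth checking separately.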