# metrics stack
Opinionated manifests for deploying kube-prometheus-stack (Prometheus Operator + Grafana) together with a VictoriaMetrics single-node database in the `metrics` namespace.
## Install / upgrade
```sh
kubectl apply -f metrics/namespace.yaml
# kube-prometheus-stack
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace metrics \
  --values metrics/kube-prometheus-stack-values.yaml
# print the Grafana admin password (echo adds the trailing newline)
kubectl --namespace metrics get secret kube-prometheus-stack-grafana \
  -o jsonpath="{.data.admin-password}" | base64 -d
echo
# expose grafana via Traefik
kubectl apply -f metrics/grafana-ingress.yaml
kubectl -n metrics get ingress grafana
# victoria metrics for long-term storage
helm repo add victoria-metrics https://victoriametrics.github.io/helm-charts
helm upgrade --install victoria-metrics-single victoria-metrics/victoria-metrics-single \
  --namespace metrics \
  --values metrics/victoria-metrics-single-values.yaml
# expose victoria metrics via ClusterIP for Prometheus/Grafana
kubectl apply -f metrics/victoria-metrics-service.yaml
```
The manifests default to the Yandex Managed Kubernetes dynamic storage class `yc-network-hdd`; tweak the `storageClassName`/`storageClass` fields and capacities if you prefer something else.
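To see which storage classes a cluster actually offers before editing the values files, something like the following sketch may help. Both function names are hypothetical and not part of this repository; only standard `kubectl` calls are used.

```sh
# Hypothetical helper (not part of this repo): list the storage classes
# the cluster offers, so you can pick a replacement for yc-network-hdd.
list_storage_classes() {
  kubectl get storageclass
}

# Hypothetical helper: show the claims in the metrics namespace together
# with the storage class and size each one requested.
list_metrics_pvcs() {
  kubectl --namespace metrics get pvc \
    -o custom-columns=NAME:.metadata.name,CLASS:.spec.storageClassName,SIZE:.spec.resources.requests.storage
}
```

Run `list_storage_classes` before installing and `list_metrics_pvcs` afterwards to confirm the claims picked up the class you intended.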
Before applying `metrics/grafana-ingress.yaml`, update the host (`grafana.playground.t01tt.tech`) and, if needed, change the `cert-manager.io/cluster-issuer` annotation to match your staging/production workflow. The ingress uses the `traefik` ingress class.
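If you would rather not edit the manifest in place, the host swap can be sketched as a small filter. The function name and the `grafana.example.com` host are placeholders, and the sketch assumes the default host appears literally in the manifest.

```sh
# Hypothetical helper (not part of this repo): print the ingress manifest
# with the default host replaced by the one you pass in.
# Usage: render_grafana_ingress <new-host> [manifest-path]
render_grafana_ingress() {
  new_host=$1
  manifest=${2:-metrics/grafana-ingress.yaml}
  # Dots are escaped so they match literally rather than as regex wildcards.
  sed "s/grafana\.playground\.t01tt\.tech/${new_host}/g" "$manifest"
}

# Example (grafana.example.com is a placeholder):
#   render_grafana_ingress grafana.example.com | kubectl apply -f -
```

The file on disk stays untouched, which keeps the working tree clean when the host differs per environment.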
## Components
- **Prometheus Operator** provisions Prometheus, Alertmanager and related CRDs. Remote write targets VictoriaMetrics for durable retention.
- **Grafana** is pre-provisioned with persistence enabled and a secondary data source pointing at VictoriaMetrics.
- **VictoriaMetrics** stores metrics for long-term retention while also serving query traffic for Grafana. A dedicated ClusterIP service (`metrics/victoria-metrics-service.yaml`) exposes port 8428 for Prometheus remote write and Grafana queries.
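
To confirm that VictoriaMetrics is answering queries on port 8428, a quick check through its Prometheus-compatible `/api/v1/query` endpoint could look like the sketch below. The function name is hypothetical, and the service name in the example is a placeholder because it depends on `metrics/victoria-metrics-service.yaml`.

```sh
# Hypothetical helper (not part of this repo): run an instant query against
# VictoriaMetrics through its Prometheus-compatible HTTP API.
# Usage: vm_query <base-url> <promql>
vm_query() {
  curl --silent --show-error --get \
    --data-urlencode "query=$2" \
    "$1/api/v1/query"
}

# Example, with a port-forward running in another terminal. The service name
# is a placeholder; take the real one from `kubectl -n metrics get svc`.
#   kubectl -n metrics port-forward svc/<victoria-metrics-service> 8428:8428
#   vm_query http://127.0.0.1:8428 'up'
```

A JSON response with `"status":"success"` and a non-empty result suggests that remote write data is arriving.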
## Database choices
Prometheus ships with an embedded TSDB. For longer retention, clustering, or multi-tenant needs, you can offload data to:
- **VictoriaMetrics** (single, clustered, or managed): cost-efficient, Prometheus-compatible, supports multi-year retention.
- **Thanos / Cortex / Grafana Mimir**: horizontally scalable, object-storage-backed TSDBs with multi-cluster federation.
- **ClickHouse / TimescaleDB / PostgreSQL**: SQL stores for advanced analytics (requires Promscale or a similar adapter; note that Timescale has discontinued Promscale).
- **Graphite / InfluxDB**: legacy or streaming-friendly stores; integrate via remote write adapters.
Pick the backend that matches your retention and query latency requirements. Remote write configuration lives under `prometheus.prometheusSpec.remoteWrite` in `kube-prometheus-stack-values.yaml`.
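After changing the remote write settings, it can be useful to check what the operator actually rendered into the Prometheus object. The sketch below reads the URLs back from the CRD; the function name is hypothetical. For a single-node VictoriaMetrics target the URL would normally end in `:8428/api/v1/write`.

```sh
# Hypothetical helper (not part of this repo): print the remote write URLs
# carried by the operator-managed Prometheus objects in a namespace.
# Usage: show_remote_write_urls [namespace]
show_remote_write_urls() {
  kubectl --namespace "${1:-metrics}" get prometheus \
    -o jsonpath='{.items[*].spec.remoteWrite[*].url}'
}
```

An empty result means no remote write target is configured, in which case the values file is the first place to look.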
## Post-install checks
```sh
kubectl -n metrics get pods
kubectl -n metrics get svc
# -A lists across all namespaces; it overrides -n, so -n is omitted here
kubectl get prometheus,prometheusrules,servicemonitors -A
```
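For scripted installs, the manual checks above can be complemented by a blocking wait. This is a minimal sketch with a hypothetical function name; the namespace and timeout defaults are assumptions.

```sh
# Hypothetical helper (not part of this repo): block until every pod in the
# namespace reports Ready, or fail once the timeout expires.
# Usage: wait_for_pods [namespace] [timeout]
wait_for_pods() {
  kubectl --namespace "${1:-metrics}" wait pods --all \
    --for=condition=Ready --timeout="${2:-300s}"
}
```

Pods left behind by completed Jobs never become Ready, so the call may time out if any of those linger in the namespace.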