SPIRL Agent Metrics
This guide covers metrics collection, configuration, and monitoring for SPIRL Agents. Agents expose Prometheus-compatible metrics for workload identity delivery, control plane connectivity, and resource utilization.
Enabling Metrics
Agents expose metrics on a configurable listen address (default: :9090). Metrics are disabled by default to reduce overhead when not used.
For Helm installations, enable metrics in your Helm values file:
telemetry:
  enabled: true
  collectors:
    grpc:
      emmitLatencyMetrics: false # Keep disabled unless debugging
  metricsAPI:
    listenAddr: ":9090"
health:
  listenAddr: ":8080" # Health check endpoint (optional)
Apply the configuration:
helm upgrade --install spirl-system \
oci://ghcr.io/spirl/charts/spirl-system \
--values agent-values.yaml
For Linux installations, enable metrics via command-line flags:
--telemetry-metrics-api-listen-addr=":9090" \
--telemetry-enable-grpc-latency-monitoring=false \
--health-listen-addr=":8080"
Or set the equivalent environment variables (flag names in uppercase, with dashes replaced by underscores):
TELEMETRY_METRICS_API_LISTEN_ADDR=":9090"
TELEMETRY_ENABLE_GRPC_LATENCY_MONITORING=false
HEALTH_LISTEN_ADDR=":8080"
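The naming rule above can be sketched as a small shell helper. This is only an illustration: flag_to_env is a hypothetical name, not something shipped with the agent, and it assumes every flag maps to its environment variable purely by uppercasing and replacing dashes with underscores, as the three examples above suggest.

```shell
# Hypothetical helper (not part of the agent): derive the environment
# variable name for a command-line flag by dropping a leading "--",
# uppercasing the name, and replacing dashes with underscores.
flag_to_env() {
  printf '%s\n' "${1#--}" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g'
}

# Usage (pass the flag name only, without "=value"):
#   flag_to_env --health-listen-addr
```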
Or use a configuration file:
telemetry-metrics-api-listen-addr: ":9090"
telemetry-enable-grpc-latency-monitoring: false
health-listen-addr: ":8080"
Then start the agent with:
--config-file-path=/path/to/agent-config.yaml
# Or via environment variable:
# CONFIG_FILE_PATH=/path/to/agent-config.yaml
Verifying Metrics Endpoint
For Helm installations, test that metrics are accessible:
# Locate an agent pod
kubectl -n spirl-system get po -l app=spirl-agent
# Port-forward to an agent pod (replace xxxxx with a pod name from above)
# The port-forward blocks the shell; press Ctrl+C to end the session
kubectl port-forward -n spirl-system spirl-agent-xxxxx 9090:9090
# In a separate shell, query the metrics endpoint
curl http://localhost:9090/metrics
Example:
> kubectl -n spirl-system get po -l app=spirl-agent
NAME READY STATUS RESTARTS AGE
spirl-agent-cw78f 1/1 Running 0 2d2h
> kubectl port-forward -n spirl-system spirl-agent-cw78f 9090:9090
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
> curl http://localhost:9090/metrics
# HELP go_gc_duration_seconds A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000582792
go_gc_duration_seconds{quantile="0.25"} 0.00085675
go_gc_duration_seconds{quantile="0.5"} 0.002005745
go_gc_duration_seconds{quantile="0.75"} 0.088068597
go_gc_duration_seconds{quantile="1"} 2.609688464
go_gc_duration_seconds_sum 25.650164805
go_gc_duration_seconds_count 497
...
For Linux installations, test that metrics are accessible locally:
# Query the metrics endpoint directly
curl http://localhost:9090/metrics
If the agent is running on a remote host, you can query it via SSH:
# Query metrics on remote host
ssh user@agent-host 'curl http://localhost:9090/metrics'
Example output:
> curl http://localhost:9090/metrics
# HELP go_gc_duration_seconds A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000582792
go_gc_duration_seconds{quantile="0.25"} 0.00085675
go_gc_duration_seconds{quantile="0.5"} 0.002005745
go_gc_duration_seconds{quantile="0.75"} 0.088068597
go_gc_duration_seconds{quantile="1"} 2.609688464
go_gc_duration_seconds_sum 25.650164805
go_gc_duration_seconds_count 497
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
...
If the agent listens on an address that is reachable from other hosts (e.g., 0.0.0.0:9090), you can also query metrics remotely without SSH. Ensure firewall rules allow access to the metrics port.
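For scripted checks, such as a deployment smoke test, the manual curl step could be wrapped in a small helper. The sketch below is an illustration only: has_metric is a hypothetical function name, and it checks nothing more than that at least one sample line for the given metric appears in the Prometheus text output read from standard input.

```shell
# Hypothetical helper: succeed if the metrics text on stdin contains at least
# one sample for the metric named in "$1". Comment lines (# HELP / # TYPE) do
# not match because the pattern is anchored at the start of the line, and the
# name must be followed by "{" (labels) or a space (value).
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Usage (assumes the metrics endpoint from this guide is reachable):
#   curl -s http://localhost:9090/metrics | has_metric go_goroutines && echo "metrics OK"
```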
Key Metrics to Monitor
Control Plane Connectivity
- grpc_client_handled_total - gRPC client requests to server
- grpc_client_handling_seconds - Client request latency
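To get a quick view of control plane connectivity without a Prometheus server, the request counter can be summarized straight from the endpoint output. The sketch below assumes the agent labels grpc_client_handled_total with grpc_code, as the common Go gRPC Prometheus interceptors do; this guide does not confirm the label set, so check your own output first. grpc_totals_by_code is a hypothetical helper name.

```shell
# Hypothetical helper: read Prometheus text output on stdin and print the sum
# of grpc_client_handled_total per grpc_code value, one "CODE TOTAL" per line.
# Assumption: samples carry a grpc_code label; samples without one are
# counted under "unknown".
grpc_totals_by_code() {
  awk '
    /^grpc_client_handled_total\{/ {
      code = "unknown"
      if (match($0, /grpc_code="[^"]*"/)) {
        # Skip the 11 characters of grpc_code=" and drop the closing quote
        code = substr($0, RSTART + 11, RLENGTH - 12)
      }
      totals[code] += $NF   # the sample value is the last field
    }
    END { for (c in totals) printf "%s %d\n", c, totals[c] }
  ' | sort
}

# Usage (assumes the metrics endpoint from this guide is reachable):
#   curl -s http://localhost:9090/metrics | grpc_totals_by_code
```

A steadily growing total for any code other than OK would be a reasonable starting point for investigating connectivity problems.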
Resource Utilization
- go_memstats_alloc_bytes - Current memory allocation
- go_goroutines - Number of goroutines
- process_cpu_seconds_total - CPU time
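For a one-off look at resource usage, a single sample can be pulled out of the endpoint output. This is a minimal sketch, assuming the metric is exposed without labels (as go_goroutines is in the example output earlier in this guide); metric_value is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of the first unlabelled sample whose
# name equals "$1", reading Prometheus text output from stdin. Comment lines
# are skipped because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage (assumes the metrics endpoint from this guide is reachable):
#   curl -s http://localhost:9090/metrics | metric_value go_goroutines
```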
Kubernetes Runtime
See Kubernetes Metrics for guidance on monitoring the Kubernetes runtime for issues.
Troubleshooting Agent Metrics
Metrics Endpoint Not Accessible
Verify telemetry is configured:
# Check for telemetry configuration in pod args
kubectl get pods -n spirl-system -l app=spirl-agent -o jsonpath='{.items[0].spec.containers[0].args}' | grep telemetry-metrics-api-listen-addr
Test the endpoint directly:
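# Port-forward in one shell (this blocks until you press Ctrl+C)
kubectl port-forward -n spirl-system <agent-pod-name> 9090:9090
# In a separate shell, query the metrics endpoint
curl http://localhost:9090/metrics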
Next Steps
- Server Metrics - Configure metrics for Trust Domain Servers
- Review All Metrics - Complete metrics reference