Traces
K8GB can create and send spans/traces either to Jaeger, which is also supported on the Helm chart level, or to any other OTEL-compliant solution. We don't recommend sending the tracing data directly to the tracing backend, because of possible vendor lock-in. The OTEL collector can be deployed as a sidecar container for the k8gb pod and forward all the traces to a configurable sink, or even multiple sinks for redundancy.
Architecture
We are not opinionated about the OpenTracing vendor. In the following diagram, X can be Jaeger, Zipkin, LightStep, Grafana's Tempo, Instana, etc. It can also be another OTEL collector, forming a more sophisticated pipeline setup.
Sidecar use-case:
+--------------+             +------------+            +----------+
|     k8gb     |             |     X      |    http    |   User   |
| ------------ |    otlp     |            +----------->|          |
| OTEL sidecar +------------>|            |            |          |
+--------------+             +------------+            +----------+
Deployment
By default, tracing is disabled and no sidecar container is created during the k8gb deployment. To enable tracing, set tracing.enabled=true in the Helm chart. This creates the sidecar container for the k8gb deployment and tweaks a couple of env vars there. It also creates the configmap for the OTEL sidecar. The configuration of the OTEL collector can be overridden via tracing.otelConfig.
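To illustrate what such a collector configuration can look like, here is a minimal sketch of an OTEL collector pipeline that receives OTLP from k8gb and fans the traces out to two sinks for redundancy. This is not the chart's default config; the exporter names, endpoints and ports are assumptions and depend on your collector version and backends.

receivers:
  otlp:
    protocols:
      grpc:                                # the sidecar receives spans from k8gb over OTLP/gRPC
exporters:
  otlp/jaeger:
    endpoint: jaeger-collector:4317        # example endpoint, adjust to your Jaeger installation
    tls:
      insecure: true
  otlp/backup:
    endpoint: secondary-backend:4317       # example second sink for redundancy
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger, otlp/backup]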
If you need something quickly up and running, make sure that tracing.deployJaeger is also set to true. In this scenario, you will also end up with Jaeger deployed, together with a service for it. To access it, you can continue with:
kubectl port-forward svc/jaeger-collector 16686
open http://localhost:16686
Both the sidecar container image and the Jaeger deployment's container image can be tweaked by tracing.{sidecarImage,jaegerImage}.{repository,tag,pullPolicy}.
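Putting these options together, a Helm values fragment could look like the following sketch. The key names are the ones mentioned above; the image repositories and tags are examples only and should be checked against the chart's values.yaml for your k8gb version.

tracing:
  enabled: true                                # create the OTEL sidecar for the k8gb deployment
  deployJaeger: true                           # also deploy Jaeger and its service (quick start)
  sidecarImage:
    repository: otel/opentelemetry-collector   # example repository
    tag: 0.90.0                                # example tag
    pullPolicy: IfNotPresent
  jaegerImage:
    repository: jaegertracing/all-in-one       # example repository
    tag: "1.52"                                # example tag
    pullPolicy: IfNotPresent
  # otelConfig: {}                             # optionally override the generated OTEL collector config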
Custom Architecture
In case you already have an OTEL collector present in the Kubernetes cluster and you do not want to introduce a new one, you can also deploy the following topology:
+----------+           +----------------+          +--------+           +----------+
|   k8gb   |           | OTEL collector |   otlp   |   X    |   http    |   User   |
|          |   otlp    |                +--------->|        +---------->|          |
|          +---------->|                |          |        |           |          |
+----------+           +----------------+          +--------+           +----------+
However, we don't support this use-case on the Helm chart level, so you are on your own with the setup. Nonetheless, it should be relatively straightforward. All you have to do is set the following env vars on the k8gb deployment (see the sketch after this list):
TRACING_ENABLED (set it to true)
OTEL_EXPORTER_OTLP_ENDPOINT (host:port of the OTEL collector from the ASCII diagram)
TRACING_SAMPLING_RATIO (optional, a float representing the ratio of how many traces are collected)
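A minimal sketch of how these env vars could be wired into the k8gb Deployment manifest follows; the env var names come from the list above, while the endpoint and ratio values are purely illustrative.

containers:
  - name: k8gb
    env:
      - name: TRACING_ENABLED
        value: "true"
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: "otel-collector.observability:4317"   # example host:port of the existing collector
      - name: TRACING_SAMPLING_RATIO
        value: "0.5"                                 # optional sampling ratio (example value)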
Distributed Tracing & Context Propagation
K8gb is a k8s controller/operator and thus by nature an event-based system, where a request is invoked by creating a custom resource (gslb) or an Ingress with a certain annotation. In the more traditional micro-service world, context propagation between traced systems is done using HTTP headers. Provided that the communicating client is also instrumented with OpenTracing bits, one doesn't have to call Extract and Inject on one's own. However, for the CRD space nothing has been introduced so far, and having the tracing key-value metadata stored in each custom resource would be overkill here, not to mention the increase in the overall complexity of such a system.
If k8gb had a REST API, on the other hand, this would be very low-hanging fruit.
As for propagating the context down into the calls that k8gb makes, this may make sense for direct HTTP calls to external systems such as Infoblox or Route53. Provided that those systems also support OpenTracing and have the context propagation done right on their part, one could get full insight into the requests and examine what takes most of the time or where an issue happened. As for the communication with ExternalDNS, it is again on the "CRD level", so hard to achieve, plus the ExternalDNS operator is not traced.