Apache Karaf Minho and OpenTelemetry

Telemetry is an important part of a software stack, especially on distributed system (where services call each others).
OpenTelemetry is a CNCF project coming from OpenTracing and OpenCensus. It provides API to easily instrument your code and tools to collect and export telemetry data (metrics, logs and traces).
Before going into how to use OpenTelemetry with Minho, we have to understand some key concepts:
  • Span is a single building block representing an operation or some unit of work that you want to capture. They are capable of standing on their own, referencing (or following) from other spans, storing metadata, tags, etc.
  • Trace is a visualisation of the life of a request (or series of other operations) that is made up of one or more spans. This collection of spans work together to paint a picture of the overall request, allowing you to easily reconstruct what happened within each span, etc.
  • SpanContext is a wrapper of key values pairs that associates a trace to one or more spans and is the key ingredient when carrying traces across data-boundaries (different systems, ..). This is the primary component that we store and work with in the context of a distributed system.

There are basically two ways to instrument your code:
  • automatic: you just add the OpenTelemetry agent to the Minho runtime and it will get our defined spans. You can use annotations to automatically add span on your methods/code.
  • manual: here you fully control the span, adding attributes and events you want, and you can also "control" the tracers and exporters. In Minho, you simply create a Service that bootstrap OpenTelemetry SDK.

In this blog post, you will see the two instrumentation ways in Minho and we will use Jaeger as visualization tool.
As usual, the blog source code is available on Github: https://github.com/jbonofre/minho-blog.

Automatic instrumentation with agent and annotation

To illustrate OpenTelemetry in Minho, we will create a simple servlet that call another Minho service in the runtime.
First, let's create a very simple Minho service called AutoInstrumentedService:

package net.nanthrax.blog.minho.telemetry.service;

import io.opentelemetry.instrumentation.annotations.SpanAttribute;
import io.opentelemetry.instrumentation.annotations.WithSpan;
import org.apache.karaf.minho.boot.spi.Service;

public class AutoInstrumentedService implements Service {

    @WithSpan
    public String processing(@SpanAttribute("message") String message) {
        return inner(message);
    }

    @WithSpan
    public String inner(@SpanAttribute("message") String message) {
        return message.toUpperCase();
    }

}


Here we create a Minho service with a method calling another one. We use OpenTelementry @WithSpan annotation. It automatically creates a span when our method is called. @SpanAttribute allows us to add an attribute to the span. The attributes will be part of the span and it's good practice to use attributes to add useful information to the span (method arguments, etc).
About the dependencies to add, we just need io.telemetry.instrumentation:opentelemetry-instrumentation-annotations artifact. Very soon, Minho will provide minho-opentelemetry service bringing all required dependencies and configuration service you need.
That's it !
Now to illusrtate telemetry with trace, let's add a simple servlet just calling our inner AutoInstrumentedService:

package net.nanthrax.blog.minho.telemetry.servlet;

import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import net.nanthrax.blog.minho.telemetry.service.AutoInstrumentedService;
import org.apache.karaf.minho.boot.Minho;
import org.apache.karaf.minho.boot.spi.Service;

import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Properties;

public class SimpleServlet extends HttpServlet implements Service {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
        AutoInstrumentedService innerService = Minho.getInstance().getServiceRegistry().get(AutoInstrumentedService.class);

        String message = innerService.processing("Hello World");

        response.setStatus(200);
        try (Writer writer = new OutputStreamWriter(response.getOutputStream())) {
            writer.write(message);
        }
    }

    @Override
    public Properties properties() {
        Properties properties = new Properties();
        properties.put("contextPath", "/simple");
        return properties;
    }

}


We can now start Minho runtime adding the OpenTelemetry agent:

$ java -javaagent:.m2/repository/io/opentelemetry/javaagent/opentelemetry-javaagent/1.22.1/opentelemetry-javaagent-1.22.1.jar -Dotel.service.name=minho-blog -Dotel.traces.exporter=logging -jar minho-boot.jar

Now, we access to http://localhost:8080/simple. In the Minho log, we can see:

[otel.javaagent 2023-02-08 10:47:29:799 +0100] [minho-http-26] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - 'AutoInstrumentedService.inner' : 76f9b61a046db41cae72e6a741d816f8 5631469972b9aed7 INTERNAL [tracer: io.opentelemetry.opentelemetry-instrumentation-annotations-1.16:1.22.1-alpha] AttributesMap{data={thread.id=26, message=Hello World, thread.name=minho-http-26, code.namespace=net.nanthrax.blog.minho.telemetry.service.AutoInstrumentedService, code.function=inner}, capacity=128, totalAddedValues=5}
[otel.javaagent 2023-02-08 10:47:29:799 +0100] [minho-http-26] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - 'AutoInstrumentedService.processing' : 76f9b61a046db41cae72e6a741d816f8 a80a3899dd69a014 INTERNAL [tracer: io.opentelemetry.opentelemetry-instrumentation-annotations-1.16:1.22.1-alpha] AttributesMap{data={thread.id=26, message=Hello World, thread.name=minho-http-26, code.namespace=net.nanthrax.blog.minho.telemetry.service.AutoInstrumentedService, code.function=processing}, capacity=128, totalAddedValues=5}
[otel.javaagent 2023-02-08 10:47:29:809 +0100] [minho-http-26] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - '/simple' : 76f9b61a046db41cae72e6a741d816f8 55f9dbaf9f721669 SERVER [tracer: io.opentelemetry.jetty-11.0:1.22.1-alpha] AttributesMap{data={net.sock.peer.port=65467, thread.id=26, http.method=GET, http.scheme=http, net.host.port=8080, net.sock.host.addr=[0:0:0:0:0:0:0:1], http.status_code=200, http.user_agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36, http.route=/simple, thread.name=minho-http-26, net.host.name=localhost, net.transport=ip_tcp, http.target=/simple, net.sock.peer.addr=[0:0:0:0:0:0:0:1], http.flavor=1.1}, capacity=128, totalAddedValues=15}
[otel.javaagent 2023-02-08 10:47:29:831 +0100] [minho-http-20] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - 'Response.sendError' : 531c4dcb80dedfafb7750596e27b9a6b af4588a54d57f75a INTERNAL [tracer: io.opentelemetry.servlet-5.0:1.22.1-alpha] AttributesMap{data={thread.id=20, thread.name=minho-http-20, code.namespace=org.eclipse.jetty.server.Response, code.function=sendError}, capacity=128, totalAddedValues=4}
[otel.javaagent 2023-02-08 10:47:29:834 +0100] [minho-http-20] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - '/*' : 531c4dcb80dedfafb7750596e27b9a6b c2773844fcd59c1d SERVER [tracer: io.opentelemetry.jetty-11.0:1.22.1-alpha] AttributesMap{data={net.sock.peer.port=65467, thread.id=20, http.method=GET, http.scheme=http, net.host.port=8080, net.sock.host.addr=[0:0:0:0:0:0:0:1], http.status_code=404, http.user_agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36, http.route=/*, thread.name=minho-http-20, net.host.name=localhost, net.transport=ip_tcp, http.target=/favicon.ico, net.sock.peer.addr=[0:0:0:0:0:0:0:1], http.flavor=1.1}, capacity=128, totalAddedValues=15}

In the log messages, we can see:
  1. we can see a tracer with the spans corresponding to our inner service method calls (code.function=inner and code.function=processing)
  2. we can see spans coming from Minho services, like the HTTP service, like Jetty, where we have all details about the service

OK, we have the tracer and spans exported via the logging exporter. But it's not super easy to ready. We can use the OTLP exporter instead of logging. The OTLP exporter can use HTTP or gRPC to "push" the metrics to a visualization platform, like Jaeger.
So, let's start a Jaeger docker container accepting OTLP data:

$ docker run --name jaeger \     
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:1.35

We can access Jaeger console on http://localhost:16686.
Now, we change the exporter in Minho runtime to use OTLP instead of logging:

$ java -javaagent:.m2/repository/io/opentelemetry/javaagent/opentelemetry-javaagent/1.22.1/opentelemetry-javaagent-1.22.1.jar -Dotel.service.name=minho-blog -Dotel.traces.exporter=otlp -jar minho-boot.jar

And we access again to the servlet on http://localhost:8080/simple. We can now see the traces in Jaeger console:

Manual instrumentation

Automatic instrumentation is very easy to setup, but it has some limitations (limited control, not easy to define events, ThreadLocal use, etc).
It's possible to use "manual" instrumentation where you can have a better control on the span.
Let's rename our SimpleServlet as InstrumentedServlet:

package net.nanthrax.blog.minho.telemetry.servlet;

import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanKind;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Context;
import io.opentelemetry.context.Scope;
import io.opentelemetry.context.propagation.TextMapGetter;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import net.nanthrax.blog.minho.telemetry.service.AutoInstrumentedService;
import org.apache.karaf.minho.boot.Minho;
import org.apache.karaf.minho.boot.spi.Service;

import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Collections;
import java.util.Properties;

public class InstrumentedServlet extends HttpServlet implements Service {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {

        OpenTelemetry openTelemetry = Minho.getInstance().getServiceRegistry().get(OpenTelemetryLoader.class).getOpenTelemetry();
        Tracer tracer = openTelemetry.getTracer(InstrumentedServlet.class.getName());
        AutoInstrumentedService innerService = Minho.getInstance().getServiceRegistry().get(AutoInstrumentedService.class);

        Context context = openTelemetry.getPropagators().getTextMapPropagator().extract(Context.current(), request, new TextMapGetter<HttpServletRequest>() {
            @Override
            public Iterable<String> keys(HttpServletRequest carrier) {
                return Collections.list(carrier.getHeaderNames());
            }

            @Override
            public String get(HttpServletRequest carrier, String key) {
                return carrier.getHeader(key);
            }
        });

        Span span = tracer.spanBuilder("InstrumentedServlet.GET").setParent(context).setSpanKind(SpanKind.SERVER).startSpan();

        try (Scope scope = span.makeCurrent()) {
            span.setAttribute("service", "instrumented-servlet");
            span.setAttribute("http.method", "GET");
            span.setAttribute("http.scheme", "http");
            span.setAttribute("http.host", "localhost:8080");
            span.setAttribute("http.target", "/observed");

            span.addEvent("Call inner service");

            String message = innerService.processing("Hello World");

            span.addEvent("Inner service responded", Attributes.of(AttributeKey.stringKey("message"), message));

            span.addEvent("Create HTTP response");

            response.setStatus(200);
            try (Writer writer = new OutputStreamWriter(response.getOutputStream())) {
                writer.write(message);
            }

            span.addEvent("HTTP response sent", Attributes.of(AttributeKey.stringKey("message"), message));

        } finally {
            span.end();
        }
    }

    @Override
    public Properties properties() {
        Properties properties = new Properties();
        properties.put("contextPath", "/observed");
        return properties;
    }

}

We can see how to create a span and populate it with attributes and events. To do that, we need a tracer. We get this tracer from the OpenTelemetry SDK.
Let's initialize the OpenTelemetry SDK in a dedicated service named OpenTelemetryLoader:

package net.nanthrax.blog.minho.telemetry.servlet;

import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
import io.opentelemetry.context.propagation.ContextPropagators;
import io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.SimpleSpanProcessor;
import org.apache.karaf.minho.boot.service.ServiceRegistry;
import org.apache.karaf.minho.boot.spi.Service;

public class OpenTelemetryLoader implements Service {

    private OpenTelemetry openTelemetry;

    @Override
    public void onRegister(ServiceRegistry serviceRegistry) {
        SdkTracerProvider sdkTracerProvider =
                SdkTracerProvider.builder()
                        .setResource(Resource.create(Attributes.of(AttributeKey.stringKey("service.name"), "minho-blog")))
                        .addSpanProcessor(SimpleSpanProcessor.create(OtlpHttpSpanExporter.getDefault()))
                        .build();

        OpenTelemetrySdk sdk =
                OpenTelemetrySdk.builder()
                        .setTracerProvider(sdkTracerProvider)
                        .setPropagators(ContextPropagators.create(W3CTraceContextPropagator.getInstance()))
                        .build();

        Runtime.getRuntime().addShutdownHook(new Thread(sdkTracerProvider::close));

        openTelemetry = sdk;
    }

    public OpenTelemetry getOpenTelemetry() {
        return openTelemetry;
    }

}

This loader replace the OpenTelemetry agent and it's where you can initialize the SDK, adding exporter, propagator, etc.
The AutoInstrumentedService doesn't change.
Let's rebuild and restart our Minho runtime. Then we access to http://localhost:8080/observed.
Now, in Jaeger, we can see more detailed spans:
We can see all details in our InstrumentedServlet with all timed events and the time spent in different parts of the request.

Summary

Using OpenTelemetry with Minho is already very easy. The minho-opentelemetry service will deal with OpenTelemetry loading and configuration (via minho.json) to avoid use of the agent.
New blog about Minho will be published soon (probably one around JSON-RPC/gRPC, another one about Guice in Minho, waiting Minho IoC).
Stay tuned !

Comments

Popular posts from this blog

Exposing Apache Karaf configurations with Apache Arrow Flight

Getting started with Apache Karaf Minho

Using Apache Karaf with Kubernetes