Monitoring and alerting with Apache Karaf Decanter

Some months ago, I proposed Decanter on the Apache Karaf Dev mailing list.

Today, Apache Karaf Decanter 1.0.0 first release is now on vote.

It’s the good time to do a presentation 😉

Overview

Apache Karaf Decanter is complete monitoring and alerting solution for Karaf and the applications running on it.

It’s very flexible, providing ready to use features, and also very easy to extend.

Decanter 1.0.0 release works with any Karaf version, and can also be used to monitor applications outside of Karaf.

Decanter provides collectors, appenders, and SLA.

Collectors

Decanter Collectors are responsible of harvesting the monitoring data.

Basically, a collector harvest the data, create an OSGi EventAdmin Event event send to decanter/collect/* topic.

A Collector can be:

  • Event Driven, meaning that it will automatically react to an internal event
  • Polled, meaning that it’s periodically executed by the Decanter Scheduler

You can install multiple Decanter Collectors in the same time. In the 1.0.0 release, Decanter provides the following collectors:

  • log is an event-driven collector. It’s actually a Pax Logging PaxAppender that listens for any log messages and send the log details into the EventAdmin topic.
  • jmx is a polled collector. Periodically, the Decanter Scheduler executes this collector. It retrieves all attributes of all MBeans in the MBeanServer, and send the JMX metrics into the EventAdmin topic.
  • camel (jmx) is a specific JMX collector configuration, that retrieves the metrics only for the Camel routes MBeans.
  • activemq (jmx) is a specific JMX collector configuration, that retrieves the metrics only for the ActiveMQ MBeans.
  • camel-tracer is a Camel Tracer TraceEventHandler. In your Camel route definition, you can set this trace event handler to the default Camel tracer. Thanks to that, all tracing details (from URI, to URI, exchange with headers, body, etc) will be send into the EventAdmin topic.

Appenders

The Decanter Appenders receives the data harvested by the collectors. They consume OSGi EventAdmin Events from the decanter/collect/* topics.

They are responsible of storing the monitoring data into a backend.

You can install multiple Decanter Appenders in the same time. In the 1.0.0 release, Decanter provides the following appenders:

  • log creates a log message with the monitoring data
  • elasticsearch stores the monitoring data into an Elasticsearch instance
  • jdbc stores the monitoring data into a database
  • jms sends the monitoring data to a JMS broker
  • camel sends the monitoring data to a Camel route

SLA and alerters

Decanter also provides an alerting system when some data doesn’t validate a SLA.

For instance, you can define the maximum acceptable number of threads running in Karaf. If the current number of threads is over the limit, Decanter calls alerters.

Decanter Alerters are a special kind of appenders, consuming events from the OSGi EventAdmin decanter/alert/* topics.

As for the appenders, you can have multiple alerters active at the same time. Decanter 1.0.0 release provides the following alerters:

  • log to create a log message for each alert
  • e-mail to send an e-mail for each alert
  • camel to execute a Camel route for each alert

Let see Decanter in action to have details how to install and use it !

Quick start

Decanter is pretty easy to install and provide “key turn” functionalities.

The first thing to do is to register the Decanter features repository in the Karaf instance:

karaf@root()> feature:repo-add mvn:org.apache.karaf.decanter/apache-karaf-decanter/1.0.0/xml/features

NB: for the next Karaf releases, I will add Decanter features repository in etc/org.apache.karaf.features.repos.cfg, allowing to easily register Decanter features simply using feature:repo-add decanter 1.0.0.

We now have the Decanter features available:

karaf@root()> feature:list |grep -i decanterdecanter-common                 | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter API                                decanter-simple-scheduler       | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Simple Scheduler                   decanter-collector-log          | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Log Messages Collector             decanter-collector-jmx          | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter JMX Collector                      decanter-collector-camel        | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Camel Collector                    decanter-collector-activemq     | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter ActiveMQ Collector                 decanter-collector-camel-tracer | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Camel Tracer Collector             decanter-collector-system       | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter OS Collector                       decanter-appender-log           | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Log Appender                       decanter-appender-elasticsearch | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Elasticsearch Appender             decanter-appender-jdbc          | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter JDBC Appender                      decanter-appender-jms           | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter JMS Appender                       decanter-appender-camel         | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter Camel Appender                     decanter-sla                    | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter SLA support                        decanter-sla-log                | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter SLA log alerter                    decanter-sla-email              | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter SLA email alerter                  decanter-sla-camel              | 1.0.0            |           | karaf-decanter-1.0.0     | Karaf Decanter SLA Camel alerter                  elasticsearch                   | 1.6.0            |           | karaf-decanter-1.0.0     | Embedded Elasticsearch node                       kibana                          | 3.1.1            |           | karaf-decanter-1.0.0     | Embedded Kibana dashboard

For a quick start, we will use elasticsearch embedded to store the monitoring data. Decanter provides a ready to use elasticsearch feature, starting an embedded elasticsearch node:

karaf@root()> feature:install elasticsearch

The elasticsearch feature installs the elasticsearch configuration: etc/elasticsearch.yml.

We now have a ready to use elasticsearch node, where we will store the monitoring data.

Decanter also provides a kibana feature, providing a ready to use set of kibana dashboards:

karaf@root()> feature:install kibana We can now install the Decanter Elasticsearch appender: this appender will get the data harvested by the collectors, and store it in elasticsearch:karaf@root()> feature:install decanter-appender-elasticsearch

The decanter-appender-elasticsearch feature also installs etc/org.apache.karaf.decanter.appender.elasticsearch.cfg file. You can configure the location of the Elasticsearch node there. By default, it uses a local elasticsearch node, especially the one embedded that we installed with the elasticsearch feature.

The etc/org.apache.karaf.decanter.appender.elasticsearch.cfg file contains hostname, port and clusterName of the elasticsearch instance to use:

################################################# Decanter Elasticsearch Appender Configuration################################################# Hostname of the elasticsearch instancehost=localhost# Port number of the elasticsearch instanceport=9300# Name of the elasticsearch clusterclusterName=elasticsearch

Now, our Decanter appender and elasticsearch node are ready.

It's now time to install some collectors to harvest the data.

Karaf monitoring

First, we install the log collector:

karaf@root()> feature:install decanter-collector-log 

This collector is event-driven and will automatically listen for log events, and send into the EventAdmin collect topic.

We install a second collector: the JMX collector.

karaf@root()> feature:install decanter-collector-jmx

The JMX collector is a polled collector. So, it also installs and starts the Decanter Scheduler.

You can define the call execution period of the scheduler in etc/org.apache.karaf.decanter.scheduler.simple.cfg configuration file. By default, the Decanter Scheduler calls the polled collectors every 5 seconds.

The JMX collector is able to retrieve all metrics (attributes) from multiple MBeanServers.

By default, it uses the etc/org.apache.karaf.decanter.collector.jmx-local.cfg configuration file. This file polls the local MBeanServer.

You can create new configuration files (for instance etc/org.apache.karaf.decanter.collector.jmx-mystuff.cfg configuration file), to poll other remote or local MBeanServers.

The etc/org.apache.karaf.decanter.collector.jmx-*.cfg configuration file contains:

type=jmx-mystuffurl=service:jmx:rmi:///jndi/rmi://hostname:1099/karaf-rootusername=karafpassword=karafobject.name=*.*:*

The type property is a free field allowing you to identify the source of the metrics.

The url property allows you to define the JMX URL. You can also use the local keyword to poll the local MBeanServer.
The username and password allows you to define the username and password to connect to the MBeanServer.

The object.name property is optional. By default, the collector harvests all the MBeans in the server. But you can filter to harvest only some MBeans (for instance org.apache.camel:context=*,type=routes,name=* to harvest only the Camel routes metrics).

Now, we can go in the Decanter Kibana to see the dashboards using the harvested data.

You can access to the Decanter Kibana using http://localhost:8181/kibana.

You have the Decanter Kibana welcome page:

Decanter Kibana

Decanter provides ready to use dashboard. Let see the Karaf Dashboard.

Decanter Kibana Karaf 1

These histograms use the metrics harvested by the JMX collector.

You can also see the log details harvested by the log collector:

Decanter Karaf 2

As Kibana uses Lucene, you can extract exactly the data that you need using filtering or queries.

You can also define the time range to get the metrics and logs.

For instance, you can create the following query to filter only the message coming from Elasticsearch:

loggerName:org.elasticsearch*

Camel monitoring and tracing

We can also use Decanter for the monitoring of the Camel routes that you deploy in Karaf.

For instance, we add Camel in our Karaf instance:

karaf@root()> feature:repo-add camel 2.13.2Adding feature url mvn:org.apache.camel.karaf/apache-camel/2.13.2/xml/featureskaraf@root()> feature:install camel-blueprint

In the deploy, we create the following very simple route (using the route.xml file):

<?xml version="1.0" encoding="UTF-8"?><blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">    <camelContext xmlns="http://camel.apache.org/schema/blueprint">        <route id="test">            <from uri="timer:fire?period=10000"/>            <setBody><constant>Hello World</constant></setBody>            <to uri="log:test"/>        </route>    </camelContext></blueprint>

Now, in Decanter Kibana, we can go in the Camel dashboard:

Decanter Kibana Camel 1

We can see the histograms here, using the JMX metrics retrieved on the Camel MBeans (especially, we can see for our route the exchanges completed, failed, the last processing time, etc).

You can also see the log messages related to Camel.

Another feature provided by Decanter is a Camel Tracer collector: you can enable the Decanter Camel Tracer to log all exchange state in the backend.

For that, we install the Decanter Camel Tracer feature:

karaf@root()> feature:install decanter-collector-camel-tracer

We update our route.xml in the deploy folder like this:

<?xml version="1.0" encoding="UTF-8"?><blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">    <reference id="eventAdmin" interface="org.osgi.service.event.EventAdmin"/>    <bean id="traceHandler" class="org.apache.karaf.decanter.collector.camel.DecanterTraceEventHandler">        <property name="eventAdmin" ref="eventAdmin"/>    </bean>    <bean id="tracer" class="org.apache.camel.processor.interceptor.Tracer">        <property name="traceHandler" ref="traceHandler"/>        <property name="enabled" value="true"/>        <property name="traceOutExchanges" value="true"/>        <property name="logLevel" value="OFF"/>    </bean>    <camelContext trace="true" xmlns="http://camel.apache.org/schema/blueprint">        <route id="test">            <from uri="timer:fire?period=10000"/>            <setBody><constant>Hello World</constant></setBody>            <to uri="log:test"/>        </route>    </camelContext></blueprint>

Now, in Decanter Kibana Camel dashboard, you can see the details in the tracer panel:

Decanter Kibana Camel 2

Decanter Kibana also provides a ready to use ActiveMQ dashboard, using the JMX metrics retrieved from an ActiveMQ broker.

SLA and alerting

Another Decanter feature is the SLA (Service Level Agreement) checking.

The purpose is to check if a harvested data validate a check condition. If not, an alert is created and send to SLA alerters.

We want to send the alerts to two alerters:

  • log to create a log message for each alert (warn log level for serious alerts, error log level for critical alerts)
  • camel to call a Camel route for each alert.

First, we install the decanter-sla-log feature:

karaf@root()> feature:install decanter-sla-log

The SLA checker uses the etc/org.apache.karaf.decanter.sla.checker.cfg configuration file.

Here, we want to throw an alert when the number of threads in Karaf is greater to 60. So in the checker configuration file, we set:

ThreadCount.error=range:[0,60]

The syntax in this file is:

attribute.level=check

where:

  • attribute is the name of the attribute in the harvested data (coming from the collectors).
  • level is the alert level. The two possible values are: warn or error.
  • check is the check expression.

The check expression can be:

  • range for numeric attribute, like range:[x,y]. The alert is thrown if the attribute is out of the range.
  • equal for numeric attribute, like equal:x. The alert is thrown if the attribute is not equal to the value.
  • notequal for numeric attribute, like notequal:x. The alert is thrown if the attribute is equal to the value.
  • match for String attribute, like match:regex. The alert is thrown if the attribute doesn't match the regex.
  • notmatch for String attribute, like nomatch:regex. The alert is thrown if the attribute match the regex.

So, in our case, if the number of threads is greater than 60 (which is probably the case ;)), we can see the following messages in the log:

2015-07-28 22:17:11,950 | ERROR | Thread-44        | Logger                           | 119 - org.apache.karaf.decanter.sla.log - 1.0.0 | DECANTER SLA ALERT: ThreadCount out of pattern range:[0,60]2015-07-28 22:17:11,951 | ERROR | Thread-44        | Logger                           | 119 - org.apache.karaf.decanter.sla.log - 1.0.0 | DECANTER SLA ALERT: Details: hostName:service:jmx:rmi:///jndi/rmi://localhost:1099/karaf-root | alertPattern:range:[0,60] | ThreadAllocatedMemorySupported:true | ThreadContentionMonitoringEnabled:false | TotalStartedThreadCount:5639 | alertLevel:error | CurrentThreadCpuTimeSupported:true | CurrentThreadUserTime:22000000000 | PeakThreadCount:225 | AllThreadIds:[J@6d9ad2c5 | type:jmx-local | ThreadAllocatedMemoryEnabled:true | CurrentThreadCpuTime:22911917003 | ObjectName:java.lang:type=Threading | ThreadContentionMonitoringSupported:true | ThreadCpuTimeSupported:true | ThreadCount:221 | ThreadCpuTimeEnabled:true | ObjectMonitorUsageSupported:true | SynchronizerUsageSupported:true | alertAttribute:ThreadCount | DaemonThreadCount:198 | event.topics:decanter/alert/error | 

Let's now extend the range, add a new check on the thread, and add a new check to throw alerts when we have errors in the log:

ThreadCount.error=range:[0,600]ThreadCount.warn=range:[0,300]loggerLevel.error=match:ERROR

Now, we want to call a Camel route to deal with the alerts.

We create the following Camel route, using the deploy/alert.xml:

<?xml version="1.0" encoding="UTF-8"?><blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">        <camelContext xmlns="http://camel.apache.org/schema/blueprint">                <route id="alerter">                        <from uri="direct-vm:decanter-alert"/>                        <to uri="log:alert"/>                </route>        </camelContext></blueprint>

Now, we can install the decanter-sla-camel feature:

karaf@root()> feature:install decanter-sla-camel

This feature also installs a etc/org.apache.karaf.decanter.sla.camel.cfg configuration file. In this file, you can define the Camel endpoint URI where you want to send the alert:

alert.destination.uri=direct-vm:decanter-alert

Now, let's decrease the thread range in etc/org.apache.karaf.decanter.sla.checker.cfg configuration file to throw some alerts:

ThreadCount.error=range:[0,600]ThreadCount.warn=range:[0,60]loggerLevel.error=match:ERROR

Now, in the log, we can see the alerts.

From the SLA log alerter:

2015-07-28 22:39:09,268 | WARN  | Thread-43        | Logger                           | 119 - org.apache.karaf.decanter.sla.log - 1.0.0 | DECANTER SLA ALERT: ThreadCount out of pattern range:[0,60]2015-07-28 22:39:09,268 | WARN  | Thread-43        | Logger                           | 119 - org.apache.karaf.decanter.sla.log - 1.0.0 | DECANTER SLA ALERT: Details: hostName:service:jmx:rmi:///jndi/rmi://localhost:1099/karaf-root | alertPattern:range:[0,60] | ThreadAllocatedMemorySupported:true | ThreadContentionMonitoringEnabled:false | TotalStartedThreadCount:6234 | alertLevel:warn | CurrentThreadCpuTimeSupported:true | CurrentThreadUserTime:193150000000 | PeakThreadCount:225 | AllThreadIds:[J@28f0ef87 | type:jmx-local | ThreadAllocatedMemoryEnabled:true | CurrentThreadCpuTime:201484424892 | ObjectName:java.lang:type=Threading | ThreadContentionMonitoringSupported:true | ThreadCpuTimeSupported:true | ThreadCount:222 | ThreadCpuTimeEnabled:true | ObjectMonitorUsageSupported:true | SynchronizerUsageSupported:true | alertAttribute:ThreadCount | DaemonThreadCount:198 | event.topics:decanter/alert/warn | 

but also from the SLA Camel alerter:

2015-07-28 22:39:15,293 | INFO  | Thread-41        | alert                            | 114 - org.apache.camel.camel-core - 2.13.2 | Exchange[ExchangePattern: InOnly, BodyType: java.util.HashMap, Body: {hostName=service:jmx:rmi:///jndi/rmi://localhost:1099/karaf-root, alertPattern=range:[0,60], ThreadAllocatedMemorySupported=true, ThreadContentionMonitoringEnabled=false, TotalStartedThreadCount=6236, alertLevel=warn, CurrentThreadCpuTimeSupported=true, CurrentThreadUserTime=193940000000, PeakThreadCount=225, AllThreadIds=[J@408db39f, type=jmx-local, ThreadAllocatedMemoryEnabled=true, CurrentThreadCpuTime=202296849879, ObjectName=java.lang:type=Threading, ThreadContentionMonitoringSupported=true, ThreadCpuTimeSupported=true, ThreadCount=222, event.topics=decanter/alert/warn, ThreadCpuTimeEnabled=true, ObjectMonitorUsageSupported=true, SynchronizerUsageSupported=true, alertAttribute=ThreadCount, DaemonThreadCount=198}]

Decanter also provides the SLA e-mail alerter to send the alerts by e-mail.

Now, you can play with the SLA checker, and add the checks on the attributes that you need. The Decanter Kibana dashboards help a lot there: in the "Event Monitoring" table, you can see all raw harvested data, allowing you to find the attributes.

What's next

It's just the first Decanter release, but I think it's an interesting one.

Now, we are in the process of adding:

  • a new Decanter CXF interceptor collector, thanks to this collector, you will be able to send details about the request/response on CXF endpoints (SOAP-Request, SOAP-Response, REST message, etc).
  • a new Decanter Redis appender, to send the harvested data to Redis
  • a new Decanter Cassandra appender, to send the harvested data to Cassandra
  • a Decanter WebConsole, allowing to easily manipulate the SLA
  • improvement the SLA support with "recovery" support to send only one alert when the check failed, and another alert when the value "recovered"

Anyway, if you have ideas and want to see new features in Decanter, please let us know.

I hope you like Decanter and see interest in this new Karaf project !

Comments

Popular posts from this blog

Quarkus and "meta" extension

Getting started with Apache Karaf Minho

Apache Karaf Minho and OpenTelemetry