Monitoring Apache Kafka with Prometheus

TL;DR, show me the code: kafka-prometheus-monitoring (on GitHub)

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.  It is scalable, durable and distributed by design, which is why it is currently one of the most popular choices of messaging broker for high-throughput architectures.

One of the major differences with Kafka is the way it manages consumer state: this is itself distributed, with each client responsible for keeping track of the messages it has consumed (the high-level consumer in later versions of Kafka abstracts this, storing offsets in ZooKeeper).  In contrast to more traditional MQ messaging technologies, this inversion of control takes considerable load off the server.

The scalability, speed and resiliency of Kafka are why it was chosen for a project I worked on for my most recent client, Sky.  Our use case was processing real-time user actions in order to provide personalised recommendations to end users of NowTV, a popular streaming service available on multiple platforms.  We needed a reliable way to monitor our Kafka cluster to help inform key performance indicators during NFT testing.

Prometheus JMX Collector

Prometheus is our monitoring tool of choice, and Apache Kafka metrics are exposed by each broker in the cluster via JMX, so we need a way to extract these metrics and expose them in a format suitable for Prometheus.  Fortunately prometheus.io provides a custom exporter for this.  The Prometheus JMX Exporter is a lightweight web service which exposes Prometheus metrics via an HTTP GET endpoint.  On each request it scrapes the configured JMX server, transforms JMX mBean query results into Prometheus-compatible time series data, and returns them to the caller via HTTP.

The mBeans to scrape are controlled by a YAML configuration in which you can whitelist/blacklist the metrics to extract and specify how each should be represented in Prometheus, for example as a GAUGE or COUNTER.  The configuration can be tuned to your specific requirements; a list of all metrics can be found in the Kafka Operations documentation.  Here is what our configuration looked like:

lowercaseOutputName: true
jmxUrl: service:jmx:rmi:///jndi/rmi://{{ getv "/jmx/host" }}:{{ getv "/jmx/port" }}/jmxrmi
rules:
- pattern : kafka.network<type=Processor, name=IdlePercent, networkProcessor=(.+)><>Value
- pattern : kafka.network<type=RequestMetrics, name=RequestsPerSec, request=(.+)><>OneMinuteRate
- pattern : kafka.network<type=SocketServer, name=NetworkProcessorAvgIdlePercent><>Value
- pattern : kafka.server<type=ReplicaFetcherManager, name=MaxLag, clientId=(.+)><>Value
- pattern : kafka.server<type=BrokerTopicMetrics, name=(.+), topic=(.+)><>OneMinuteRate
- pattern : kafka.server<type=KafkaRequestHandlerPool, name=RequestHandlerAvgIdlePercent><>OneMinuteRate
- pattern : kafka.server<type=Produce><>queue-size
- pattern : kafka.server<type=ReplicaManager, name=(.+)><>(Value|OneMinuteRate)
- pattern : kafka.server<type=controller-channel-metrics, broker-id=(.+)><>(.*)
- pattern : kafka.server<type=socket-server-metrics, networkProcessor=(.+)><>(.*)
- pattern : kafka.server<type=Fetch><>queue-size
- pattern : kafka.server<type=SessionExpireListener, name=(.+)><>OneMinuteRate
- pattern : kafka.controller<type=KafkaController, name=(.+)><>Value
- pattern : kafka.controller<type=ControllerStats, name=(.+)><>OneMinuteRate
- pattern : kafka.cluster<type=Partition, name=UnderReplicated, topic=(.+), partition=(.+)><>Value
- pattern : kafka.utils<type=Throttler, name=cleaner-io><>OneMinuteRate
- pattern : kafka.log<type=Log, name=LogEndOffset, topic=(.+), partition=(.+)><>Value
- pattern : java.lang<type=(.*)>
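
With this configuration in place, the exporter can be run as the standalone jmx_prometheus_httpserver jar, passing it the port to listen on and the configuration file (the jar version, port and file name below are illustrative, not taken from our deployment):

java -jar jmx_prometheus_httpserver-0.10-jar-with-dependencies.jar 8080 kafka-jmx-config.yaml

The transformed metrics are then available for Prometheus to scrape at http://&lt;exporter-host&gt;:8080/metrics.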

In summary:

  • Prometheus JMX Exporter – scrapes the configured JMX server and transforms JMX mBean query results into Prometheus-compatible time series data, exposing the result via HTTP
  • JMX Exporter Configuration – a configuration file that filters the JMX properties to be transformed – example Kafka configuration
  • Prometheus – Prometheus itself is configured to poll the JMX Exporter /metrics endpoint (a sample scrape configuration is shown after this list)
  • Grafana – allows us to build rich dashboards from the collected metrics
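
A minimal Prometheus scrape configuration for this setup might look like the following (job name, hostnames and port are illustrative, assuming one JMX exporter is deployed alongside each broker):

scrape_configs:
  - job_name: 'kafka'
    scrape_interval: 15s
    static_configs:
      - targets:
          - 'kafka-broker-1:8080'
          - 'kafka-broker-2:8080'
          - 'kafka-broker-3:8080'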

[Diagram: Kafka brokers exposing JMX metrics via the JMX Exporter, scraped by Prometheus and visualised in Grafana]

Viewing Kafka Metrics

Once metrics have been scraped into Prometheus they can be browsed in the Prometheus UI; alternatively, richer dashboards can be built using Grafana.

[Screenshot: Prometheus Graph Builder]

[Screenshot: Grafana Dashboard]

In order to try this out locally, a fully dockerised example has been provided on GitHub – kafka-prometheus-monitoring.  This project is for demonstration purposes only and is not intended to be run in a production environment.  This only scratches the surface of monitoring and fine-tuning the Kafka brokers, but it is a good place to start in order to enable performance analysis of the cluster.

A note on monitoring a cluster of brokers: Prometheus metrics will include a label which denotes the broker's IP address, allowing you to distinguish metrics per broker.  A JMX exporter therefore needs to be run for each broker, and Prometheus should be configured to poll each deployed JMX exporter.

 

Scalaz Monad Transformers

Whilst gaining a deeper understanding of functional programming concepts and patterns, I have found myself delving more deeply into the world of scalaz.  Today the agenda is monad transformers; after some initial reading I very quickly started to see patterns in our codebase which could immediately benefit from their application.

What is a monad transformer? My definition as a Software Engineer, and not a Mathematician with a PhD in Category Theory, is… a monad transformer is a monadic type that abstracts over a monad wrapping another monad.  This may not sound like something that happens a lot, but it’s surprisingly common. Take Future, for example: all the cases below are monads nested within a monad.

Future[Option[T]]
Future[List[T]]
Future[scalaz.\/[A, B]]

Let’s take the following scenario: you have two Lists of integers, and you want to add every element in List A to every element in List B to get a final List C.  However, both lists are wrapped in a Future.  In pure Scala this may look something like this:

@ val x = Future.successful(List(1, 2, 3)) 
x: Future[List[Int]] = scala.concurrent.impl.Promise$KeptPromise@5da72d46
@ val y = Future.successful(List(4, 5, 6)) 
y: Future[List[Int]] = scala.concurrent.impl.Promise$KeptPromise@2896f3c5
@  
@ x.flatMap { listA =>
    y.map { listB =>
      listA.flatMap { i =>
        listB.map { j =>
          i + j
        }
      }
    }
  } 
res17: Future[List[Int]] = Success(List(5, 6, 7, 6, 7, 8, 7, 8, 9))

Or alternatively using the syntactic sugar of a for comprehension:

@ for {
    listA <- x
    listB <- y
  } yield {
    for {
      i <- listA
      j <- listB
    } yield i + j
  } 
res18: Future[List[Int]] = Success(List(5, 6, 7, 6, 7, 8, 7, 8, 9))

Notice that there is no way of accessing the underlying data without mapping over the Future followed by the List, which leads to the nested code you see above.  This situation is where monad transformers can help us.

In our example our top-level type was a Future[List[Int]].  When choosing which monad transformer to use, you always choose by the innermost type; in this case List[Int] is our innermost type, so we will use the ListT monad transformer.  The ListT apply function is as follows:

def apply[A](a: M[List[A]]) = new ListT[M, A](a)

Therefore we can use this to convert our Future[List[Int]] to the ListT monad type, which in this case will be a ListT[Future, Int]. We can now write our addition in terms of the new monad type, which has abstracted away the mapping over the Future:

@ for {
    i <- ListT(x)
    j <- ListT(y)
  } yield i + j 
res20: ListT[Future, Int] = ListT(Success(List(5, 6, 7, 6, 7, 8, 7, 8, 9)))

Notice we returned a ListT[Future, Int]; as with any Functor, calling map will always return the same monad type wrapping the transformed value.  This allows you to chain/compose operations in terms of your monad transformers until you are ready to unwrap back to your original type, which can be done using the run method:

@ res20.run 
res21: Future[List[Int]] = Success(List(5, 6, 7, 6, 7, 8, 7, 8, 9))

In summary, monad transformers give you a powerful abstraction to work on the underlying data of a monadic type when it is itself wrapped in another monad. They reduce code complexity and enhance readability by abstracting away the wiring of drilling down into the nested datatypes. Scalaz provides implementations of monad transformers for many types, including EitherT, ListT, OptionT and ReaderT to name a few.
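
As a further illustration (a minimal sketch rather than project code, assuming scalaz 7.x with an implicit ExecutionContext in scope), OptionT gives the same treatment to a Future[Option[T]]:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz.OptionT
import scalaz.std.scalaFuture._ // provides the Monad[Future] instance OptionT needs

val maybeUser: Future[Option[Int]]  = Future.successful(Some(42))
val maybeBonus: Future[Option[Int]] = Future.successful(Some(8))

// Work directly with the Ints; OptionT abstracts both the Future and the Option
val total: OptionT[Future, Int] = for {
  user  <- OptionT(maybeUser)
  bonus <- OptionT(maybeBonus)
} yield user + bonus

total.run // Future[Option[Int]], eventually Some(50)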

Moving from Java to Scala

Functional programming was nowhere to be seen during my time at University, nor in my professional career, that is, up until the point Scala came out of its shell. Object-oriented programming had ruled, and the only times I heard much about functional languages were in stories from more experienced engineers, most of which usually started with “back in the day”. But in my mind “back in the day” translated to an era of floppy disk drives, ASM programming and low-level hackery, so naturally I took no notice.


I remember sitting in a SkillsMatter training course with a Typesafe representative; we had just finished a chapter on the fundamental basics of Scala, and I was completely bamboozled by the notion of a lambda. What had happened to the safe haven of verbose imperative code?

In the beginning

Coming from a team of solid Java developers, I was initially quite apprehensive: early proof-of-concept work had been frustrating and getting set up was cumbersome. IntelliJ was tripping over its own heels, traversing Scala documentation that spanned multiple versions of libraries was a battle, compile times seemed sluggish and SBT was yet another build/dependency management tool to add to the ever-growing list. All this on top of pressure to deliver project features; it felt like we had jumped out of the frying pan and into the fire.

Diving into a whole new technology stack and language can be rather daunting, especially when there is no prior expertise in your team.  Although most good developers are able to pick up Scala through self-learning, having some formal training or, even better, some short-term experienced Scala contractors (with the primary purpose of knowledge sharing) fast-tracked the team in getting their skills off the ground.

Rather than move to a full Scala stack in one swoop, we opted for a more methodical approach.  First we replaced our Java-based integration tests with Scala ones; this allowed the developers to learn the fundamentals of Scala without compromising any production projects or code.  Once the team was comfortable with the fundamentals of the language, we decided to build our next production project in Scala.

Reaping the rewards

Although the surrounding landscape had initially left much to be desired, the advantages of the language itself were soon apparent, to name but a few:

more concise code – the syntax and features of Scala strip away a lot of the unnecessary boilerplate that Java enforces, while native language features provide solutions to common engineering patterns. Case classes, traits, type inference, optional braces, partial functions, pattern matching, implicit conversions, lazy vals and more all play a part.
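
As a small illustrative sketch (not code from the project), a single case class declaration gives you an immutable value type with equals, hashCode, toString, copy and pattern-matching support for free, something that would take dozens of lines of Java:

case class Account(id: Long, owner: String, balance: BigDecimal)

val acc     = Account(1L, "Alice", BigDecimal(100))
val updated = acc.copy(balance = acc.balance + 50) // non-destructive update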

 

let’s get functional – Scala treats functions as first-class citizens; the ability to compose functions, create anonymous functions and pass them around really makes the language much more powerful when designing your code architecture.
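
For example (a hypothetical sketch rather than project code), small functions can be composed and passed around just like values:

val trim: String => String      = _.trim
val upper: String => String     = _.toUpperCase
val normalise: String => String = trim andThen upper // function composition

def applyAll(xs: List[String], f: String => String): List[String] = xs.map(f)

applyAll(List(" kafka ", " scala"), normalise) // List("KAFKA", "SCALA")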

less time coding – once the team became proficient with Scala, our productivity improved; this is something I observed as a general improvement in my own velocity and that of my colleagues.

the JVM ecosystem – Scala is a JVM language, which meant we still had all of the Java libraries at our disposal; this was particularly important for us as it reduced some of the pressure during the transition period. Familiar technologies such as Apache HTTP Commons, Camel, Spring, Apache CXF and many more could still be used, which reduced risk and allowed a more methodical transition to a full Scala stack.

collecting like a boss – Scala collections are immutable by default, and it was not until we really started to use Scala that we realised the majority of problems can be solved without mutable ones. Immutable collections are much easier to reason about; there are fewer variables and branching statements to consider. On top of this, all collections support the holy grail of the functional world: map.
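
A tiny sketch of the point (with hypothetical values): every transformation returns a new collection and the original is left untouched:

val prices    = List(10.0, 25.5, 3.2)  // immutable by default
val withVat   = prices.map(_ * 1.2)    // a new List; prices is unchanged
val expensive = withVat.filter(_ > 10) // chaining stays side-effect free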

a reduction in bugs…perhaps – this is quite a subjective statement and comes from my personal experience working on multiple commercial Java projects.  The logic bugs raised seemed fewer than on similar past projects. By logic bugs I mean things like null pointer exceptions, undesired results due to errors in branching logic and plain business logic mistakes.

I suspect this comes from the ability to write more concise and expressive code with a focus on the “happy path” using the functional paradigm. On top of this, Scala collections are immutable by default; immutability reduces complexity and makes testing easier, along with the ability to compose behaviour using higher-order functions and traits.  All these factors seem to contribute to the quality of our systems, which I believe had a direct impact on the types and quantity of bugs. Of course some of this reduction may be attributed to more focus on automated testing, better testing frameworks and engineering practices… or maybe it’s all just placebo, I’ll let you decide.

The landscape now

The landscape for Scala development has improved greatly over the past few years: Scala 2.11 seems to have stabilised, SBT’s continuous compile mode offsets the longer compilation time and gets you feedback fast, there have been major improvements in IntelliJ support, and we also have the Typesafe Activator, which is arguably the easiest way to get started with a Scala application. On top of this we still have the full JVM ecosystem at our disposal.

I would like to stress that, for the same reasons Scala allows you to write succinct and elegant code, its versatility can be a double-edged sword.

with great power comes great responsibility

There are a lot of programmers out there who are incredibly smart, sometimes too smart. In the few short years I have been working with Scala, I have seen several examples where Engineers have been lulled into writing unnecessarily complex solutions at the expense of readability and maintainability. I think Scala is a language which can be more susceptible to abuse in this area but as with any language, code quality reviews and established coding standards can mitigate these issues. My advice to those adopting Scala is to carefully choose the patterns and features you adopt, make sure your engineers are comfortable and just take it slow.

A great starting point for style and implementation advice is the Twitter Effective Scala guide by Marius Eriksen; I couldn’t recommend it enough.  In my opinion, the best Scala learning resource on the web is The Neophyte’s Guide to Scala by Daniel Westheide; this blog series is excellent and I would highly recommend it to anybody currently learning Scala.

As a final note, I would like to leave you with a point I feel is rarely mentioned: programming in Scala is FUN. The language is powerful, flexible and versatile, giving you the opportunity to solve problems in the most elegant way; this really does make coding fun again. I’m sure you will all agree, happier engineers are always a good thing in any tech organisation, who can argue with that?!

A Dive into Docker

With the rise of new development methodologies such as Continuous Delivery, long gone are the days where a Software Engineer pushes code into the abyss and hopes it comes out unscathed on the other side.  We are seeing a shift in the industry where the traditional walls between Development, Quality Assurance and Operations are slowly being broken down; these roles are merging and we are seeing a new breed of Engineer.  The buzzword “DevOps” has become prominent in the industry, and as a result we are seeing project development teams that are more agile, more efficient and able to respond more quickly to change.  This shift has led to a rise of new tools and frameworks to help us automate deployment, automate testing and standardise infrastructure.

One of the tools at the forefront of this transformation is Docker, an open platform for developers and sysadmins to build, ship, and run distributed applications.  Before diving further into this practical exercise I would suggest having a read over What is Docker?

Before beginning the exercise you will need to install Docker.  I use boot2docker on MacOS; for further details on installation for your platform visit Docker Installation.  Another option is to use a cloud provider to run your Docker host: Digital Ocean provides Docker-ready servers running in the cloud for as little as $0.007/hour, an especially attractive option if you are limited by bandwidth or resources.


A few basics

Docker Image

A Docker image is a read-only blueprint for a container; an example blueprint may be the Ubuntu operating system, or a CentOS one. Every container that you run in Docker is based on a Docker image.

Dockerfile

A Dockerfile contains the instructions that tell Docker how to build a Docker image. Docker images are layered and so can be extended, which allows you to stack extra functionality on top of existing base images. A commonly used base image is ubuntu:latest, which is a blueprint of the base installation of an Ubuntu distribution.
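
For instance, a minimal (purely illustrative) Dockerfile extending the ubuntu:latest base image might look like this:

FROM ubuntu:latest

# Layer extra functionality on top of the base image
RUN apt-get update && apt-get install -y curl

# Default command executed when a container is started from this image
CMD ["curl", "--version"]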

Docker Container

A Docker container can be thought of as a lightweight, self-contained instance of a virtual machine running a Linux distribution (usually with modifications); containers are extremely cheap to start and stop.  Docker containers are spawned from a Docker image and should be considered stateless/ephemeral resources.

Docker Hub

Docker Hub brings Software Engineering DRY principles to the system infrastructure world: it is a global repository platform that holds Dockerfiles and images. There are already images available that run ubuntu, redhat, mysql, rabbitmq, mongodb and nginx, to name just a few.
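
For example, existing images can be searched for and pulled straight from Docker Hub using the standard Docker CLI:

docker search nginx   # list images on Docker Hub matching "nginx"
docker pull mysql     # download the official MySQL image locally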


Diving into Docker

Let’s dive straight into Docker. We are going to build a simple infrastructure that will host a self-contained instance of WordPress, a popular blogging tool used by many organisations and writers across the world.  The infrastructure will include an nginx server to route/proxy requests, a WordPress application server to host the user interface and a MySQL database to provide storage.  Once complete, our infrastructure will look something like this:

[Diagram: nginx proxy container routing to the WordPress application container, backed by the MySQL database container]


The database container

Let’s start by creating our MySQL database container. Luckily for us, MySQL has already been “dockerised” and is available for us to pull via Docker Hub; the defaults are fine, so there is no need to write our own Dockerfile or build any new images.  A new container can be started using the docker run command.

The first run may take some time while images are downloaded, they will be cached for subsequent runs.

docker run --name wordpress-db -e MYSQL_ROOT_PASSWORD=mysecretpassword -d mysql

So what just happened here?  We asked Docker to run a new container using the MySQL base image:

  • --name – the name/tag to assign to the new container
  • -e – sets environment variables for the container, in this case the root password for the MySQL instance; documentation for the available configuration can be found in the MySQL Docker Hub documentation
  • -d – tells Docker to run the container in the background as a detached process
  • mysql – the name of the Docker image to use; this is pulled from Docker Hub

Edit: Please note that in order to maintain any data across containers, a VOLUME should be configured to ensure data stays persistent.  For the sake of simplicity we omit this here, but be aware that deployments involving state should carefully consider the durability of data across the life-cycle of containers.
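
For illustration, a sketch of the same MySQL container run with a host directory mounted as a volume (the host path is purely illustrative; /var/lib/mysql is where the MySQL image stores its data):

docker run --name wordpress-db -e MYSQL_ROOT_PASSWORD=mysecretpassword -v /my/own/datadir:/var/lib/mysql -d mysql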

The application container

Now let’s move on to running the WordPress application container. Again, this has already been “dockerised” and resides in the Docker Hub WordPress repository.

docker run --name wordpress-app --link wordpress-db:mysql -d wordpress

  • --link wordpress-db:mysql – tells Docker to create a network link to the wordpress-db container (which we created earlier), making network communication possible between the two containers.  The value has two parts: the left-hand side signifies the container to connect to (wordpress-db), and the right-hand side is the hostname alias it is known by inside this container (mysql)

Let’s now run docker ps to see what containers we have running:
docker ps

CONTAINER ID        IMAGE               COMMAND                CREATED              STATUS              PORTS               NAMES
c39600354fcb        wordpress:latest    "/entrypoint.sh apac   About a minute ago   Up About a minute   80/tcp              wordpress-app       
20e66802e914        mysql:latest        "/entrypoint.sh mysq   About a minute ago   Up About a minute   3306/tcp            wordpress-db        

We can see two containers running as expected on ports 80 and 3306.  Let’s open a bash session in the wordpress-app container and check that we can talk to wordpress-db:
docker exec -i -t wordpress-app bash
ping mysql
64 bytes from 172.17.0.2: icmp_seq=0 ttl=64 time=0.085 ms
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.127 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.108 ms

Excellent, the wordpress-app container can talk to the wordpress-db container.  Exit the bash session; if desired, you can check the logs for your running containers.

docker logs wordpress-app

Great, everything is looking good so far.


The nginx container

It is fairly common for web applications to be fronted by an HTTP proxy.  This provides advantages such as control of request routing, auditing, security, logging, caching, load balancing, hosting static content and more.  Nginx is a commonly used HTTP proxy server.  As we are customising nginx, we will need to create a new Dockerfile to define a new image that contains our custom nginx configuration:

mkdir wordpress-nginx
cd wordpress-nginx
vi default.conf

server {
    listen       80;
    server_name  localhost;

    error_log /var/log/nginx/error.log warn;

    location / {
        proxy_pass http://wordpress-app:80/;
        proxy_redirect http://server_name http://wordpress-app:80/;
        proxy_set_header   Host               $host;
        proxy_set_header   X-Forwarded-For    $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto  http;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

Notice we have routed inbound requests from / to the wordpress-app container on port 80. Now let’s create a Dockerfile that defines how to build our nginx container image:

vi Dockerfile

FROM nginx
COPY default.conf /etc/nginx/conf.d/default.conf

  • FROM nginx – the FROM instruction tells Docker to pull the nginx base image from Docker Hub
  • COPY default.conf /etc/nginx/conf.d/default.conf – copies the file default.conf from the current directory into the image under /etc/nginx/conf.d/

Now all that is left is to build our new Docker image and run a container using it:
docker build -t wordpress-nginx .
docker run -d --name=wordpress-nginx --link=wordpress-app:wordpress-app -p 80:80 wordpress-nginx
docker ps

CONTAINER ID        IMAGE                    COMMAND                CREATED             STATUS              PORTS                         NAMES
2b9f99664249        wordpress-nginx:latest   "nginx -g 'daemon of   3 seconds ago       Up 2 seconds        443/tcp, 0.0.0.0:80->80/tcp   wordpress-nginx     
c39600354fcb        wordpress:latest         "/entrypoint.sh apac   9 minutes ago       Up 3 minutes        80/tcp                        wordpress-app       
20e66802e914        mysql:latest             "/entrypoint.sh mysq   9 minutes ago       Up 4 minutes        3306/tcp                      wordpress-db        

You may notice we gave the argument -p 80:80; this tells Docker to publish port 80 of the container on the Docker host so it can be accessed externally.
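
The containers are reached via your Docker host's IP address. If, like me, you are using boot2docker on MacOS, the host IP can be found with the standard boot2docker command:

boot2docker ip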

Hey Presto

Now browse to http://DOCKER_HOST_IP/ in your browser and voila, WordPress is ready to go. Follow the WordPress setup prompts to configure your instance, and you should soon see the following page:

Wordpress Admin Console

So to recap, we have learnt some of the fundamental concepts of Docker by making practical use of the resources available in Docker Hub to build a self-contained running instance of WordPress, all with just a few Docker commands. I hope this post serves as a good introduction for you to start Dockerising your own application infrastructure and to reap the many benefits that Docker brings.

If you enjoyed this post, I’d be very grateful if you’d help it spread by emailing it to a friend, or sharing it on Twitter or LinkedIn. Thank you for reading!

Edit: This post is also available in Chinese – thank you to dockerone.com for the translation: 深入浅出Docker (translated by 崔婧雯, proofread by 李颖杰)