Modern Open Source Messaging: Apache Kafka, RabbitMQ and NATS in Action

Last week I was in London to present at INTEGRATE 2016. While the conference is oriented towards Microsoft technology, I mixed it up by covering a set of messaging technologies in the open source space; you can view my presentation here. There’s so much happening right now in the messaging arena, and developers have never had so many tools available to process data and link systems together. In this post, I’ll recap the demos I built to test out Apache Kafka, RabbitMQ, and NATS.

One thing I tried to make clear in my presentation was how these open source tools differ from classic ESB software. A few things you’ll notice as you check out the technologies below:

  • Modern brokers are good at one thing. Instead of cramming in every sort of workflow, business intelligence, and adapter framework in the software, these OSS services are simply great at ingesting and routing lots of data. They are typically lightweight, but deployable in highly available configurations as well.
  • Endpoints now have significant responsibility. Traditional brokers took on tasks such as message transformation, reliable transportation, long-running orchestration between endpoints, and more. The endpoints could afford to be passive, mostly-untouched participants in the integration. These modern engines don’t cede control to a centralized bus, but rather use the bus only to transport opaque data. Endpoints need to be smarter and I think that’s a good thing.
  • Integration is approachable for ALL devs. Maybe a controversial opinion, but I don’t think integration work belongs in an” integration team.” If agility matters to you, then you can’t silo off a key function and force (micro)service teams to line up to get their work done. Integration work needs to be democratized, and many of these OSS tools make integration very approachable to any developer.

Apache Kafka

With Kafka you can do both real-time and batch processing. Ingest tons of data, route via publish-subscribe (or queuing). The broker barely knows anything about the consumer. All that’s really stored is an “offset” value that specifies where in the log the consumer left off. Unlike many integration brokers that assume consumers are mostly online, Kafka can successfully persist a lot of data, and supports “replay” scenarios. The architecture is fairly unique; topics are arranged in partitions (for parallelism), and partitions are replicated across nodes (for high availability).

For this set of demos, I used Vagrant to stand up an Ubuntu box, and then installed both Zookeeper and Kafka. I also installed Kafka Manager, built by the engineering team at Yahoo!. I showed the conference audience this UI, and then added a new topic to hold server telemetry data.

2016-05-17-messaging01

All my demo apps were built with Node.js. For the Kafka demos, I used the kafka-node module. The consumer is simple: write 50 messages to our “server-stats” Kafka topic.

var serverTelemetry = {server:'zkee-022', cpu: 11.2, mem: 0.70, storage: 0.30, timestamp:'2016-05-11:01:19:22'};

var kafka = require('../../node_modules/kafka-node'),
    Producer = kafka.Producer,
    client = new kafka.Client('127.0.0.1:2181/'),
    producer = new Producer(client),
    payloads = [
        { topic: 'server-stats', messages: [JSON.stringify(serverTelemetry)] }
    ];

producer.on('error', function (err) {
    console.log(err);
});
producer.on('ready', function () {

    console.log('producer ready ...')
    for(i=0; i<50; i++) {
        producer.send(payloads, function (err, data) {
            console.log(data);
        });
    }
});

Before consuming the data (and it doesn’t matter that we haven’t even defined a consumer yet; Kafka stores the data regardless), I showed that the topic had 50 messages in it, and no consumers.

2016-05-17-messaging02

Then, I kicked up a consumer with a group ID of “rta” (for real time analytics) and read from the topic.

var kafka = require('../../node_modules/kafka-node'),
    Consumer = kafka.Consumer,
    client = new kafka.Client('127.0.0.1:2181/'),
    consumer = new Consumer(
        client,
        [
            { topic: 'server-stats' }
        ],
        {
            groupId: 'rta'
        }
    );

    consumer.on('message', function (message) {
        console.log(message);
    });

After running the code, I could see that my consumer was all caught up, with no “lag” in Kafka.

2016-05-17-messaging03
For my second Kafka demo, I showed off the replay capability. Data is removed from Kafka based on an expiration policy, but assuming the data is still there, consumers can go back in time and replay from any available point. In the code below, I go back to offset position 40 and read everything from that point on. Super useful if apps fail to process data and you need to try again, or if you have batch processing needs.

var kafka = require('../../node_modules/kafka-node'),
    Consumer = kafka.Consumer,
    client = new kafka.Client('127.0.0.1:2181/'),
    consumer = new Consumer(
        client,
        [
            { topic: 'server-stats', offset: 40 }
        ],
        {
            groupId: 'rta',
            fromOffset: true
        }
    );

    consumer.on('message', function (message) {
        console.log(message);
    });

RabbitMQ

RabbitMQ is a messaging engine that follows the AMQP 0.9.1 definition of a broker. It follows a standard store-and-forward pattern where you have the option to store the data in RAM, on disk, or both. It supports a variety of message routing paradigms. RabbitMQ can be deployed in a clustered fashion for performance, and mirrored fashion for high availability. Consumers listen directly on queues, but publishers only know about “exchanges.” These exchanges are linked to queues via bindings, which specify the routing paradigm (among other things).

For these demos (I only had time to do two of them at the conference), I used Vagrant to build another Ubuntu box and installed RabbitMQ and the management console. The management console is straightforward and easy to use.

2016-05-17-messaging04

For the first demo, I did a publish-subscribe example. First, I added a pair of queues: notes1 and notes2. I then showed how to create an exchange. In order to send the inbound message to ALL subscribers, I used a fanout routing type. Other options include direct (specific routing key), topic (depends on matching a routing key pattern), or headers (route on message headers).

2016-05-17-messaging05

I have an option to bind this exchange to another exchange or a queue. Here, see that I bound it to the new queues I created.

2016-05-17-messaging06

Any message into this exchange goes to both bound queues. My Node.js application used the amqplib module to publish a message.

var amqp = require('../../node_modules/amqplib/callback_api');

amqp.connect('amqp://rseroter:rseroter@localhost:5672/', function(err, conn) {
  conn.createChannel(function(err, ch) {
    var exchange = 'integratenotes';

    //exchange, no specific queue, message
    ch.publish(exchange, '', new Buffer('Session: Open-source messaging. Notes: Richard is now showing off RabbitMQ.'));
    console.log(" [x] Sent message");
  });
  setTimeout(function() { conn.close(); process.exit(0) }, 500);
});

As you’d expect, both apps listening on the queue got the message. The second demo used a direct routing with specific routing keys. This means that each queue will receive messages if their binding matches the provided routing key.

2016-05-17-messaging07

That’s handy, but you can also use a topic binding and then apply wildcards. This helps you route based on a pattern matching scenario. In this case, the first queue will receive a message if the routing key has 3 sections separated by a period, and the third value equals “slides.” So “day2.seroter.slides” would work, but “day2.seroter.recap” wouldn’t. The second queue gets a message only if the routing key starts with “day2”, has any middle value, and then “recap.”

2016-05-17-messaging08

NATS

If you’re looking for a high velocity communication bus that supports a host of patterns that aren’t realistic with traditional integration buses, then NATS is worth a look! NATS was originally built with Ruby and achieved a respectable 150k messages per second. The team rewrote it in Go, and now you can do an absurd 8-11 million messages per second. It’s tiny, just a 3MB Docker image! NATS doesn’t do persistent messaging; if you’re offline, you don’t get the message. It works as a publish-subscribe engine, but you can also get synthetic queuing. It also aggressively protects itself, and will auto-prune consumers that are offline or can’t keep up.

In my first demo, I did a poor man’s performance test. To be clear, this is not a good performance test. But I wanted to show that even a synchronous loop in Node.js could achieve well over a million messages per second. Here I pumped in 12 million messages and watched the stats using nats-top.

2016-05-17-messaging09

1.6 million messages per second, and barely using any CPU. Awesome.

The next demo was a new type of pattern. In a microservices world, it’s important to locate service at runtime, not hard-code references to them at design time. Solutions like Consul are great, but if you have a performant message bus, you can actually use THAT as the right loosely coupled intermediary. Here, an app wants to look up a service endpoint, so it publishes a request and waits to hear back from which service instances are online.

// Server connection
var nats = require('../../node_modules/nats').connect();

console.log('startup ...');

nats.request('service2.endpoint', function(response) {
    console.log('service is online at endpoint: ' + response);
});

Each microservice then has a listener attached to NATS and replies if it gets an “are you online?” request.

// Server connection
var nats = require('../../node_modules/nats').connect();

console.log('startup ...');

//everyone listens
nats.subscribe('service2.endpoint', function(request, replyTo) {
    nats.publish(replyTo, 'available at http://www.seroter.com/service2');
});

When I called the endpoint, I got back a pair of responses, since both services answered. The client then chooses which instance to call. Or, I could put the service listeners into a “queue group” which means that only one subscriber gets the request. Given that consumers aren’t part of the NATS routing table if they are offline, I can be confident that whoever responds, is actually online.

// Server connection
var nats = require('../../node_modules/nats').connect();

console.log('startup ...');

//subscribe with queue groups, so that only one responds
nats.subscribe('service2.endpoint', {'queue': 'availability'}, function(request, replyTo) {
    nats.publish(replyTo, 'available at http://www.seroter.com/service2');
});

It’s a cool pattern. It only works if you can trust your bus. Any significant delay introduced by the messaging bus, and your apps slow to a crawl.

Summary

I came away from INTEGRATE 2016 impressed with Microsoft’s innovative work with integration-oriented Azure services. Event Hubs, Logic Apps and the like are going to change how we process data and events in the cloud. For those who want to run their own engines – and at no commercial licensing cost – it’s exciting to explore the open source domain. Kafka, RabbitMQ, and NATS are each different, and may complement your existing integration strategy, but they’re each worth a deep look!



Categories: Cloud, DevOps, General Architecture, Messaging, Microservices, OSS

2 replies

  1. Nice Article.

Trackbacks

  1. This Week in Spring – May 17th, 2016 | Alexius DIAKOGIANNIS

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: