Case Study: Approaches to Logging Architecture

There are several approaches to designing a logging architecture.

The best option obviously depends on the specific requirements.

In this blog post I won’t tell you which logging architecture is best, but I will give you an overview of the options used in the industry.

Summary of approaches

Native option

The first approach is to go all in on AWS, as Ancestry did. With this option you will end up using Kinesis for data streaming, which means you will probably also have Lambda functions (mapped to shards) that pass their results to another Kinesis stream or store them in Elasticsearch or S3.

Since this is AWS-based, API Gateway is needed to expose the Lambda functions as services.
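To make the "mapping to shards" idea concrete, here is a minimal sketch of how a record's partition key decides which shard (and therefore which Lambda invocation) sees it. Kinesis hashes the partition key with MD5 into a 128-bit integer and picks the shard whose hash key range contains that value. The `ShardRouter` class and its method names are hypothetical, and the sketch assumes the shards split the hash space evenly, which stops being true once a stream has been resharded unevenly.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of the AWS SDK. */
final class ShardRouter {
    // Kinesis hash keys span 0 .. 2^128 - 1.
    private static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128);

    /** MD5 of the UTF-8 partition key, read as an unsigned 128-bit integer. */
    static BigInteger hashKey(String partitionKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // signum 1 = treat bytes as unsigned
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }

    /** Index of the shard owning the key, ASSUMING shardCount equally sized hash ranges. */
    static int shardFor(String partitionKey, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        BigInteger rangeSize = HASH_SPACE.divide(BigInteger.valueOf(shardCount));
        int index = hashKey(partitionKey).divide(rangeSize).intValue();
        // When shardCount does not divide 2^128, the last shard absorbs the remainder.
        return Math.min(index, shardCount - 1);
    }
}
```

Calling `ShardRouter.shardFor("host-42", 4)` always returns the same index for the same key, which is why choosing the partition key (host name, request ID, and so on) controls how evenly log traffic spreads across shards.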

Portable option

This is what companies like Pinterest do. With this option, Kafka is usually used for data streaming, as depicted in the image below.

For data processing there are a few options, like Spark or Storm (TODO: add sample), to read from Kafka’s partitions, as they do at Airbnb (link here).
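As a rough illustration of what "reading from Kafka’s partitions" means, the sketch below models a topic as a set of append-only lists. It is an in-memory stand-in, not the Kafka client API: `PartitionedLog` and its methods are hypothetical names, and the partitioner is simplified to `String.hashCode` (Kafka's default partitioner hashes the serialized key with murmur2 instead). The point it shows is that records with the same key land in the same partition, and a processing task reads one partition in order from an offset.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory model of a partitioned topic; hypothetical, for illustration only. */
final class PartitionedLog {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedLog(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Simplified partitioner: String.hashCode stands in for Kafka's murmur2 hash. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends the value to the key's partition and returns its offset there. */
    int append(String key, String value) {
        List<String> partition = partitions.get(partitionFor(key));
        partition.add(value);
        return partition.size() - 1;
    }

    /** Returns the records of one partition from the given offset, in append order. */
    List<String> read(int partition, int fromOffset) {
        List<String> records = partitions.get(partition);
        int start = Math.max(0, Math.min(fromOffset, records.size()));
        return new ArrayList<>(records.subList(start, records.size()));
    }
}
```

In this model, a Spark or Storm job would run roughly one reader per partition, each remembering the last offset it processed, so per-key ordering is preserved while partitions are processed in parallel.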


The following table summarizes the approaches (no duplicate entries).


Kinesis vs. Kafka:

Sample Spark job:

Sample Elasticsearch with Java:
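A minimal sketch of indexing one log event, assuming a recent Elasticsearch version that accepts `POST /<index>/_doc` and using only the JDK's built-in `java.net.http` client (Java 11+) rather than an Elasticsearch client library. The `LogIndexer` class, the base URL, and the index name are hypothetical placeholders, and a real setup would also need authentication and error handling.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper that talks to Elasticsearch over its REST API. */
final class LogIndexer {

    /** Builds (but does not send) a request that indexes one JSON document. */
    static HttpRequest indexRequest(String baseUrl, String index, String jsonDocument) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonDocument))
                .build();
    }

    /** Sends the request and returns the HTTP status code of the response. */
    static int send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

For example, `LogIndexer.send(LogIndexer.indexRequest("http://localhost:9200", "app-logs", "{\"level\":\"INFO\",\"message\":\"started\"}"))` would index a single event into an assumed local node. For real log volumes, batching documents is usually preferable to one request per event.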

Sample HBase with Java:

Sample streaming with Kafka and Spark:

Hive is data warehouse software, and HBase is a column-oriented database.

Other options to review later

Generic

AWS Specific

Streaming, Flexible Log Parsing with Real-Time Application

Originally published on September 4, 2020.

Hands-on Sr Software Manager / Architect based in Ireland. Views are my own.