Introduction

Splunk (and more specifically, the data stored in it) is used for multiple purposes, including threat detection, which is essentially extracting knowledge from logs; more precisely: log collection, streaming, and correlation.

Log collection is about developing/deploying data feed collectors and/or log shippers.

As part of log processing (commonly in the stream itself), log formats need to be parsed and enriched, so this stage usually involves developing/deploying new parsers, data enrichments, and normalisation to the Common Event Format (CEF), the Splunk Common Information Model (CIM), etc.
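As a toy illustration of what this parsing/normalisation step does, here is a minimal Java sketch. Note that this is not how Splunk itself implements parsing (that is done through its own configuration), and the raw line format and field mapping below are invented for the example:

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class LogNormaliser {

        // Assumed raw format: "<timestamp> <host> <srcIp> -> <dstIp> <action>"
        private static final Pattern RAW =
                Pattern.compile("(\\S+) (\\S+) (\\S+) -> (\\S+) (\\S+)");

        // Parses one raw line into CIM-style field names (src, dest, action).
        public static Map<String, String> normalise(String line) {
            Matcher m = RAW.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unparseable line: " + line);
            }
            Map<String, String> event = new LinkedHashMap<>();
            event.put("_time", m.group(1));
            event.put("host", m.group(2));
            event.put("src", m.group(3));
            event.put("dest", m.group(4));
            event.put("action", m.group(5));
            return event;
        }

        public static void main(String[] args) {
            System.out.println(
                normalise("2021-01-20T10:00:00Z fw01 10.0.0.5 -> 8.8.8.8 allowed"));
        }
    }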

Lastly, correlation requires scripting (e.g. …


This post describes some highlights from the source material below, which I highly recommend you read:

Link to Paper: http://cs.brown.edu/courses/csci2390/2019/readings/zanzibar.pdf

Link to USENIX ATC video: https://www.youtube.com/watch?v=mstZT431AeQ

Link to Airbnb’s Himeji (based on Zanzibar): https://medium.com/airbnb-engineering/himeji-a-scalable-centralized-system-for-authorization-at-airbnb-341664924574

Introduction

Definitions extracted from the paper:

Summary of Requirements


I have many colleagues who code in either C# or Java. They usually understand the syntax of each other’s programming language, but they don’t know where to start when it comes to building a working web application or API. In other words, knowing the syntax of a programming language is not enough to productively build a real application in that language. In this post I will show you how to build a Java API for C# developers. Java developers will benefit from this post too, since they will be able to see the similarities.
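To set expectations, the kind of endpoint we will build looks roughly like the minimal sketch below, which uses the JDK's built-in com.sun.net.httpserver package so it runs without any framework; the port, path, and JSON payload are invented for the example (think of it as the Java counterpart of a self-hosted ASP.NET handler):

    import com.sun.net.httpserver.HttpExchange;
    import com.sun.net.httpserver.HttpServer;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class HelloApi {
        public static void main(String[] args) throws IOException {
            // Create an HTTP server on port 8080 (0 = default connection backlog).
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            // Register a handler for /hello, comparable to a controller action.
            server.createContext("/hello", (HttpExchange exchange) -> {
                byte[] body = "{\"message\":\"Hello from Java\"}"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });

            server.start();
            System.out.println("Listening on http://localhost:8080/hello");
        }
    }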

Installing the Pre-requisites

First…


There is a point in time between the moment you finish refining your Functional/Non-Functional Requirements and Design Drivers/Principles, and the moment you draw the initial partition of your system. Some perceive this moment simply as the time spent drawing that first version of your logical architecture diagram (also called a conceptual diagram).

In an ideal world, that first partition implies making major design decisions which cover all requirements. …



This is the second post in the series. To check out the first part, visit: https://jacace.wordpress.com/2021/02/01/kafka-101/

You might need to do some cleaning first. To delete a topic, run the following command:
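Assuming the stock kafka-topics.sh tool that ships with Kafka and a broker listening on localhost:9092 (adjust the host, port, and topic name for your setup), it looks like this:

    bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic my-topic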

A summary of the last session


This post does not cover all the theory behind Kafka; it focuses on the important aspects you need to know to start working and be productive with Kafka.

Below I summarise the minimal concepts required to get started.


[ For a general introduction to Spark: https://jacace.wordpress.com/2020/12/02/hadoop-spark-for-beginners ]

I wanted to put together this post thanks to the inspiration I got from here: https://towardsdatascience.com/a-journey-into-big-data-with-apache-spark-part-1-5dfcc2bccdd2

Basically, I made 3 improvements based on the original post above:

Enjoy!

Javier Caceres

Originally published at http://jacace.wordpress.com on January 20, 2021.


A definition: “A data pipeline is a logical abstraction representing a sequence of data transformations required for converting raw data into insights”.
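As a toy example of such a “sequence of transformations from raw data into insights”, here is a minimal Java Streams sketch; the input events and the resulting “insight” are invented for illustration:

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class MiniPipeline {
        public static void main(String[] args) {
            // Raw data: hypothetical "user,action" event lines.
            List<String> raw = List.of("alice,login", "bob,login", "alice,purchase");

            // Transformations: parse -> filter -> aggregate into an "insight".
            Map<String, Long> loginsPerUser = raw.stream()
                    .map(line -> line.split(","))        // parse
                    .filter(f -> f[1].equals("login"))   // keep login events only
                    .collect(Collectors.groupingBy(      // count logins per user
                            f -> f[0], Collectors.counting()));

            System.out.println(loginsPerUser); // e.g. {bob=1, alice=1}
        }
    }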

General principles


This is a quick tutorial on Rego, with some important bullet points summarised below. For more detailed guides, please check out the official docs (https://www.openpolicyagent.org/docs/latest/policy-language/) or this good guide on Medium (https://medium.com/@mathurvarun98/how-to-write-great-rego-policies-dc6117679c9f).
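For flavour, here is a minimal, generic Rego policy (not one of the post's bullet points; the package name and input fields are invented for the example) that denies by default and allows GET requests from a specific user:

    package httpapi.authz

    # Deny everything unless a rule below says otherwise.
    default allow = false

    # Allow GET requests made by "alice".
    allow {
        input.method == "GET"
        input.user == "alice"
    }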

Javier Caceres

