Best Practices, Bigdata, Hadoop, Kafka

Better late than never: Time to replace your micro-service architecture with Kafka…

Kafka has already facilitated the move to a microservices architecture for many organizations. If Kafka is still not part of your infrastructure, it's high time to adopt it. I am not claiming that Kafka is better than any other message queue system, as many articles on that subject are already floating around the internet. Kafka's uniqueness is that it provides both simple file system and bridge functions. A Kafka broker's most basic task is to write messages to and read messages from the log on disk as quickly as possible. Queued messages will not be lost once they are persisted, which is […]
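To make the "log on disk" idea concrete, here is a minimal sketch of an append-only log in Python. It is a toy illustration only, not how a Kafka broker is implemented: the class name `AppendOnlyLog`, its methods, and the one-record-per-line file layout are all assumptions made for this example.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy append-only log (hypothetical, for illustration only).

    Each record is one newline-terminated line; a record's offset is
    its zero-based position in the file.
    """

    def __init__(self, path):
        self.path = path
        # Create the file if it does not exist yet.
        with open(path, "ab"):
            pass
        # Recover the next offset by counting records already on disk.
        with open(path, "rb") as f:
            self._next_offset = sum(1 for _ in f)

    def append(self, message):
        """Append one record, force it to disk, and return its offset."""
        if "\n" in message:
            raise ValueError("records must not contain newlines")
        with open(self.path, "a", encoding="utf-8", newline="\n") as f:
            f.write(message + "\n")
            f.flush()
            os.fsync(f.fileno())  # persisted before we acknowledge
        offset = self._next_offset
        self._next_offset += 1
        return offset

    def read_from(self, offset):
        """Return every record whose offset is >= the given offset."""
        with open(self.path, "r", encoding="utf-8", newline="\n") as f:
            return [line.rstrip("\n") for i, line in enumerate(f) if i >= offset]


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
    log = AppendOnlyLog(log_path)
    log.append("first")
    log.append("second")
    # A reader can start from any offset; earlier records stay on disk.
    print(log.read_from(1))
```

Because records are only ever appended and readers track their own offset, a reader that restarts can pick up from where it left off, which is the property the excerpt alludes to when it says persisted messages are not lost.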

Analytics, Apache Spark, Bigdata, Kafka, Messaging System

In-depth Kafka Message queue principles of high-reliability

At present, many open source distributed processing systems such as Cloudera, Apache Storm, Spark, and others support integration with Kafka. Kafka is increasingly favored by many internet companies, which use it as one of their core messaging engines. For a commercial-grade messaging middleware solution, the importance of message reliability is easy to imagine. In this article, we will look at Kafka's storage mechanism, replication principle, synchronization principle, and durability assurance to analyze its reliability. As shown in the figure above, a typical Kafka architecture includes several Producers (which can be server logs, business data, page views generated by […]

Analytics, Apache Spark, Hadoop, Kafka, Python, Spark

Consume JSON Messages From Kafka Using Kafka-Python’s Deserializer

I hope you are here because you want to take a ride with Python and Apache Kafka. kafka-python is the most popular Kafka library for Python. For details, see the library's documentation page. kafka-python is designed to function much like the official Java client. It is best used with newer brokers (0.9+), but is backwards-compatible with older versions (to 0.8.0). Some features will only be enabled on newer brokers. So instead of showing you a simple example that runs a Kafka Producer and Consumer separately, I'll show the JSON serializer and deserializer. Preparing the Environment: Let's start by installing the Python package using […]
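As a rough sketch of the JSON serializer and deserializer idea (not the code from the full post), the two helper functions below can be passed to kafka-python's `value_serializer` and `value_deserializer` parameters. The helper names, the broker address `localhost:9092`, and the topic name `example-topic` are assumptions chosen for illustration.

```python
import json


def serialize_json(value):
    """Turn a Python object into UTF-8 encoded JSON bytes."""
    return json.dumps(value).encode("utf-8")


def deserialize_json(raw):
    """Turn UTF-8 encoded JSON bytes back into a Python object."""
    return json.loads(raw.decode("utf-8"))


def build_clients(bootstrap_servers="localhost:9092", topic="example-topic"):
    """Create a producer and consumer wired to the JSON helpers.

    The broker address and topic name are hypothetical defaults.
    kafka-python is imported here so the helpers above can be tried
    without the package installed or a broker running.
    """
    from kafka import KafkaConsumer, KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=serialize_json,
    )
    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        value_deserializer=deserialize_json,
        auto_offset_reset="earliest",
    )
    return producer, consumer


if __name__ == "__main__":
    # Round-trip check that needs no broker.
    payload = {"user": "alice", "clicks": 3}
    assert deserialize_json(serialize_json(payload)) == payload
```

With a broker available, `producer.send("example-topic", {"user": "alice"})` would publish the dictionary as JSON bytes, and iterating over the consumer would yield records whose `.value` is already a Python dictionary.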

Analytics, Bigdata, Kafka

Moving to communication of events between subsystems — CQRS-ES with open source…

Before going into definitions of EP, CEP, and CQRS, let us start with some basic database terms and the problem we are trying to address here. We have commercial databases and database professionals who have publicized CRUD operations a lot. This one-row-per-pattern approach works well in most projects and is enough to build an application quickly and securely. I have probably implemented 100 CRUD projects (including web applications), and we do it that way because we have limited budgets and projects have deadlines. CRUD works well until someone asks for historical data, and I have seen a few managers complaining about the lack […]

Hadoop, Kafka

Kafka: A detailed introduction

I'll cover Kafka in detail, with an introduction to programmability, and will try to cover almost all of its architecture. So here it goes: We need Kafka when there is a need to build a real-time processing system, as Kafka is a high-performance, highly scalable publish-subscribe messaging system. Traditional systems are unable to process such large volumes of data and are mainly used for offline analysis. Kafka is a solution to the real-time problems of any software solution; that is to say, it unifies offline and online data processing and routes the data to multiple consumers quickly. Below are the characteristics of Kafka: Persistent messaging: […]