Apache Kafka Tutorial

  1. What is Apache Kafka?

  2. Apache Kafka is an open-source, distributed messaging platform built around publishing and subscribing to events. It acts as a mediator between producer and consumer systems: producers send data to Kafka, and consumers read that data from Kafka.

    Put simply, Apache Kafka is a messaging system for passing messages between applications. A messaging system is the exchange of messages between two or more sources, endpoints, servers, and so on. Kafka can handle a high volume of data and lets us send messages from one system to another.
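
The mediator role described above can be illustrated with a small in-memory sketch. This is not Kafka's client API: `ToyBroker`, `send`, `poll`, and the `"orders"` topic are hypothetical names chosen for illustration, and the sketch assumes a single process with no network or persistence. It only shows that producers and consumers talk to the broker rather than to each other, and that each consumer can read the stored messages at its own pace.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message broker (hypothetical, not Kafka's API)."""

    def __init__(self):
        # Each topic is an append-only list of messages.
        self._topics = defaultdict(list)

    def send(self, topic, message):
        """Producer side: append a message and return its position (offset)."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, topic, offset):
        """Consumer side: return every message stored at or after `offset`."""
        return list(self._topics[topic][offset:])


broker = ToyBroker()

# The producer only talks to the broker, never to a consumer directly.
broker.send("orders", "order-1 created")
broker.send("orders", "order-2 created")

# A first consumer reads from the start and remembers how far it got.
billing_offset = 0
billing_batch = broker.poll("orders", billing_offset)
billing_offset += len(billing_batch)

broker.send("orders", "order-3 created")

# A second consumer starts from the beginning and sees every message.
shipping_batch = broker.poll("orders", 0)
# The first consumer resumes where it left off and sees only the new message.
billing_next = broker.poll("orders", billing_offset)

print(billing_batch)
print(shipping_batch)
print(billing_next)
```

Because the broker keeps the messages rather than handing them to one receiver, both consumers in the sketch can read the same data independently.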

    Kafka was originally developed at LinkedIn and was later donated to the Apache Software Foundation.

    Benefits of Apache Kafka

    1. Reliability and Durability
       Kafka stores data in a fault-tolerant, distributed manner, keeping multiple copies of each message across multiple servers. This keeps your data available even if a server suffers a hardware failure.
    2. Scalability
       Kafka is designed to be highly scalable: it can handle a large volume of data and grow with your needs. Because it is also highly available and fault-tolerant, you can rely on it for your data processing even in the event of hardware failure.
    3. Performance
       Kafka is designed to handle large volumes of data with high throughput and low latency, which makes it a good fit for applications that require real-time data processing. A suitably sized Kafka cluster can handle millions of events per second, with latencies as low as a few milliseconds.
    4. Real-time processing
       Kafka enables real-time data processing: data can be processed as it arrives.
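
The idea behind the reliability benefit above, keeping several copies of each message so that one server failure does not lose data, can be sketched with a toy model. This is only an illustration under simplifying assumptions: `ReplicatedLog` and the server names are hypothetical, failed servers never come back, and Kafka's actual replication (partition leaders and followers) is considerably more involved.

```python
class ReplicatedLog:
    """Toy model of a log copied to several servers (hypothetical, not Kafka's protocol)."""

    def __init__(self, replicas):
        # One independent copy of the log per server.
        self.copies = {name: [] for name in replicas}
        self.alive = set(replicas)

    def append(self, message):
        """Write the message to every server that is still alive."""
        if not self.alive:
            raise RuntimeError("no live replicas")
        for name in self.alive:
            self.copies[name].append(message)

    def fail(self, name):
        """Simulate a hardware failure: the server stops taking part."""
        self.alive.discard(name)

    def read(self):
        """Serve reads from any surviving server (here: the first by name)."""
        if not self.alive:
            raise RuntimeError("no live replicas")
        survivor = sorted(self.alive)[0]
        return list(self.copies[survivor])


log = ReplicatedLog(["server-1", "server-2", "server-3"])
log.append("event-a")
log.append("event-b")

log.fail("server-1")      # one server is lost
log.append("event-c")     # writes continue on the remaining servers

after_failure = log.read()
print(after_failure)
```

In this sketch every surviving server has received every write, so any of them can answer reads after `server-1` fails.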

  3. What is a messaging system?

  4. A messaging system is the process of exchanging messages between two or more sources, endpoints, servers, and so on. It involves two roles:
    1. Sender or Producer
       The sender is responsible for sending (writing) the messages. The sender is also known as the producer, because it publishes the messages.
    2. Receiver or Consumer
       The receiver is responsible for reading those messages. The receiver is also known as the consumer, because it consumes the messages.
  5. What is streaming in a messaging system?

  6. Streaming allows records to be processed in parallel: one record can be processed without waiting for the output of the previous record. Distributed systems commonly rely on streaming and parallel execution to simplify tasks and improve performance, since each thread or piece of data is handled without waiting for another.
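
As a rough illustration of records being handled without waiting for one another, the sketch below processes a few records in parallel with Python's standard `concurrent.futures` thread pool. The `process` function, the sample records, and the artificial 0.05 second delay are assumptions made for the example; it does not use Kafka and only models independent, parallel handling of records.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process(record):
    """Pretend to do some work on a single record (hypothetical workload)."""
    time.sleep(0.05)  # stands in for I/O or computation
    return record.upper()


records = ["click", "view", "purchase", "logout"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Each record is submitted to the pool independently; no record waits
    # for the result of the one before it. map() returns results in input order.
    results = list(pool.map(process, records))
elapsed = time.perf_counter() - start

print(results)
print(f"processed {len(records)} records in {elapsed:.2f}s")
```

Handled one after another, the four records would need roughly four times the per-record delay; with four workers the waits overlap, so the total time is typically much closer to a single delay.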

In the next topic, you will learn about the Kafka cluster.