What is a Kafka broker?

A Kafka broker is a core component of Apache Kafka, responsible for receiving, storing, and forwarding data within a Kafka cluster. It manages data replication, partitioning, and distribution to keep data flowing reliably across distributed systems. Each broker in a cluster operates independently, hosting a different subset of partitions and contributing to the cluster's scalability and reliability.

Kafka brokers work together as a cluster, pooling the capacity of multiple machines to handle large data volumes efficiently. A single broker can handle thousands of partitions and millions of messages per second, making Kafka an appropriate choice for high-throughput data pipelines. Brokers are also fault-tolerant, keeping the system resilient to node failures.

Key components of Apache Kafka broker

Topics and partitions

Kafka organizes messages using topics, which are categories to which records are published. A topic is subdivided into partitions, which enable Kafka to spread load and process messages concurrently. This division allows high-throughput data processing and is central to Kafka's scalable architecture. Partitions make Kafka highly distributed, enabling data parallelism and redundancy.

Partitions are crucial for Kafka's horizontal scaling, allowing multiple brokers to share the partitions of a single topic. Each partition is an ordered, immutable sequence of records, and Kafka guarantees ordering within a partition. This structure allows consumers to read messages consistently, in the order they were written to each partition.
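
For example, a partitioned topic can be created with the kafka-topics.sh tool that ships with Kafka; the topic name, counts, and broker address below are illustrative:

  # Create a topic with 6 partitions, each replicated across 3 brokers
  bin/kafka-topics.sh --create --bootstrap-server localhost:9092 \
    --topic orders --partitions 6 --replication-factor 3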

Leaders and replicas

For each partition, one broker is elected as the leader while the brokers holding the other replicas act as followers. The leader handles all read and write operations for the partition. To ensure high availability, Kafka replicates each partition's data across multiple brokers, so that if the leader fails, one of the followers can take over.

Replication is essential for fault tolerance in Kafka, enabling automatic failover. Each partition's followers continuously replicate the leader's data. If a leader broker fails, one of the in-sync followers automatically becomes the new leader, preserving the latest replicated state. This failover ensures minimal disruption to the cluster.
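
You can inspect the current leader and follower assignment for each partition with the same tool (topic name and broker IDs are illustrative; output abbreviated):

  bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic orders
  # Example output line:
  #   Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3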

Offset management

Offsets in Kafka serve as pointers that track a consumer group's position in each partition, indicating where to resume after the last consumed message. This mechanism lets consumers continue processing from the last known state after interruptions. Offsets are committed either to an internal Kafka topic (__consumer_offsets) or to external storage, allowing multiple consumer groups to process messages independently without conflicts.

Accurate offset management ensures data consistency and correctness when consuming messages. Depending on when offsets are committed relative to processing, consumers can achieve at-least-once, at-most-once, or (with Kafka transactions) exactly-once semantics. This flexibility in message processing and replay is useful for applications with different data consumption patterns.
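
For example, Kafka's consumer group tool reports each partition's committed offset, the log-end offset, and the resulting lag (the group name and broker address are illustrative):

  # Show committed offsets and lag for every partition the group consumes
  bin/kafka-consumer-groups.sh --describe --bootstrap-server localhost:9092 \
    --group payment-processor
  # Output includes CURRENT-OFFSET (last committed), LOG-END-OFFSET, and LAG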

Related content: Read our guide to Kafka architecture

Quick tutorial: How to start a Kafka broker

To start a Kafka broker, use the kafka-server-start.sh script, which initializes the Kafka server based on a configuration file. Before starting the broker, ensure that ZooKeeper is running (this tutorial assumes a ZooKeeper-based deployment; newer Kafka versions can instead run in KRaft mode without ZooKeeper).

Steps to start the Kafka broker:

1. Start Apache ZooKeeper

Use the following command to start ZooKeeper:
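
  # Run from the Kafka installation directory
  bin/zookeeper-server-start.sh config/zookeeper.properties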

Verify that ZooKeeper is up and running before proceeding.

2. Start Kafka broker

Once ZooKeeper is operational, start the Kafka broker using the following command:
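
  # Run from the Kafka installation directory
  bin/kafka-server-start.sh config/server.properties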

This command initiates the broker with settings defined in the server.properties file. Ensure that the configuration file is correctly set up before running this command.

3. Logging configuration

The kafka-server-start.sh script utilizes the config/log4j.properties file for logging configurations. You can override the default logging settings by setting the KAFKA_LOG4J_OPTS environment variable. For example:
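
  # Point log4j at a custom configuration file (the path is a placeholder)
  export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/path/to/custom-log4j.properties"
  bin/kafka-server-start.sh config/server.properties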

4. Environment variables

The kafka-server-start.sh script also supports other environment variables for additional customization:

  • KAFKA_HEAP_OPTS: Defines heap memory settings for the Kafka broker.
  • EXTRA_ARGS: Allows you to pass additional arguments during the startup process.
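
For example, to start a broker with a fixed heap size (the 4 GB value is an illustrative starting point, not a recommendation):

  # Pin the broker heap to 4 GB; size this to your workload
  export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"
  bin/kafka-server-start.sh config/server.properties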

5. Command-line options

The script accepts several command-line options for flexibility:

  • --override property=value: Overrides property values specified in the server.properties file.
  • -daemon: Runs the Kafka broker in the background as a daemon.
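
For example, the two options can be combined to run the broker in the background while overriding a property (the broker ID value is illustrative):

  # Start as a daemon and override broker.id without editing server.properties
  bin/kafka-server-start.sh -daemon config/server.properties --override broker.id=2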

By following these steps and properly configuring the startup options, you can successfully initialize a Kafka broker, ensuring that it integrates with the rest of the Kafka cluster.

Tips from the expert


Jack Walker

Senior Software Engineer

Jack Walker is a Senior Software Engineer specializing in open source and Apache Kafka.

In my experience, here are tips that can help you better understand and optimize Kafka brokers:

  1. Use rack-awareness for broker placement: Configure broker.rack settings to improve fault tolerance by distributing replicas across different racks or availability zones. This reduces the risk of losing all replicas due to a single rack or zone failure (a configuration sketch follows this list).
  2. Enable TLS encryption and authentication: Secure Kafka broker communication by enabling TLS encryption (ssl.enabled.protocols) and SASL authentication. This prevents unauthorized access and protects sensitive data from eavesdropping.
  3. Optimize JVM settings for brokers: Fine-tune JVM options like -Xms and -Xmx for heap size and garbage collection strategies (e.g., G1GC or ZGC) to improve broker performance and reduce latency during high throughput.
  4. Leverage tiered storage for archival: Use tiered storage solutions (e.g., Kafka Tiered Storage or cloud-based storage) for older data to reduce disk space usage on brokers. This is especially helpful for retaining large datasets cost-effectively.
  5. Regularly rebalance partitions: Use tools like Cruise Control or the kafka-reassign-partitions.sh script periodically to rebalance partitions and avoid hotspots caused by uneven partition distribution or broker failures.
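
As a minimal sketch of tip 1, rack awareness is a per-broker setting in server.properties; the broker IDs and zone names below are illustrative:

  # server.properties on each broker: tag the broker with its rack or zone
  broker.id=1
  broker.rack=us-east-1a   # brokers 2 and 3 would use us-east-1b / us-east-1c

With broker.rack set, Kafka spreads a partition's replicas across racks when assigning replicas during topic creation or reassignment.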

Best practices to optimize Kafka brokers

Here are some important practices to consider when working with Kafka brokers.

1. Optimize broker configuration

Configuring Kafka brokers correctly is critical to ensure stability and performance. Key areas to focus on include:

  • Log retention and cleanup policies: Kafka stores messages on disk based on retention policies. Adjust these settings to align with data retention and disk space needs. log.retention.hours defines how long Kafka retains data before deletion. Set this based on the data lifecycle requirements. log.retention.bytes specifies the maximum size of logs per partition. When the limit is reached, Kafka deletes older messages. log.segment.bytes controls the size of individual log segments. Smaller segments reduce memory usage during compaction but may increase overhead.
  • Replication settings: min.insync.replicas determines the minimum number of replicas that must acknowledge a write for it to be considered successful. Set this to at least 2 for high availability. unclean.leader.election.enable controls whether out-of-sync replicas can be elected as leaders, which risks data loss. Set this to false in production environments.
  • Network and I/O tuning: Brokers handle large volumes of network traffic and disk I/O, so optimize these configurations to prevent bottlenecks. socket.send.buffer.bytes and socket.receive.buffer.bytes adjust buffer sizes to match the network bandwidth; use performance testing to find the ideal values. Increase num.io.threads and num.network.threads based on the number of disks and network interfaces on the broker hardware. A combined example follows this list.
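
Putting these settings together, a server.properties fragment might look like the following; every value is an illustrative starting point to be tuned against your workload:

  # server.properties -- illustrative values, not recommendations
  log.retention.hours=168            # retain data for 7 days
  log.retention.bytes=1073741824     # cap each partition's log at 1 GiB
  log.segment.bytes=536870912        # roll log segments at 512 MiB
  min.insync.replicas=2              # with acks=all, writes need 2 in-sync replicas
  unclean.leader.election.enable=false
  num.network.threads=5
  num.io.threads=8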

2. Monitor broker health and resource usage

Monitoring ensures that the Kafka cluster operates efficiently and helps teams proactively address issues. Use tools like Prometheus, Grafana, or Kafka’s JMX metrics to gather insights into broker performance. Key metrics to track:

  • Broker throughput: Measure the rate of incoming and outgoing traffic (JMX metrics such as MessagesInPerSec, BytesInPerSec, and BytesOutPerSec). A drop in these metrics could indicate a bottleneck or consumer lag.
  • Disk and network utilization: Track per-partition log size to prevent brokers from running out of disk space; tools like du or df can spot-check partition directory sizes (see the example after this list). Monitor network traffic to ensure brokers are not saturating available bandwidth.
  • Consumer lag: Monitor consumer group offsets using the consumer's records-lag JMX metrics or tools like Kafka Lag Exporter. Persistent lag indicates issues with consumer performance or uneven partition distribution.
  • ZooKeeper health: Since Kafka relies on ZooKeeper, monitor ZooKeeper’s health (NumAliveConnections and SyncConnected) to ensure brokers can connect reliably.
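
For the disk checks mentioned above, standard OS tools can spot-check the broker's log directory; the path below is an assumption, so substitute the log.dirs value from your server.properties:

  # Free space on the volume holding Kafka's logs (path is an assumption)
  df -h /var/lib/kafka/data
  # Largest partition directories under the log directory
  du -sh /var/lib/kafka/data/* | sort -rh | head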

3. Tune performance settings

Performance tuning helps Kafka handle large-scale workloads efficiently.

  • Compression: Enable message compression at the producer level using compression.type. Options like snappy or lz4 reduce network traffic but require CPU resources for compression and decompression.
  • Batching: Producers can improve throughput by batching multiple records into a single request. Use batch.size to increase the batch size and allow producers to group more messages per request. Use linger.ms to introduce a small delay and allow more messages to accumulate in a batch before being sent (see the sketch after this list).
  • Partition distribution: Ensure partitions are evenly distributed across brokers. Uneven distribution can overload some brokers, leading to performance degradation. Use the kafka-reassign-partitions.sh tool to rebalance partitions manually if needed.
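
Compression and batching are producer-side settings; a minimal sketch with illustrative values:

  # producer.properties -- illustrative starting points
  compression.type=lz4    # trade CPU for smaller network payloads
  batch.size=65536        # bytes per batch (default is 16384)
  linger.ms=10            # wait up to 10 ms so batches can fill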

4. Implement fault tolerance

Fault tolerance ensures data reliability and system resilience during failures:

  • Replication factor: Set the replication factor for each topic to at least 3 in production environments. This ensures that at least two replicas remain available even if a broker fails.
  • Leader rebalancing: Use Kafka’s leader rebalancing tools to distribute partition leadership evenly across brokers. Uneven leadership can result in some brokers handling more load than others.
  • Automated recovery: Enable automated recovery mechanisms to minimize downtime. auto.leader.rebalance.enable automatically rebalances partition leadership when brokers join or leave the cluster. leader.imbalance.check.interval.seconds controls how often Kafka checks for leader imbalance. Lower values improve responsiveness but may increase overhead.
  • Quotas and limits: Configure quotas for producers and consumers to prevent a single client from overwhelming the cluster. Kafka enforces client quotas such as producer_byte_rate and consumer_byte_rate, which can be set dynamically with the kafka-configs.sh tool (example after this list).
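
A sketch of setting client quotas dynamically with kafka-configs.sh; the rates and client ID are illustrative, and exact flags vary by Kafka version:

  # Limit one client to ~10 MB/s produce and ~20 MB/s fetch
  bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
    --add-config 'producer_byte_rate=10485760,consumer_byte_rate=20971520' \
    --entity-type clients --entity-name my-client-id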

5. Scale brokers appropriately

Scaling the Kafka cluster ensures it can handle growing workloads while maintaining performance:

  • Add brokers to the cluster: When adding brokers, use Kafka's partition reassignment tool (kafka-reassign-partitions.sh) to redistribute partitions onto the new broker (see the sketch after this list). Reassignment runs online, but plan for extra network and disk load while data moves, and schedule it during low-traffic periods.
  • Horizontal scaling: Increase the number of partitions for topics to distribute the workload. While scaling, consider that too many partitions can increase overhead on brokers and consumers. Also keep consumer counts aligned with partition counts: a consumer group can run at most one active consumer per partition, so extra consumers sit idle.
  • Capacity planning: Regularly review metrics such as message throughput, disk usage, and partition count to anticipate future scaling needs. Plan expansions before the cluster reaches critical thresholds.
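
A sketch of the reassignment workflow referenced above; file names, broker IDs, and the broker address are illustrative, and older Kafka versions use --zookeeper instead of --bootstrap-server:

  # 1. Generate a candidate plan that moves the listed topics onto brokers 1-4
  bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
    --topics-to-move-json-file topics.json --broker-list "1,2,3,4" --generate
  # 2. Execute the plan saved from the --generate output
  bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
    --reassignment-json-file plan.json --execute
  # 3. Verify that the reassignment completed
  bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
    --reassignment-json-file plan.json --verify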

What is Instaclustr for Apache Kafka?

When managing high-performance streaming data, Apache Kafka is often the go-to solution for businesses worldwide. However, handling the intricacies of Kafka can be challenging, especially when it comes to maintaining optimal performance, ensuring scalability, and managing Kafka Brokers.

That’s where Instaclustr for Apache Kafka comes in: a fully managed service that simplifies deploying, running, and managing Kafka clusters while allowing organizations to focus on creating value from their data.

Understanding Kafka brokers and their role

At the heart of any Kafka architecture lies the Kafka Broker—a critical component responsible for storing, processing, and distributing messages to consumers. Brokers form the backbone of a Kafka cluster by ensuring messages are partitioned and replicated across the system, enabling fault tolerance, high availability, and scalability. For businesses, properly managing Kafka Brokers is key to achieving seamless data streaming that meets real-world demands.

How Instaclustr optimizes Kafka brokers

Instaclustr’s fully managed solution ensures Kafka Brokers are configured, monitored, and maintained to achieve peak performance. Here’s how Instaclustr optimizes Kafka brokers:

  • Seamless scaling: Instaclustr simplifies horizontal scaling, allowing the effortless addition of Kafka Brokers to clusters as business grows.
  • Proactive management: Instaclustr’s expert team handles monitoring and incident management, ensuring brokers consistently perform under heavy workloads.
  • Optimized performance: By fine-tuning key broker configurations and balancing loads across partitions, Instaclustr ensures optimal throughput with low latency.
  • High reliability: Built-in redundancy through replication ensures that data is protected, even if a Kafka Broker experiences downtime.
  • Security at scale: Instaclustr integrates advanced security features like encryption, authentication, and access control to safeguard brokers and data.

Why choose Instaclustr for Apache Kafka?

Instead of spending countless hours managing Kafka Brokers and troubleshooting cluster issues, Instaclustr enables organizations to put their energy into unlocking the value of real-time data. Apache Kafka on the Instaclustr Managed Platform takes the complexity out of running Kafka at scale and gives businesses the confidence to rely on their data infrastructure, no matter how fast it grows.

From automatic updates and 24×7 support to detailed insights into broker health and performance, Instaclustr empowers organizations to channel their resources into innovation, leaving the operational overhead to the experts.
