What Is Amazon Managed Streaming for Apache Kafka (AWS MSK)?
Amazon Managed Streaming for Apache Kafka (AWS MSK) is a fully managed service that simplifies the process of running Apache Kafka on AWS. Apache Kafka is a popular open source platform used for building real-time streaming data pipelines and applications.
AWS MSK helps with setting up, scaling, and managing Kafka clusters, allowing developers to focus on their application logic rather than infrastructure. With AWS MSK, you can create and configure Kafka clusters without needing to worry about the underlying hardware, software, or networking details. In addition, AWS handles tasks such as patching, backup, and monitoring, ensuring high availability and security.
Amazon MSK features
Amazon Managed Streaming for Apache Kafka includes the following features:
Amazon MSK Serverless
Amazon MSK Serverless allows you to run Apache Kafka without managing cluster capacity. With MSK Serverless, you do not need to provision or scale the underlying infrastructure, as it automatically adjusts to accommodate the throughput and storage required by the streaming workloads. This simplifies the management of Kafka clusters, particularly in scenarios with unpredictable or highly variable traffic.
MSK Serverless handles all the operational complexities, including resource provisioning, cluster scaling, and load balancing. It ensures your Kafka applications remain highly available and performant, with built-in fault tolerance and automatic failover.
Amazon MSK Connect
Amazon MSK Connect provides a managed service for integrating Apache Kafka with other data sources and sinks using Kafka Connect.
Kafka Connect is a framework for connecting Kafka with external systems such as databases, file systems, and cloud services. MSK Connect assists with the setup, management, and scaling of connectors, reducing the operational burden on users.
With MSK Connect, you can configure and deploy connectors from a library of pre-built connectors or custom-built ones. It offers built-in monitoring, logging, and error handling, ensuring reliable data integration workflows. MSK Connect also supports scaling connectors up or down based on the data throughput.
Amazon MSK Replicator
Amazon MSK Replicator enables seamless replication of Kafka topics across different Kafka clusters. This feature is useful for disaster recovery, data migration, and creating multi-region architectures. By replicating data between clusters, MSK Replicator ensures that your Kafka data is available and consistent across different environments.
MSK Replicator handles the replication process entirely within the managed service, so you do not have to deploy and operate tools such as MirrorMaker 2.0 yourself. Replication is asynchronous, so plan for a small replication lag when defining recovery point and recovery time objectives. The service includes features like offset translation, automatic topic creation, and monitoring.
Related content: Read our guide to Apache Kafka architecture
Amazon MSK pricing
MSK provisioned pricing
Amazon MSK charges an hourly rate for each Kafka broker instance. The cost varies depending on the instance size and the number of active brokers in the cluster. Additionally, you are billed for the storage provisioned in your cluster.
Storage is billed in GB-months: the GB provisioned in each hour of the month is summed and divided by the number of hours in that month. If you provision additional storage throughput, it is billed the same way in MB/s-months: the MB/s provisioned per broker in each hour, summed and divided by the hours in the month.
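As a sketch of that arithmetic (the broker count, storage size, and $0.10/GB-month rate below are illustrative assumptions, not published prices):

```shell
# Hypothetical cluster: 3 brokers with 1000 GB provisioned each,
# kept for a full 730-hour month at an assumed $0.10 per GB-month.
BROKERS=3
GB_PER_BROKER=1000
HOURS_IN_MONTH=730
# Sum the GB provisioned across every hour of the month...
GB_HOURS=$((BROKERS * GB_PER_BROKER * HOURS_IN_MONTH))
# ...then divide by the hours in the month to get GB-months.
GB_MONTHS=$((GB_HOURS / HOURS_IN_MONTH))
echo "Billed storage: ${GB_MONTHS} GB-months"
# Decimal cost via awk (3000 GB-months * 0.10):
awk -v gm="$GB_MONTHS" 'BEGIN { printf "Cost: $%.2f\n", gm * 0.10 }'
```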
Data transfer within the cluster (between brokers or metadata nodes and brokers) incurs no additional cost. However, standard AWS data transfer charges apply for data moved in and out of Amazon MSK clusters. If you need private connectivity across VPCs, powered by AWS PrivateLink, you pay an hourly rate for each cluster and authentication scheme enabled, plus a fee per GB of data processed.
Instance Type | vCPU | Memory (GiB) | Price Per Hour |
kafka.t3.small | 2 | 2 | $0.0456 |
kafka.m5.large | 2 | 8 | $0.21 |
kafka.m7g.large | 2 | 8 | $0.204 |
kafka.m5.xlarge | 4 | 16 | $0.42 |
kafka.m7g.xlarge | 4 | 16 | $0.408 |
MSK Serverless pricing
MSK Serverless pricing is based on cluster hours, partition hours, data written by producers, data read by consumers, and storage consumed.
Pricing Dimension | Unit | Price per unit |
Cluster-hours | per hour | $0.75 |
Partition-hours | per hour | $0.0015 |
Storage | per GiB-month | $0.10 |
Data in | per GiB | $0.10 |
Data out | per GiB | $0.05 |
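To see how these dimensions combine, here is an estimate for a hypothetical workload; the cluster, partition, storage, and traffic figures are illustrative assumptions, priced with the rates from the table above:

```shell
# Hypothetical serverless workload over a 730-hour month:
# 1 cluster, 50 partitions, 50 GiB-months of storage,
# 100 GiB written by producers and 100 GiB read by consumers.
TOTAL=$(awk 'BEGIN {
  cluster   = 730 * 0.75           # cluster-hours
  partition = 730 * 50 * 0.0015    # partition-hours
  storage   = 50  * 0.10           # GiB-months stored
  data_in   = 100 * 0.10           # GiB written
  data_out  = 100 * 0.05           # GiB read
  printf "%.2f", cluster + partition + storage + data_in + data_out
}')
echo "Estimated monthly bill: \$${TOTAL}"
```

Note that the fixed cluster-hours charge dominates small workloads, which is why serverless pricing pays off mainly for spiky or unpredictable traffic.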
MSK Connect pricing
MSK Connect is priced by the number and size of workers, measured in MSK Connect Units (MCUs). Usage is metered in MCU-hours and billed per second, at $0.11 per MCU-hour.
For example, a connector streaming data to an S3 bucket in US East (N. Virginia), autoscaling between two and four workers of 1 MCU each, might accumulate 1,984 MCU-hours over a month, resulting in the following cost: 1,984 MCU-hours * $0.11 = $218.24.
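A quick sketch of the example's arithmetic (the 744-hour month and the implied average MCU count are my inference from the 1,984 figure, not stated by AWS):

```shell
# 1,984 MCU-hours at $0.11 per MCU-hour; 1,984 / 744 hours in the
# month works out to an average of roughly 2.67 MCUs running.
COST=$(awk 'BEGIN { printf "%.2f", 1984 * 0.11 }')
AVG_MCU=$(awk 'BEGIN { printf "%.2f", 1984 / 744 }')
echo "MSK Connect cost: \$${COST} (avg ${AVG_MCU} MCUs)"
```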
MSK Replicator pricing
MSK Replicator charges an hourly rate per replicator and a per-GB data replication fee.
Replicator pricing in the US East (N. Virginia) Region:
Pricing Dimension | Unit | Price per Unit |
Replicator-hours | per hour | $0.30 |
Data-replicated | per GiB | $0.08 |
Tutorial: Getting started using Amazon MSK
Here’s a step-by-step guide to setting up and using a cluster in Amazon MSK. The tutorial is adapted from the official documentation.
Creating an Amazon MSK cluster
To create a cluster via the AWS Management Console:
- Sign in and open the AWS MSK Console.
- Select Create cluster.
- Choose the cluster settings. By default, Quick create is selected for the Creation method setting.
- Enter a descriptive name for the cluster, such as MSKExampleCluster.
- Under General cluster properties, select Provisioned as the Cluster type.
- Save the values of the following settings for later use: VPC, Subnets, and Security groups associated with VPC.
- Click Create cluster.
- Monitor the cluster status on the Cluster summary page. The status should change from Creating to Active. Once the status is Active, the cluster is ready for connection.
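The console flow above can also be scripted with the AWS CLI. The sketch below only prints the command it would run; the subnet IDs, security group, instance type, and Kafka version are placeholders you would replace with the values saved from your own VPC:

```shell
# Write the broker node group definition the CLI expects.
# Subnet and security group IDs are placeholders for your VPC values.
cat > brokernodegroupinfo.json <<'EOF'
{
  "InstanceType": "kafka.m5.large",
  "ClientSubnets": ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
  "SecurityGroups": ["sg-dddd4444"]
}
EOF

# Print (rather than execute) the create-cluster call for review.
echo aws kafka create-cluster \
  --cluster-name MSKExampleCluster \
  --kafka-version "3.5.1" \
  --number-of-broker-nodes 3 \
  --broker-node-group-info file://brokernodegroupinfo.json
```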
Creating an IAM role
This step involves creating an IAM policy for topic creation and data sending, followed by creating an IAM role associated with this policy.
To create an IAM policy:
- Open the AWS IAM Console.
- Navigate to Policies and click on Create Policy.
- In the JSON tab, replace the editor window content with the JSON code provided here.
- Select Next: Tags, then Next: Review.
- Name the policy (e.g., msk-example-policy) and click on Create policy.
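The policy JSON itself is linked rather than reproduced in this article. As an illustration only, a minimal policy for connecting, creating topics, and reading/writing data with IAM auth might look like the sketch below; prefer the exact policy from the official documentation, and scope `Resource` to your cluster, topic, and group ARNs rather than `*`:

```shell
# Write an illustrative IAM policy; the wildcard Resource is a
# placeholder and should be narrowed to your cluster's ARNs.
cat > msk-example-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:Connect",
        "kafka-cluster:CreateTopic",
        "kafka-cluster:DescribeTopic",
        "kafka-cluster:WriteData",
        "kafka-cluster:ReadData",
        "kafka-cluster:AlterGroup",
        "kafka-cluster:DescribeGroup"
      ],
      "Resource": "*"
    }
  ]
}
EOF
grep -q '"Version": "2012-10-17"' msk-example-policy.json && echo "policy written"
```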
To create an IAM role and attach the policy:
- Navigate to Roles and select Create role.
- Select EC2 under Common use cases, then choose Next: Permissions.
- Search for the created policy, select it, and proceed with Next: Tags and Next: Review.
- Name the role (e.g., msk-example-role) and click on Create role.
Creating a client machine
To create a client machine:
- Open the AWS EC2 Console.
- Select Launch instances.
- Name the client machine (e.g., MSKExampleClient).
- Keep the default AMI type: Amazon Linux 2 AMI (HVM) – Kernel 5.10, SSD Volume Type.
- Choose t2.micro as the instance type (this should also be selected by default).
- For Key pair (login), either create a new key pair or use an existing one.
- In the Advanced details box, select the relevant IAM role (msk-example-role).
- Click on Launch instance and then View instances.
- Copy the security group ID of your new instance and save it.
- Open the AWS VPC Console.
- In the Security Groups section, find and edit the inbound rules of the cluster’s security group to allow traffic from the client machine’s security group.
Creating a topic
To create a topic:
- Connect to the client machine via the Amazon EC2 console.
- Install Java:
sudo yum -y install java-11
- Download Apache Kafka, specifying your MSK cluster’s version:
wget https://archive.apache.org/dist/kafka/{MSK VERSION}/kafka_2.13-{MSK VERSION}.tgz
- Extract the downloaded file:
tar -xzf kafka_2.13-{MSK VERSION}.tgz
- Download the Amazon MSK IAM JAR file into the libs directory:
wget https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.1/aws-msk-iam-auth-1.1.1-all.jar
- Create a client.properties file in the bin directory with the following content:
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
- Obtain a broker endpoint from the Amazon MSK console.
- Create a topic using the following command:
bin/kafka-topics.sh --create --bootstrap-server BootstrapServerString --command-config client.properties --replication-factor 3 --partitions 1 --topic MSKExampleTopic
Producing and consuming data
To produce and consume messages with MSK:
- Start a console producer:
bin/kafka-console-producer.sh --bootstrap-server BootstrapServerString --producer.config client.properties --topic MSKExampleTopic
- Enter messages and press Enter. Each line is sent to the Kafka cluster as a separate message.
- Open a second connection to the client machine.
- Start a console consumer:
bin/kafka-console-consumer.sh --bootstrap-server BootstrapServerString --consumer.config client.properties --topic MSKExampleTopic --from-beginning
- See the consumer window display the messages entered in the producer window.
Using CloudWatch to view MSK metrics
To view Amazon MSK metrics in Amazon CloudWatch:
- Go to the CloudWatch Console.
- Navigate to Metrics, then choose AWS/Kafka.
- Select Broker ID, Cluster Name to view broker-level metrics, or Cluster Name to view cluster-level metrics.
- Optionally, create a CloudWatch alarm based on specific statistics and time periods.
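The same broker-level metrics can be pulled from the CLI. The sketch below assembles (and prints, rather than runs) a get-metric-statistics call; the cluster name and the choice of the CpuUser metric are illustrative:

```shell
# Build a CloudWatch query for average broker CPU over the last hour.
# MSK publishes broker-level metrics under the AWS/Kafka namespace.
CLUSTER="MSKExampleCluster"
CMD="aws cloudwatch get-metric-statistics \
  --namespace AWS/Kafka \
  --metric-name CpuUser \
  --dimensions Name=\"Cluster Name\",Value=${CLUSTER} Name=\"Broker ID\",Value=1 \
  --start-time \$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time \$(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 --statistics Average"
echo "$CMD"
```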
Amazon MSK best practices
Here are some useful practices for working with MSK.
Right-size your cluster
When deploying Amazon MSK, it’s crucial to correctly size the cluster based on workload requirements. Over-provisioning leads to unnecessary costs, while under-provisioning can result in performance bottlenecks. Start by assessing the expected throughput, storage needs, and number of partitions.
Use AWS’s auto-scaling features to adjust the cluster size based on real-time data, ensuring optimal performance and cost efficiency. To determine the right size, analyze the data ingestion rates, the number of consumers, and the volume of data being processed.
It’s also important to consider the future growth of your data streams and the scalability of the architecture. Regularly review and adjust the cluster configuration as the application scales or as usage patterns change. Conduct performance testing under different load conditions to fine-tune the cluster settings and make informed decisions about resource allocation.
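A back-of-the-envelope partition estimate can make this concrete. The per-partition throughput and headroom factor below are illustrative assumptions; measure your own brokers under load before relying on numbers like these:

```shell
# Target 30 MB/s of ingress; assume each partition sustains ~5 MB/s
# on the chosen instance type, and keep 2x headroom for growth and
# failover. Ceiling division gives the partition count needed.
INGRESS_MBS=30
PER_PARTITION_MBS=5
HEADROOM=2
PARTITIONS=$(( (INGRESS_MBS * HEADROOM + PER_PARTITION_MBS - 1) / PER_PARTITION_MBS ))
echo "Suggested partition count: ${PARTITIONS}"
```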
Use the latest Kafka AdminClient
Using the latest version of the Kafka AdminClient helps prevent issues such as topic ID mismatches, which can occur due to inconsistencies between the Kafka client and broker versions. Keeping the AdminClient updated ensures compatibility with the latest Kafka features and bug fixes, leading to more reliable topic management.
Regularly check for updates and review release notes to incorporate new features and improvements that enhance cluster stability and performance. This practice helps avoid technical glitches and leverages the latest improvements and optimizations introduced by the Kafka community.
Build highly available clusters
To achieve high availability, configure the MSK clusters across multiple Availability Zones (AZs). This setup provides redundancy, ensuring that Kafka services remain operational even if one AZ experiences an outage. Implement replication across these zones to protect your data, and use fault-tolerant configurations to automatically recover from hardware failures.
By distributing Kafka brokers across different AZs, you minimize the risk of service disruption. Additionally, ensure that your disaster recovery plans include strategies for quickly restoring service in the event of significant outages. Implementing automated failover mechanisms and regular testing of these processes can further increase the cluster’s resilience.
Monitor broker CPU usage
Monitoring the CPU usage of Kafka brokers is essential for maintaining performance and preventing resource contention. High CPU usage can indicate that the brokers are overloaded, which might lead to increased latency and message processing delays. Use AWS CloudWatch to track CPU metrics and set up alarms to notify you of any anomalies.
In addition to setting up alarms, perform regular audits of CPU usage patterns to identify trends and potential bottlenecks. If you consistently observe high CPU utilization, consider upgrading your instance types or adding more brokers to the cluster. Optimizing the Kafka configuration, such as adjusting the number of threads or batch sizes, can also help balance the load.
Adjust the data retention parameters
Properly configuring data retention parameters aids in balancing storage costs and data availability. Amazon MSK allows you to set retention policies based on your application’s needs. Define retention periods for topics to manage how long data is kept before being deleted. This helps in controlling storage utilization and ensures that only relevant data is retained.
Reviewing the data retention settings periodically is crucial as data retention needs can change over time. For example, compliance requirements may dictate longer retention periods, while cost management may push for shorter retention times. Evaluate the impact of different retention settings on your storage costs and adjust accordingly.
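Retention is set per topic with the standard Kafka tooling that ships in the tarball downloaded earlier. The sketch below computes a 3-day retention in milliseconds and prints (rather than runs) the kafka-configs.sh call; the topic name and bootstrap string follow the tutorial above and are placeholders:

```shell
# 3 days expressed in milliseconds, the unit retention.ms expects.
RETENTION_MS=$((3 * 24 * 60 * 60 * 1000))
echo "retention.ms=${RETENTION_MS}"
# Print the alter command; run it from the Kafka installation directory.
echo bin/kafka-configs.sh --alter \
  --bootstrap-server BootstrapServerString \
  --command-config client.properties \
  --entity-type topics --entity-name MSKExampleTopic \
  --add-config retention.ms=${RETENTION_MS}
```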
Monitor Apache Kafka memory
Insufficient memory can cause increased garbage collection pauses, leading to higher latencies. Use AWS CloudWatch to monitor memory usage metrics and set thresholds to detect potential issues early. Ensure that the Kafka brokers have adequate heap size allocated, and periodically review garbage collection logs to identify and address any memory-related performance bottlenecks.
Understanding the memory usage patterns of Kafka brokers is essential for optimizing performance. Implementing best practices for JVM tuning and garbage collection configuration can reduce latency and improve throughput.
Regularly analyze memory usage trends and make adjustments to the heap size and other memory settings to ensure the Kafka brokers operate efficiently. Additionally, consider using tools like JConsole or VisualVM for more detailed memory analysis and troubleshooting.
NetApp Instaclustr managed Apache Kafka vs AWS MSK: A comprehensive comparison
In the managed Kafka space, NetApp Instaclustr and AWS MSK (Managed Streaming for Apache Kafka) have emerged as two prominent players. Both platforms offer robust solutions for managing Kafka clusters, but they differ in several key aspects. Instaclustr has several key advantages over AWS MSK, including:
- Instaclustr covers the entire Kafka ecosystem. It offers a fully managed service for Apache Kafka, along with complementary services such as Apache ZooKeeper, Kafka Connect, and Kafka Streams. This holistic approach simplifies the management and deployment of Kafka clusters, allowing businesses to focus on their core applications rather than the underlying infrastructure.
- Unlike AWS MSK–which is limited to the Amazon Web Services (AWS) ecosystem– Instaclustr supports multi-cloud deployments. It enables businesses to deploy Kafka clusters on popular cloud providers like AWS, Google Cloud Platform (GCP), and Microsoft Azure, providing flexibility and avoiding vendor lock-in. This multi-cloud support empowers organizations to choose the cloud provider that best suits their requirements while maintaining consistent Kafka management across different environments.
- Robust security features such as encryption at rest and in transit, fine-grained access control, and integration with external authentication providers like LDAP and Active Directory are readily available on Instaclustr. Additionally, Instaclustr helps businesses meet regulatory requirements by providing compliance certifications such as SOC 2 Type II and ISO 27001. These security measures ensure that sensitive data processed through Kafka clusters remains protected and meets industry standards.
- Instaclustr’s managed Kafka service is designed for high availability and disaster recovery. It leverages multi-region deployments and automatic failover mechanisms to ensure uninterrupted Kafka operations even in the event of infrastructure failures. Instaclustr’s platform also offers robust backup and restore capabilities, allowing businesses to recover data quickly and efficiently. These features provide peace of mind to organizations that rely on Kafka for critical data processing and streaming applications.
- Instaclustr differentiates itself by providing exceptional support and professional services. Their team of Kafka experts offers 24/7 support, ensuring prompt assistance in case of any issues or questions. Instaclustr also provides professional services for architecture design, performance optimization, and migration assistance, enabling businesses to leverage their expertise and experience to achieve optimal Kafka deployments.
- Instaclustr’s pricing model is transparent and cost-effective. It offers predictable pricing based on the resources consumed, without any hidden charges or complex pricing tiers. This allows businesses to plan their budgets effectively and avoid unexpected cost escalations. Additionally, Instaclustr’s multi-cloud support enables organizations to leverage competitive pricing from different cloud providers, further optimizing costs.
While AWS MSK is a popular choice for managed Kafka services, Instaclustr offers several distinct advantages that make it a compelling alternative. With its comprehensive managed Kafka solution, multi-cloud support, enhanced security and compliance measures, high availability and disaster recovery capabilities, expert support, and cost-effective pricing model, Instaclustr is a reliable and scalable platform for businesses seeking a robust Kafka service.
By carefully evaluating their specific requirements and considering these advantages, organizations can make an informed decision when choosing between Instaclustr and AWS MSK.
Learn more about NetApp Instaclustr for Apache Kafka and discover how our fully managed service for Apache Kafka® can enable you to stop worrying about your data infrastructure and instead focus on innovating throughout the rest of your application stack.