- Dev Rel
- Technical
Machine Learning Over Streaming Kafka® Data—Part 2: Introduction to Batch Training and TensorFlow
As I mentioned in Part 1 of this series, we are looking at Machine Learning (ML) over streaming Apache Kafka® data. But rather than jumping straight in and going over a fast-flowing waterfall (in a barrel, which people have actually attempted!), I first need a good understanding of TensorFlow using some "still" (static and unchanging) data and batch learning. This will be easier and more repeatable before we encounter Kafka's streaming, ever-changing data.
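To make the "still data" idea concrete, here is a minimal sketch of batch training with tf.keras on a fixed dataset. The synthetic data and tiny model are illustrative placeholders, not the ones used in the series.

```python
# Minimal batch-training sketch on "still" (static) data with tf.keras.
# The synthetic dataset and tiny model are illustrative placeholders.
import numpy as np
import tensorflow as tf

# Fixed, unchanging training data: 1000 samples, 4 features, binary labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Classic batch learning: the whole dataset is revisited each epoch,
# in mini-batches of 32.
model.fit(X, y, batch_size=32, epochs=5, validation_split=0.2)
```

Because the dataset never changes, rerunning fit() is repeatable, which is exactly the property that disappears once the data starts streaming.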
Paul Brebner, July 26, 2023
- Dev Rel
- Technical
Machine Learning Over Streaming Apache Kafka® Data Part 1: Introduction
1. Introduction. Viewing online cat media is linked with procrastination (see Emotion regulation, procrastination, and watching cat videos online: Who watches Internet cats, why, and to what effect?); this blog almost ended here! Recently I came across two use cases of real-time Kafka Machine Learning (ML). Have you ever wondered why TikTok is so...
Paul Brebner, July 12, 2023
- Dev Rel
- Popular
Improving Apache Kafka® Performance and Scalability With the Parallel Consumer: Part 2
In the second part of Improving Apache Kafka® Performance and Scalability With the Parallel Consumer, we continue our investigations with a trace of a "slow consumer" example, how to achieve 1 million TPS in theory, some experimental results, what else we know about the Kafka Parallel Consumer, and finally, whether you should use it in production.
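The Parallel Consumer itself is a Java library, but the core idea it exploits is language-neutral: records sharing a key stay in order, while different keys are processed concurrently. Here is a toy Python sketch of that idea (not the library's actual implementation):

```python
# Toy illustration of key-ordered parallel processing (not the Confluent
# Parallel Consumer library, which is Java). Records sharing a key are
# routed to the same worker queue, so they are processed in order, while
# different keys proceed in parallel across workers.
import queue
import threading

NUM_WORKERS = 4
queues = [queue.Queue() for _ in range(NUM_WORKERS)]

def worker(q: queue.Queue) -> None:
    while True:
        record = q.get()
        if record is None:  # shutdown sentinel
            break
        key, value = record
        print(f"processed key={key} value={value}")

threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
for t in threads:
    t.start()

# Same key -> same queue -> serial processing; different keys fan out.
# (A real implementation would use a stable hash, not Python's salted one.)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    queues[hash(key) % NUM_WORKERS].put((key, value))

for q in queues:
    q.put(None)  # stop workers
for t in threads:
    t.join()
```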
Paul Brebner, May 04, 2023
- Dev Rel
- Popular
Improving Apache Kafka® Performance and Scalability With the Parallel Consumer: Part 1
Apache Kafka® is a high-throughput, low-latency distributed streaming platform. It enables messages to be sent from multiple distributed producers, via the distributed Kafka cluster and its topics, to multiple distributed consumers. Here's a photo I took in Berlin of a very old machine that has a similar architecture; I'll reveal what it does later.
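For readers who want the moving parts in code, here is a minimal producer-and-consumer round trip. It uses the confluent-kafka Python client as a compact stand-in for the Java clients used in the series; the broker address and topic name are placeholders.

```python
# Minimal producer/consumer round trip with the confluent-kafka client.
# Broker address and topic name are placeholders for your own cluster.
from confluent_kafka import Producer, Consumer

conf = {"bootstrap.servers": "localhost:9092"}

# Producer: send a keyed message to a topic on the cluster.
producer = Producer(conf)
producer.produce("demo-topic", key="k1", value="hello kafka")
producer.flush()

# Consumer: one member of a (potentially distributed) consumer group.
consumer = Consumer({**conf, "group.id": "demo-group",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["demo-topic"])
msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    print(msg.key(), msg.value())
consumer.close()
```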
Paul Brebner, April 20, 2023
- Dev Rel
- Technical
Exploring Karapace—the Open Source Schema Registry for Apache Kafka®: Part 6—Forward, Transitive, and Full Schema Compatibility
This is Part 6 of Exploring Karapace: how does a schema registry for Apache Kafka® allow backward, forward, and transitive compatibility?
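Karapace implements the familiar schema registry HTTP API, so a compatibility question like this can be put to the registry directly. A minimal sketch, assuming a registry at localhost:8081 and a subject named demo-topic-value (both placeholders):

```python
# Ask the registry whether a candidate schema is compatible with the
# latest registered version, via the schema registry HTTP API that
# Karapace implements. URL and subject name are assumptions.
import json
import requests

registry = "http://localhost:8081"
subject = "demo-topic-value"

candidate = {
    "type": "record", "name": "Demo",
    "fields": [{"name": "id", "type": "string"}],
}

resp = requests.post(
    f"{registry}/compatibility/subjects/{subject}/versions/latest",
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    json={"schema": json.dumps(candidate)},
)
print(resp.json())  # e.g. {"is_compatible": true}
```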
Paul Brebner, March 23, 2023
- Dev Rel
- Technical
Exploring Karapace—the Open Source Schema Registry for Apache Kafka®: Part 5—Schema Evolution and Backward Compatibility
So what happens when the unchangeable forms (schemas) meet the inevitability of change? Let’s dip our toes in the water and find out.
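As a preview of what "schemas meeting change" looks like in practice, here is a sketch of a backward-compatible evolution: version 2 adds a field with a default, so a v2 reader can still decode v1 data. It uses fastavro purely for illustration; the schemas are made up.

```python
# Backward-compatible schema evolution: v2 adds a field with a default,
# so a v2 reader can still decode records written with v1.
import io
from fastavro import schemaless_writer, schemaless_reader

v1 = {"type": "record", "name": "User",
      "fields": [{"name": "name", "type": "string"}]}

v2 = {"type": "record", "name": "User",
      "fields": [{"name": "name", "type": "string"},
                 {"name": "age", "type": "int", "default": -1}]}

# Write with the old schema...
buf = io.BytesIO()
schemaless_writer(buf, v1, {"name": "Paul"})
buf.seek(0)

# ...read with the new one: the missing field takes its default.
record = schemaless_reader(buf, v1, v2)
print(record)  # {'name': 'Paul', 'age': -1}
```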
Paul Brebner, March 10, 2023
- Dev Rel
- Technical
Exploring Karapace—the Open Source Schema Registry for Apache Kafka®: Part 4—Auto Register Schemas
In the previous blog, we demonstrated that sending messages via Avro and Karapace from Kafka producers to consumers works seamlessly, although what exactly is going on under Karapace's exoskeleton is perhaps a bit opaque: the communication between producers, consumers, and Karapace isn't visible at this level of the code, and the way that the record value data is actually serialized and deserialized also isn't obvious. But it just works so far, which is a good start. Let's see what happens if we now try to introduce some exception conditions, as this may help us understand Karapace's auto register schemas settings.
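To see what one such exception condition might look like, here is a sketch using the confluent-kafka Python client's AvroSerializer with auto schema registration switched off; the registry URL, topic, and schema are placeholders.

```python
# With auto schema registration turned off, serializing against an
# unregistered subject fails instead of silently registering the schema.
# Registry URL, topic, and schema here are placeholders.
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

schema_str = """
{"type": "record", "name": "Demo",
 "fields": [{"name": "id", "type": "string"}]}
"""

client = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(client, schema_str,
                            conf={"auto.register.schemas": False})

ctx = SerializationContext("demo-topic", MessageField.VALUE)
try:
    serializer({"id": "42"}, ctx)
except Exception as e:
    # If "demo-topic-value" has no registered schema, this surfaces a
    # registry error rather than auto-registering the schema.
    print("serialization failed:", e)
```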
Paul Brebner, February 22, 2023
- Dev Rel
- Technical
Exploring Karapace—the Open Source Schema Registry for Apache Kafka®: Part 3—Introduction, Kafka Avro Java Producer and Consumer Example
1. Introducing Karapace. As we saw in Part 1 and Part 2 of this blog series, if you want to use a schema-based data serialization/deserialization approach such as Apache Avro, both the sender and receiver of the data need to have access to the schema that was used to serialize the data. This could work...
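The point that both sides need the schema is easy to demonstrate: Avro's binary encoding carries no field names, so the raw bytes are meaningless without the schema in hand. A small sketch using fastavro (the schema is illustrative):

```python
# Avro's compact binary encoding omits field names, so the writer
# serializes with the schema and the reader cannot decode without
# (a copy of) it -- which is what motivates a schema registry.
import io
from fastavro import schemaless_writer, schemaless_reader

schema = {"type": "record", "name": "Greeting",
          "fields": [{"name": "text", "type": "string"}]}

# Sender serializes with the schema...
buf = io.BytesIO()
schemaless_writer(buf, schema, {"text": "hello"})

# ...and the receiver needs the same schema to decode the bytes.
buf.seek(0)
print(schemaless_reader(buf, schema))  # {'text': 'hello'}
```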
Paul Brebner, February 09, 2023