- Dev Rel
- Technical
Why Is Apache Kafka® Tiered Storage More Like a Dam Than a Fountain? Part 2
In Part 1 of this blog, we introduced Apache Kafka® Tiered Storage and had an initial look at how it works in practice on a public preview version of Instaclustr’s managed Apache Kafka service. In this part, we will have a closer look at Performance, dam the river, and conclude. Kafka Tiered Storage Performance Mistaya...
Paul Brebner, September 18, 2024
- Dev Rel
- Technical
Why is Apache Kafka® Tiered Storage More Like a Dam than a Fountain? Part 1
Instaclustr recently announced the availability of a public preview of Apache Kafka® tiered storage. This is a long-awaited “feature” (more a fundamental architectural shift really), proposed in KIP-405 in 2019, which finally appeared in Kafka version 3.6.0 late last year (KAFKA-7739). Tiered storage is still “early access” so is not currently recommended for use in...
Paul Brebner, August 27, 2024
- Dev Rel
Get Your Apache Camel™ Kafka Connectors in a Row
A long camel train on the beach in Broome, Western Australia (Source: Adobe Stock by scottimage) Recently I wrote a blog called “Running the Apache Camel™ HTTP Kafka Source Connector on Instaclustr Managed Apache Kafka®” (check it out here). There are a lot of Apache Camel Kafka Connectors available, and I wondered how to “get your Camels...
Paul Brebner, May 28, 2024
- Dev Rel
- Technical
Running the Apache Camel™ HTTP Kafka Source Connector on Instaclustr Managed Apache Kafka®
Camels drinking from a leaking pipe (Source: Wikimedia) A few years ago I built a “zero-code” data processing pipeline with Apache Kafka® and related open source technologies like OpenSearch®, PostgreSQL®, and Apache Superset™. Here’s the first blog in that series (there were 10 in all; the links to each can be found at the bottom...
Paul Brebner, April 08, 2024
- Dev Rel
Tracing Apache Kafka® With OpenTelemetry – “Leave Lots of Traces”
A radio-controlled saltwater crocodile? No, a croc with telemetry – sensors + radio communication! (Source: Wikimedia) 1. Introduction It’s been 5 years since I looked at observability and distributed tracing for Apache Kafka® with open source OpenTracing and Jaeger in this blog. So, I decided it was worth taking another look at tracing Apache Kafka, particularly...
Paul Brebner, March 15, 2024
- Dev Rel
- Technical
An Apache Kafka® and RisingWave Stream Processing Christmas Special
Something has gone amiss in Santa’s workshop! The “toy” Elves and the “box” Elves have stopped talking to each other and have set up 2 separate conveyor belts. The packing Elves therefore need some help to match toys and boxes. In this Christmas Special blog, let's give Santa a hand and build a Kafka Elf Packing Assistance Machine. We'll use Apache Kafka combined with Kafka Streams or a new open source stream processing technology, RisingWave.
Paul Brebner, December 14, 2023
- Technical
Instaclustr Cluster Performance Insights #1: Cluster Size Distribution and Zipf’s Law: Why Open Source Computer Clusters Are Like Galaxies
Image from NASA’s James Webb Space Telescope showing thousands of galaxy clusters (Source: https://webbtelescope.org/contents/media/images/2022/038/01G7JGTH21B5GN9VCYAHBXKSD1?news=true, public domain) Recently I decided to take a closer look at the cloud computing clusters that Instaclustr uses to provide managed open source Big Data services to 100s of customers, to see if I could discover any interesting resource and performance/scalability insights....
Paul Brebner, December 12, 2023
- Dev Rel
Apache Kafka® Anti-Patterns and How To Avoid Them
USS Enterprise NCC-1701 Warp Drive (Source: Bryan Alexander, CC BY 2.0, via Wikimedia Commons) As every “Trekky” (a fan of Star Trek) knows, in the Star Trek universe spaceships can travel at speeds faster than light using warp engines fuelled by antimatter. It turns out that antimatter is real enough in our universe, but very...
Paul Brebner, November 08, 2023
- Dev Rel
- Technical
Machine Learning Over Streaming Kafka Data—Part 6: Incremental TensorFlow Training With Kafka Data and Concept Drift
In the “Machine Learning over Streaming Kafka Data” blog series we’ve been learning all about Kafka Machine Learning—incrementally! In the previous part, we connected TensorFlow to Kafka and explored how incremental learning works in practice with moving data. In this part, we introduce concept drift, try and reduce noise, and remove time! 1. Concept Drift ...
Paul Brebner, October 17, 2023