Exploring Karapace—the Open Source Schema Registry for Apache Kafka®: Part 4—Auto Register Schemas

In the previous blog, we demonstrated that sending messages from Kafka producers to consumers via Avro and Karapace works seamlessly. However, what exactly is going on under Karapace’s exoskeleton is perhaps a bit opaque: the communication between the producers, consumers, and Karapace isn’t visible at this level of the code, and the way that the record value data is actually serialized and deserialized also isn’t obvious. Still, it just works so far, which is a good start. Let’s see what happens if we now try to introduce some exception conditions, as this may help us understand the Kafka “Crab’s” auto settings.

Why did the Crab cross the road? To get to his car—an “auto” Crab!

(Source: Shutterstock)

Experiment 1: Auto Register Schemas False, No Schema Registered

When using Kafka in conjunction with the Karapace Schema Registry, Schemas are automatically registered by default. However, this is a configuration option in the producers (the auto.register.schemas serializer setting), not in Karapace itself. What happens if we try the producer from the previous blog with a new topic, avrotest2, and we now set Auto Register Schemas to false?
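For reference, here’s a minimal sketch of the producer configuration this experiment changes, assuming the Confluent-compatible Avro serializer is pointed at Karapace; the broker address, Karapace URL, and class name are placeholders rather than the exact code from the previous blog:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

// Sketch of the producer properties for Experiment 1.
// Assumptions: a Confluent-compatible Avro serializer talking to Karapace;
// host names and ports are placeholders.
public class AutoRegisterFalseConfig {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");        // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://karapace:8081");                  // placeholder Karapace URL
        props.put("auto.register.schemas", false);                                 // the setting under test
        return props;
    }
}
```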

This time, the producer throws an exception:

Caused by:

What’s going on? When configured to use Karapace, the producer cannot send a record to a topic that doesn’t yet have a Schema registered for it in Karapace. And because Auto Register Schemas is false, the producer is unable to create one automatically from the schema it supplies, so the send fails.
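With Auto Register Schemas turned off, the Schema has to get into Karapace some other way before the producer can send. One option (a hedged sketch, not the code from this series) is to register it explicitly via Karapace’s Confluent-compatible REST API by POSTing to /subjects/&lt;subject&gt;/versions. Here the subject name avrotest2-value follows the default topic-name strategy, the Karapace address is a placeholder, and the schema shown is a stand-in rather than the real PlatonicSolid schema from the earlier parts:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Registers an Avro schema for a subject via Karapace's Confluent-compatible REST API.
// Assumptions: Karapace at http://karapace:8081 (placeholder) and the default
// "<topic>-value" subject naming strategy.
public class RegisterSchemaExample {
    public static void main(String[] args) throws Exception {
        // The Avro schema is passed as an escaped JSON string in the "schema" field.
        // Placeholder schema: a record with one string field, not the real PlatonicSolid schema.
        String body = "{\"schema\": \"{\\\"type\\\": \\\"record\\\", \\\"name\\\": \\\"PlatonicSolid\\\", "
                + "\\\"fields\\\": [{\\\"name\\\": \\\"name\\\", \\\"type\\\": \\\"string\\\"}]}\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://karapace:8081/subjects/avrotest2-value/versions"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // On success the registry returns the new (or existing) schema ID, e.g. {"id": 1}
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```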

Experiment 2: Auto Register Schemas False, Schema Registered

Now let’s repeat the experiment, but after a schema has already been registered for the subject (so we’ll just reuse the original topic again, which has a Schema registered). As expected, the producer is able to send a second record, which the consumer can read using the registered schema. So this proves that once a Schema is registered you can set Auto Register Schemas to false and everything continues to work correctly; the registered Schema is simply reused by both producers and consumers.

Experiment 3: Auto Register Schemas False, Different Schema

And what happens if the producer tries to send a record with a different Schema, let’s say a String? (Auto Register Schemas is still false):

Caused by:

Obviously the String Schema isn’t registered for the topic, although the error message doesn’t make it clear whether the failure is because the Schema isn’t registered for the subject specifically, or due to something else (perhaps incompatibility—see the next experiment). But we were clearly trying to do something silly by changing the Schema of the producer with the incorrect expectation that the consumer would still be able to deserialize the records into PlatonicSolid objects—so the exception was useful information.
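If you want to see exactly what is (and isn’t) registered, you can ask Karapace directly rather than guessing from the exception. A sketch using the same REST API (the Karapace address and subject names are placeholders for whatever your topics are actually called):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Lists what Karapace has registered, which helps decode errors like the one above.
// Assumptions: Karapace at http://karapace:8081 and "<topic>-value" subjects (both placeholders).
public class InspectSubjectsExample {
    static String get(HttpClient client, String path) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://karapace:8081" + path))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        System.out.println(get(client, "/subjects"));                                // all subjects
        System.out.println(get(client, "/subjects/avrotest-value/versions"));        // versions for one subject
        System.out.println(get(client, "/subjects/avrotest-value/versions/latest")); // latest registered schema
    }
}
```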

 

Some exceptions are welcome

(Source: Shutterstock)

Experiment 4: Auto Register Schemas True, Different Schema—Incompatible!

Now let’s try this again, but after changing Auto Register Schemas back to true, so that the producer is permitted to try to register the Schema. This time we get a different error:

Caused by:

The producer failed because it tried to register a new Schema for the subject, but the Schema was rejected as incompatible with the existing one (the exceptions are a bit sparse, unfortunately, and don’t include the subject name or the existing Schema details). Platonic Solids and Strings are pretty obviously incompatible, but what is really going on with the compatibility check? What is the BACKWARD compatibility mode, and what other options are there? When are Schemas compatible or incompatible?
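Karapace exposes the same compatibility endpoints as the Confluent Schema Registry API, so you can inspect the current mode and test a candidate Schema before a producer ever trips over it. A sketch under the usual assumptions (placeholder Karapace address and subject name, and a bare string type standing in for the incompatible Schema):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Inspecting the compatibility mode and pre-testing a candidate schema via Karapace's REST API.
// Assumptions: Karapace at http://karapace:8081 and subject "avrotest-value" (both placeholders).
public class CompatibilityCheckExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String base = "http://karapace:8081";

        // 1. What is the global compatibility mode? Typically returns {"compatibilityLevel": "BACKWARD"}.
        //    (Use /config/<subject> to read or set a per-subject override.)
        HttpRequest getConfig = HttpRequest.newBuilder()
                .uri(URI.create(base + "/config"))
                .GET().build();
        System.out.println(client.send(getConfig, HttpResponse.BodyHandlers.ofString()).body());

        // 2. Would this candidate schema be compatible with the latest registered version?
        //    Placeholder candidate: a bare Avro string, like the one the producer tried to use.
        String candidate = "{\"schema\": \"{\\\"type\\\": \\\"string\\\"}\"}";
        HttpRequest testCompat = HttpRequest.newBuilder()
                .uri(URI.create(base + "/compatibility/subjects/avrotest-value/versions/latest"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(candidate))
                .build();
        // Expect something like {"is_compatible": false} for a string vs. a record schema.
        System.out.println(client.send(testCompat, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```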

“You are incompatible! You will be deleted” (Cyberman from Dr Who)

https://commons.wikimedia.org/wiki/File:Cybermen_%282659813336%29.jpg

Experiment 5: Karapace Bypassed, Different Schema—Success (and Failure)!

Just for fun, I tried one final experiment to see if I could succeed in sending a String record to the same topic. But how can you do this if the Schemas are not compatible? Simple: just bypass the Schema Registry. This turns out to be trivial, as all you have to do is remove the Karapace URL from the producer properties. There’s then nothing to force the use of the Schema Registry, just convention (or manners). This producer has “bad” manners.
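To make the “bad manners” concrete, here’s a sketch of what such a producer might look like. One assumption beyond what’s described above: as well as dropping the Karapace URL, the value serializer is switched to Kafka’s plain StringSerializer so that a raw String can be sent at all (the broker address and topic name are placeholders):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// A "bad mannered" producer sketch: no Karapace URL, no Avro serializer, just a raw String
// sent to the same topic. Broker address and topic name are placeholders.
public class BadMannersProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Note: no schema.registry.url and no Avro serializer, so Karapace never sees this record.

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("avrotest", "key1", "I'm not an Avro record!"));
        }
    }
}
```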

The producer successfully sends the String record, but on the consumer side something inevitably goes wrong:

Exception in thread “main”

Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!

Unknown Magic Byte!

(Source: Shutterstock)

Unsurprisingly, the consumer (which is still expecting an Avro-serialized record value based on the currently registered Schema) fails to deserialize the String record value into a Platonic Solid—the magic byte is unknown! Basically, the consumer has correctly recognized that this isn’t valid Avro-serialized data and has given up trying to make sense of it.

Digging deeper into the Schema Registry wire format used for Avro records, I found that byte 0 is the Magic Byte (with a value of 0), bytes 1–4 are the Schema ID, and bytes 5 onwards are the Avro-serialized binary data (confirmed by checking the code here). So part of the mystery of how Kafka producers and consumers coordinate Schemas via Karapace is solved: the producers send a registered Schema ID with each record, and the consumers look up the Schema corresponding to that ID to deserialize the data.
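To make that wire format concrete, here’s a small sketch that pulls a record value apart along those boundaries. It’s purely illustrative (it is not Karapace’s or the serializer’s actual code), but the layout matches the description above:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustration of the wire format described above:
//   byte 0     : magic byte (value 0)
//   bytes 1-4  : 4-byte (big-endian) schema ID
//   bytes 5... : Avro-serialized binary data
// This is a sketch for inspection only, not the deserializer's actual implementation.
public class WireFormatInspector {
    public static void inspect(byte[] recordValue) {
        ByteBuffer buffer = ByteBuffer.wrap(recordValue);

        byte magic = buffer.get();
        if (magic != 0) {
            // Effectively the check that produces "Unknown magic byte!" in the consumer.
            throw new IllegalArgumentException("Unknown magic byte: " + magic);
        }

        int schemaId = buffer.getInt();  // the ID of a schema registered in Karapace
        byte[] avroPayload = Arrays.copyOfRange(recordValue, 5, recordValue.length);

        System.out.println("Schema ID = " + schemaId
                + ", Avro payload = " + avroPayload.length + " bytes");
        // A consumer would now fetch the schema for schemaId (e.g. GET /schemas/ids/<id>)
        // and use it to deserialize avroPayload.
    }
}
```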

What Next?

In Dr Who, humans were incompatible for an “upgrade” by the Cybermen if they objected to the process—how about Schemas? How do Kafka/Karapace Schema compatibility, evolution, validation, and versions work? And how do you use Schemas and Karapace in practice? 

To find out more, tune in for the next exciting episode for the continuing adventures of “Dr” Paul and the Karapaceons!  

An “alien” looking giant Galapagos tortoise (as well as crustaceans, tortoises, turtles, and even snails all have Carapaces)

(Source: Shutterstock)

Follow the Karapace Series

  • Part 1—Apache Avro Introduction with Platonic Solids
  • Part 2—Apache Avro IDL, NOAA Tidal Example, POJOs, and Logical Types
  • Part 3—Introduction, Kafka Avro Java Producer and Consumer Example
  • Part 4—Auto Register Schemas
  • Part 5—Schema Evolution and Backward Compatibility
  • Part 6—Forward, Transitive, and Full Schema Compatibility
