"This is why Kafka cannot be regarded as a database; not without twisting the basic definition". While I agree with this statement, saying that kafka is not a database because it doesn't have all the bells and whistles of a traditional database is dangerous and feeds management cronies who don't believe kafka can replace a database which is false. So even if it's not 100% true (yet) I still thinks it makes more sense to call Kafka a database to make it more accessible.
The Kafka Streams API uses local state stores backed by a changelog topic to maintain state. Since Kafka works at the level of bytes, the state stores have to work at that level of granularity too. Translating between the domain language and the Kafka world is the responsibility of the Serde. This works alright but does impose a certain performance penalty, since reads and writes “always” need to go through the Serde, especially with heavy Serdes like Avro. …
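To make that penalty concrete, here is a minimal sketch in plain Java. It does not use the Kafka Streams API at all: `CountingLongSerde`, `ByteBackedStore` and `SerdeCostSketch` are hypothetical names made up for illustration, and the store is just a `HashMap` of byte arrays. The only point is that a store which holds bytes forces a conversion on every read and every write.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Serde: converts a long to bytes and back, counting each conversion. */
final class CountingLongSerde {
    int conversions = 0;

    byte[] serialize(long value) {
        conversions++;
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    long deserialize(byte[] raw) {
        conversions++;
        return ByteBuffer.wrap(raw).getLong();
    }
}

/** Toy store (an assumption, not a real state store) that only ever holds bytes. */
final class ByteBackedStore {
    private final Map<String, byte[]> data = new HashMap<>();
    private final CountingLongSerde serde;

    ByteBackedStore(CountingLongSerde serde) {
        this.serde = serde;
    }

    void put(String key, long value) {
        data.put(key, serde.serialize(value));
    }

    long getOrDefault(String key, long fallback) {
        byte[] raw = data.get(key);
        return raw == null ? fallback : serde.deserialize(raw);
    }
}

final class SerdeCostSketch {
    /** Increments one counter n times; returns {final count, total serde conversions}. */
    static long[] countEvents(int n) {
        CountingLongSerde serde = new CountingLongSerde();
        ByteBackedStore store = new ByteBackedStore(serde);
        for (int i = 0; i < n; i++) {
            long current = store.getOrDefault("clicks", 0L); // read: bytes -> long
            store.put("clicks", current + 1);                // write: long -> bytes
        }
        long finalCount = store.getOrDefault("clicks", 0L);
        return new long[] { finalCount, serde.conversions };
    }

    public static void main(String[] args) {
        long[] result = countEvents(1000);
        System.out.println("count=" + result[0] + " conversions=" + result[1]);
    }
}
```

A simple read-modify-write pays for roughly two conversions per update. With a `long` that is cheap, but the same pattern with a schema-based format such as Avro repeats a much heavier encode and decode each time.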
If you have spent any significant time with Avro (or Protobuf) and are using the Confluent Schema Registry, you have probably encountered a breaking schema change, characterized by the following mysterious exception.
“message”: “Schema being registered is incompatible with an earlier schema”
What happens is that the Schema Registry validates the schema sent by the producer against the “version” stored in the registry, according to whatever schema evolution “strategy” you have set, be it Forward, Backward or None. The reason I have version and strategy in quotes is that there is no…
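As a rough illustration of what Backward and Forward mean, here is a deliberately simplified sketch in plain Java. It is not the Schema Registry's implementation and calls none of its APIs: `CompatibilitySketch` and its methods are hypothetical, a schema is reduced to a map from field name to "has a default value", and type changes, aliases and nested records are ignored. It assumes the usual reading that Backward means the new schema can read data written with the old one, and Forward means the old schema can read data written with the new one.

```java
import java.util.Map;

final class CompatibilitySketch {
    /**
     * True if a reader can decode data from a writer: every field the reader
     * expects must either be present in the writer's schema or have a default.
     * Schemas are modelled as field name -> hasDefault (a simplifying assumption).
     */
    static boolean canRead(Map<String, Boolean> readerFields, Map<String, Boolean> writerFields) {
        for (Map.Entry<String, Boolean> field : readerFields.entrySet()) {
            boolean missingFromWriter = !writerFields.containsKey(field.getKey());
            boolean hasDefault = field.getValue();
            if (missingFromWriter && !hasDefault) {
                return false;
            }
        }
        return true;
    }

    /** Backward: the new schema reads data written with the old schema. */
    static boolean isBackwardCompatible(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(newSchema, oldSchema);
    }

    /** Forward: the old schema reads data written with the new schema. */
    static boolean isForwardCompatible(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(oldSchema, newSchema);
    }

    public static void main(String[] args) {
        Map<String, Boolean> v1 = Map.of("id", false, "name", false);
        // v2 adds a required field without a default.
        Map<String, Boolean> v2 = Map.of("id", false, "name", false, "email", false);
        System.out.println("backward: " + isBackwardCompatible(v2, v1));
        System.out.println("forward:  " + isForwardCompatible(v2, v1));
    }
}
```

In this toy model, adding a field without a default breaks Backward compatibility but not Forward compatibility, which is the kind of change that typically triggers the exception above when the subject is set to Backward.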
Kafka is most likely not the first platform you reach for when thinking of processing batched data. Most likely you’ve heard of Kafka being used to process millions of continuous real-time events, such as Twitter feeds or IoT feeds, not to run end-of-day batches from old mainframes. Does that mean Kafka should not be used to process batched data? As with everything in software engineering, it depends.
First, I think it’s wise to point out some misconceptions about Kafka. Kafka isn’t just a message queue. It’s much more than that. It’s a continuous stream of change events…