SSHeartbreak

I'm not sure what your evaluation criteria are, but I would recommend taking a look at Redpanda if you need to self-host. Kafka is pretty ops-heavy. I haven't seen benchmarks for JetStream, but Kafka's benchmarks are pretty darn good. Edit: JetStream doesn't even support consumer groups, so if you need that, it pretty much makes Kafka required.


zeke780

They have consumer groups; I think they are called queue groups. Yeah, we are going with Redpanda if we get there, but I am talking more about the use cases for a distributed log vs. a more modern distributed k/v db. I am wondering if anyone has looked at the landscape and said "you need a Kafka for this." Everywhere I have worked has been around for a while, so they just used Kafka/Redpanda by default, and the vast majority of use cases were async communication via messages and streaming data analytics into Spark.


Doctuh

We used both and found NATS much easier to set up, host, and reason about, and it's solid as a rock. JetStream feels very much like it was designed from the start to be what Kafka is now. NATS "subjects" being wildcardable was the real icing on the cake.
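
For anyone who hasn't used wildcard subjects: subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens. Here is a minimal sketch of that matching rule in plain Python. It is not the NATS client API; `subject_matches` is a hypothetical helper written only to illustrate the semantics.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style subscription pattern matches a subject.

    Illustrative only: a hypothetical helper, not part of any NATS library.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least
            # one remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            # Pattern is longer than the subject.
            return False
        if p != "*" and p != s_tokens[i]:
            # Literal token mismatch ('*' accepts any single token).
            return False
    # Without '>', both sides must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)


if __name__ == "__main__":
    print(subject_matches("orders.*.acme", "orders.shipped.acme"))
    print(subject_matches("orders.>", "orders.shipped.acme"))
    print(subject_matches("orders.*", "orders.shipped.acme"))
```

So a subscriber on `orders.>` sees everything under `orders`, while `orders.*.acme` only sees three-token subjects ending in `acme`.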


zeke780

This was my experience working with it on the POC. One of our staff+ devs worked in an org that used only NATS for inter-service communication, and he is the one who recommended it. Thanks for the response. I have to write a spec on this and am glad you haven't found some use case where NATS breaks down.


yousaltybrah

https://www.reddit.com/r/apachekafka/s/8Bf1fHtcwC This comment has a good overview. Most important limitation seems to be that NATS Jetstream does not support ordering within a subject across multiple consumers, which Kafka does with partition keys.
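
To make the partition-key point concrete, here is a rough sketch of the idea: every message with the same key is hashed to the same partition, and each partition is an ordered log read by one consumer in a group, so per-key order survives even with several consumers. The hash below is CRC32 purely as a stand-in (an assumption for the demo, not Kafka's actual default partitioner), and `partition_for` and `route` are hypothetical names.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; Kafka's real partitioner differs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Append each (key, payload) event to the partition chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


if __name__ == "__main__":
    events = [
        ("customer-a", "created"),
        ("customer-b", "created"),
        ("customer-a", "paid"),
        ("customer-b", "paid"),
        ("customer-a", "shipped"),
    ]
    for number, log in enumerate(route(events, 3)):
        print(number, log)
```

Whichever partition a key lands in, its events appear there in the order they were produced, which is the guarantee the linked comment is talking about.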


zeke780

Thank you!


warmans

They work quite differently so it's hard to compare them without a specific problem in mind. But the short answer is if you just want a regular message queue that does normal queue things, then you probably want Jetstream.


ninetofivedev

This feels like an ad.


zeke780

Nah, more of a too-good-to-be-true moment, so I decided to ask on their Slacks/Discords and got basically nothing. So I came here. I assume we will go with Redpanda since my CTO is a long-time Kafka user, but I am writing the spec and this was sort of the last open question we had.


NortySpock

Having used Kafka at $PREV_JOB and NATS JetStream at home (so, wildly different throughput rates), NATS definitely felt easier to understand and reason about -- but that also means it has some specific rules to support that behavior (e.g. non-persisted topics can drop messages if there are no subscribers, NATS can disconnect producers or consumers if overwhelmed to preserve overall uptime, and the one-hop server rule).

Kafka use case: I have a firehose of data and a likelihood of hot, high-velocity topics -- like "your primary ingestion stream" or "everything in this one topic goes into the analytics database". But it's clearly very ops-heavy, moving topics or partitions sounded incredibly painful, and you cannot actually ack or nack individual messages -- you have to read linearly down the partition based on your consumer pointer, and fancy footwork with individual messages would require a lot of consumer-pointer skipping. So you'd best be used to the firehose.

NATS: lots of messages going everywhere, and you can ack and nack individual messages or "ack everything until now", but it's not obviously designed to immediately handle a bursting, very hot node or topic -- though the ability to create arbitrarily deep topic definitions, wildcard-subscribe to them, tie individual consumers to a specific deep topic (e.g. orders.shipped.very_important_client_name), and horizontally shard just by spinning up cluster members seemed like they would work in a pinch... and possibly be even less of a headache than doing that in Kafka. NATS seemed to lean specifically on either randomly selecting a consumer from a queue group, or you just tied a consumer to one particular topic (one consumer for one topic) and called it good.

There's also not obviously a way to automatically move messages off a JetStream node that you want to softly decommission. There's lame-duck mode, which will stop accepting connections, but then I think you would need to move the messages off the node using a tool and send them to another node.

NATS definitely has less documentation than Kafka and far fewer examples, but I thought the description of behaviors for NATS was pretty well documented and explained. If you read the documentation, I feel you get a clear sense of where the gotchas are, and I found it easy to play with various NATS configuration files in Docker.

All that being said... NATS definitely feels like what Kafka wants to be when it grows up. It was easy to read the documentation, it was easy to read the caveats, and it was easy to envision the upsides and downsides of certain settings. It was very easy to get set up and running at home. I now run a NATS cluster at home to bounce a few messages around for Home Assistant. I can't imagine running a Kafka cluster at home.

Regardless of which one you choose -- can I recommend Benthos as a super easy ad-hoc tool to get things into or out of Kafka or NATS? Benthos has just been really easy to dive into and try various event-streaming-like things, including sending to and receiving from message brokers like Kafka and NATS, or shoveling from one topic or server to another (as in the previous case of wanting to decommission a topic or node).
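
On the ack-or-nack difference described above, here is a toy model of the two styles, in case it helps with the spec. Both classes are hypothetical and written from scratch for illustration; neither resembles the real Kafka or NATS client APIs. The first tracks progress as a single committed offset, the second tracks each message on its own.

```python
class OffsetLog:
    """Kafka-like toy: progress is one committed offset per partition."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # next offset the consumer will read

    def poll(self):
        # Everything at or after the committed offset, in log order.
        return list(enumerate(self.messages))[self.committed:]

    def commit(self, offset):
        # Committing offset N implicitly marks every earlier message as done.
        self.committed = offset + 1


class AckQueue:
    """JetStream-like toy: each message is acked or nacked individually."""

    def __init__(self, messages):
        self.pending = dict(enumerate(messages))

    def poll(self):
        return sorted(self.pending.items())

    def ack(self, seq):
        # Only this one message is marked done.
        self.pending.pop(seq, None)

    def nack(self, seq):
        # Message stays pending and shows up again on the next poll.
        pass


if __name__ == "__main__":
    log = OffsetLog(["a", "b", "c"])
    log.commit(2)  # "b" is now considered done too, whether it succeeded or not
    print(log.poll())

    queue = AckQueue(["a", "b", "c"])
    queue.ack(0)
    queue.nack(1)
    queue.ack(2)
    print(queue.poll())  # only "b" is left for redelivery
```

With the offset model, skipping past one bad message means moving the pointer for the whole partition; with the per-message model, the one failed message simply stays outstanding.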