On a previous episode of this podcast, I said that writing your own Kafka library from scratch would be insane. Well, today, I talked to a guy who is doing just that. It's Tommy Brunn, and he's one of the lead developers of KafkaJS. That's a JavaScript Kafka library that's built from the ground up, not based on librdkafka or anything like that. We talk about special issues related to JavaScript, and just what it's like to build a Kafka library in this way. It's all on today's episode of Streaming Audio, a podcast about Kafka, Confluent, and the cloud.
Hello and welcome to another episode of Streaming Audio. I am, as per the usual, your host, Tim Berglund, joining you in this relatively new video podcasting format. New to us anyway, I know everybody else has been doing this forever, but we are happy to get cameras, and YouTube and all those things involved. Joined in the virtual studio today by Tommy Brunn. Now, Tommy is a lead engineer at a company called Klarna, and he is building what he describes as a runtime platform for developers. What's interesting, there's going to be some Kafka involved, of course. And what's super interesting about this is that he's building it in JavaScript, and in Node.js. Tommy, welcome to the show.
Thank you very much and thanks for having me. Actually, what's more interesting than all of that is the fact that Kafka is not involved at all in what I do at work. But in addition to all this stuff that you talked about, I'm, of course, also, one of the developers of KafkaJS, which is the-
Yes.
...Node.js Kafka client-
Oh, you know what?-
... One of them, at least.
...I put that together wrong. I thought that you used that at work, but that's not true. You're just one of the guys who's doing this.
You'd think so, right?
Yeah. Usually that's [crosstalk 00:01:54]
Very few people write Kafka clients for fun. But no, I mean, [crosstalk 00:01:59] you're half right. Yeah. You're half right, though. I used to work with KafkaJS in a professional capacity using it at work, but I recently sort of pivoted off and did something completely different where, as you say, I'm building a container runtime platform. So a way for developers to run their services without having to be experts at the Cloud and all that stuff.
Nice. Nice. Which is something that a lot of people will appreciate. But we don't talk about JavaScript a lot on this show. I want to be careful about value judgments, but one might say we don't talk about JavaScript enough on this show. As I understand the statistics of client library usage, obviously, Java is I think only a slim majority, if I'm remembering the numbers correctly. It's not the overwhelming majority you'd think it would be, it's the overwhelming majority of the conversation, but there's also a fair amount of .NET, and some Python, a little bit of Go. There's a lot of JS too. So [crosstalk 00:03:07]
Yeah. If you sum up all the librdkafka users, then that's probably a big chunk of them. But in the Node.js world, we were historically kind of underserved or maybe there wasn't that big of an audience for it. There is no sort of official Kafka client for Node.js, but there are a couple of [inaudible 00:03:31] ones and there is a set of bindings to librdkafka.
Tell us real quick, before you go on, if a listener doesn't know what librdkafka is: this is Kafka, but with political freedom.
Exactly. Yeah. This is another Kafka library that is also built by a Swede. That's our main export here in Sweden.
Husqvarna, Volvo, those are actually small businesses.
Side businesses, you could say, just hobbies. The real thing is in Kafka clients. Yeah. So librdkafka is a C client that is awesome and it's used by tons of other languages that are linked to it. So if you're using Kafka in Python or in Go or most languages, then probably you are using librdkafka or at least something on top of librdkafka.
In case anybody here is new, that's lib-R-D-Kafka. When you say it quickly it sounds like the English word liberty. It is not that.
Yes.
It's librdkafka and that is written by a Swede named Magnus Edenhill, who's a super great guy.
Indeed, yes. So for Node.js, there were bindings written by someone who used to work at Blizzard but they were kind of abandoned in 2018 or so and haven't really kept up with the times. And so, if we rewind the clock a little bit back to 2017, when KafkaJS was born, it was born because there weren't really any great options for Node.js developers. So at Klarna, we have tons and tons of microservices that communicate at least to some degree over Kafka. And many of those are Java services and they use the official Java clients, of course, but then there is also a huge chunk of Node.js services.
So we were kind of faced with, okay, are we going to use these poorly maintained community clients? These bindings that link to a great library, but there's a lot of pain in using native add-ons in Node.js? Or should we do what I think you described in your conversation with Magnus as something completely insane and write our own client?
Yes.
So, of course, we chose the third option, or rather I didn't actually make that choice. It was my friend and former colleague, Tulio, who started the project back in 2017.
This is not "colleague and former friend."
Well, he might be if he doesn't review my [inaudible 00:06:14] request quicker. So he sort of took it upon himself in the summer of 2017 to start saying, "How hard can it be to write the Kafka client? It's just a client after all."
But now in 2017, the so-called Blizzard library was sort of the de facto standard.
More or less, yeah. It was either that or a library called Kafka node or node Kafka. I never remember which.
Yeah. Both of those are librdkafka based, right?
No. So Kafka node is a JavaScript client. I think that they have some sort of switch where they can either use librdkafka or their own thing, but I'm really not sure.
Okay.
But actually, back then, we couldn't use either even if we wanted to, because we used SASL authentication at Klarna. And that was new at the time and it was not supported. At least maybe librdkafka supported it, but the bindings didn't.
Sure.
And Kafka node didn't support it.
And that's kind of the struggle with any non-JVM Kafka interface, you'll get features like that. And it might take a year. I mean, the idempotent producer just recently made it into librdkafka, and full EOS support, I think just in the last six months. I don't keep up with their data that much, but that happens. So, anyway...
So anyway, that was kind of the linchpin for why do we do this? But of course, the first step was to try to add that to the existing clients, but for the bindings, none of us are really C++ developers, or C developers for that matter. So doing that, not super attractive. And the other alternative was trying to contribute it to kafka-node. But at the time, kafka-node was not in great shape. It didn't look like something that we wanted to heavily invest into and really sort of bet the farm on, because we're talking about hundreds of services at Klarna here. So we sort of decided that the least risky thing is to do the insane thing. [crosstalk 00:08:17]
Do the insane thing. Yeah. Okay.
Yeah. And here we are, three years later.
The wire protocol is public, so we can do this.
Exactly. And I think it worked out pretty well. At Klarna, I don't know how many services we have using it, but I would bet it's probably close to 100 or...
Oh wow.
Yeah. Something like that.
Nice.
And then outside of Klarna, who knows? But it's 200,000 downloads a week or something like that. So it's probably a few.
Okay. That's outstanding.
Yeah.
Yeah. Wow. No. That's quite a bit of uptake. I knew I was going to learn things today and this is going to be one of those episodes where I'm going to learn a lot, being not a JS guy.
Right.
There will be discoveries here. So I don't know. I mean, walk us through the process of developing it. I'd like to know what was hard, what was easy, what's the crazy 10% of the project that takes up 90% of the time.
So the good thing about sort of trailing behind the Java client is that we have a trail of breadcrumbs to follow. So a lot of the time our work is basically seeing... We want to... I don't know, improve rebalances, make them be less impactful. Okay. Let's see what the Java client is doing. Okay, well, there was this KIP that did this and this KIP that did this. Those are Kafka Improvement Proposals, I think.
Even recently, in the last year, there have been some KIPs that have dealt with rebalances.
Exactly. So basically, we will just need to look at those and go, "Okay, this is how it's implemented in Kafka. What do we need to do to adhere to that same principle and idea?" And of course, sometimes there are also things that have nothing to do with KIPs when it's more like, "How can we make this more idiomatic for Node.js users? How can we make it more easy to use?" Because that's also another problem with using the librdkafka bindings, is that the interface doesn't feel like it was written for JavaScript, which obviously is because it wasn't. It was written for C and then there's an interface that tries to kind of glue it together, but it's never going to be great. So that's a lot of the work as well.
Tell us some stories about how it's gone, what's been particularly challenging?
I think the highlight of the last three years was actually when we... So our audience may not have used Kafka before and they might come from a very different background than most people that work with Kafka. We get a lot of people who work in front end development, who for some reason, are looking at Kafka.
This is the story of Node.js, Right? It draws in people who are kind of born and molded in the front end and suddenly they're in the backend.
Exactly. We got this issue on GitHub where someone was asking, "Hey, why does this crash when I try to run it?" And when we look at the stack trace, we can see that this person is trying to use KafkaJS in a browser. Connecting directly from a web browser to Kafka. And initially, we were just laughing and going like, "What is this person doing? Why are they trying to connect from a browser?" And yeah, but then after a while we go, "Yeah. But why wouldn't it work? I mean, how crazy can it be?" So we decided to just try it. So we did this thing where we took the KafkaJS source code, and we piped it into a kind of a compiler called Browserify. And what it does is it tries to find all the modules and stuff that are Node-specific and swap them out for a browser implementation with the same idea.
Something that's available in the browser.
Yeah. Exactly.
Sort of bundles everything together and takes a node module. It makes it so you can use it in React or whatever.
Yeah. Exactly.
[inaudible 00:12:31]
Uh-huh. See, see. You can be a front end developer if you try.
Basically, I am one because I kind of know what Browserify does.
Anyway, so that got pretty far actually, but it did barf eventually. And it barfed when it tried to find a replacement for the net module. So the net module in Node.js is what you use to create sockets, and the communication with Kafka from the Kafka client is over TCP sockets.
You may need a socket for that.
Yes. You may need a socket. So Browserify basically just threw its hands in the air and said, "There is no such thing in a browser. I cannot help you, sir." [crosstalk 00:13:15]
And probably due to a security model. I mean, that's not an API that's going to.
Exactly.
Yeah.
There is a proposal, but I don't hold out hope for it. But anyway, so again, we went like, "Okay, I guess this doesn't work." But then there was a little light bulb that came on and there is this thing called a WebSocket and it has the word socket in it.
It does.
So if you squint, it looks almost like a socket.
I mean, it's a network connection that data moves over?
Exactly. Yeah. So what we did was we made a way to swap out the network layer so you can provide your own implementation of the thing we use to create sockets. And then we put up a tiny, tiny little proxy server in between that accepts WebSocket [crosstalk 00:14:03] connections on one side and then opens outgoing TCP sockets on the other. And so, we put that up, we compiled KafkaJS with this special fake net module, a WebSocket net module. And it worked. Load it up in a browser, open up the console, see your Kafka consumer start in your Chrome DevTools. Open up another tab in your browser, it joins the consumer group. It's absolutely insane. I mean, we couldn't believe it. We thought it would never work, but it did. Now, why would anyone want to do this?
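A minimal sketch of the swappable network layer described here, in JavaScript. It is an illustration under stated assumptions, not KafkaJS's actual internals: `TinyClient`, `tcpSocketFactory`, and `recordingSocketFactory` are hypothetical names, and the recording socket merely stands in for a WebSocket-backed transport so the example runs without a broker or proxy.

```javascript
const net = require('net')
const { EventEmitter } = require('events')

// Default transport: a plain TCP socket from Node's net module.
const tcpSocketFactory = ({ host, port, onConnect }) =>
  net.connect({ host, port }, onConnect)

// Hypothetical stand-in for a browser transport. A real one would wrap a
// WebSocket that talks to a WebSocket-to-TCP proxy; this one only records
// what is written, so the sketch needs no network.
const recordingSocketFactory = ({ host, port, onConnect }) => {
  const socket = new EventEmitter()
  socket.remote = `${host}:${port}`
  socket.written = []
  socket.write = chunk => {
    socket.written.push(chunk)
    return true
  }
  setImmediate(onConnect) // report "connected" asynchronously
  return socket
}

// Hypothetical client. It never touches net directly; it only relies on
// whatever socket-like object the injected factory returns.
class TinyClient {
  constructor({ host, port, socketFactory = tcpSocketFactory }) {
    this.host = host
    this.port = port
    this.socketFactory = socketFactory
    this.socket = null
  }

  connect(onConnect = () => {}) {
    this.socket = this.socketFactory({
      host: this.host,
      port: this.port,
      onConnect,
    })
    return this.socket
  }

  send(buffer) {
    return this.socket.write(buffer)
  }
}

// Usage with the fake transport (host name is made up).
const demoClient = new TinyClient({
  host: 'broker.example',
  port: 9092,
  socketFactory: recordingSocketFactory,
})
const demoSocket = demoClient.connect(() =>
  console.log('connected to', demoSocket.remote)
)
demoClient.send(Buffer.from('hello'))
```

The point of the design is that the client depends only on the socket's interface, so a browser build can inject a different factory instead of needing a replacement for the net module.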
I was going to say, is that just a science fair project? Because if it is, I would like to spend some time admiring it, but where do you go with that?
Well, the thing is, so we have had a lot of people that have asked about this, and not necessarily in a browser, but maybe from a mobile app, for example. But the sort of model of connecting directly from a server to Kafka, I don't know. I think you can challenge it. I don't know if it's a good idea, but it's interesting to explore and see what does that give us? Yeah, there's also the use case of not necessarily end user applications, but for example, control panels for management of Kafka or Kafka topics. You could perhaps connect directly to Kafka there instead of having to build the custom [inaudible 00:15:23] in between.
You need a service that you talk to and it really is just being a proxy and all that. And certainly the developer tools in the front end are rich enough. I mean, if you got React in there or whatever, it's an application development platform. You don't need all that, that's really interesting. And that also, if it's an admin thing, I think the scale is something that's going to freak people out. Right? Because, I mean, the whole idea of the web is that there's this rather significant fan out.
Yeah. You probably don't want to tweet the link to your Kafka cluster. That's a...
Right. Right. But an admin tool where there's authentication and it's some tens of users or whatever like that, that's fine.
It could be an option.
Yeah. And not that the number of producers and consumers is limited to tens. I'm just saying, 100 million might be a bad idea.
Might be a stretch.
I'm just putting that out there. What else? As you're doing this, what about the protocol? I'm wondering if there's anything that you see in there as you're implementing it where you're like, "Wow, this is brilliant. What a good design decision." And anything, respectfully that you're like, "Why are you the way that you are?"
Actually, that's kind of, personally, my favorite part, to some degree: working on the protocol level. It's such a different thing from what I'm doing normally, and it's really fun, actually. But then there are definitely some things that I've yet to wrap my head around. Right now, I'm trying to figure out how to implement support for tagged fields, which is a feature that was added to the protocol not too long ago, maybe a year or two years ago, something like that. And it works completely differently from how everything else in the protocol works. The intention is to be able to dynamically add fields. I think of them kind of like headers in HTTP, where you can send sort of anything in a header and it can be there or it doesn't have to be there.
That's kind of how I think about it. And it's useful for some cases, but implementing it, it's going to be different from the entire rest of the protocol, but it's pretty fun. And it's one of those neat development tasks where the objective is totally clear. Most of the time in software development, we're figuring it out as we go, we try something and we change it. The product doesn't work and you change what you're going to do. Protocol development, it's straightforward. You have a spec and you have a buffer, and out the other end comes an interpreted thing, or the other way around.
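To make the "spec in, buffer out" idea concrete, here is a small JavaScript sketch of a tagged-fields section as the Kafka protocol describes it: an unsigned varint count, then for each field an unsigned varint tag, an unsigned varint size, and the raw bytes. The function names are made up for illustration, this is not KafkaJS's implementation, and the encoder assumes values below 2^32.

```javascript
// Encode a non-negative integer (assumed < 2^32) as an unsigned varint:
// 7 bits per byte, least significant group first, high bit = "more follows".
function encodeUVarint(value) {
  const bytes = []
  while (value > 0x7f) {
    bytes.push((value & 0x7f) | 0x80)
    value = Math.floor(value / 128)
  }
  bytes.push(value)
  return Buffer.from(bytes)
}

// Decode an unsigned varint starting at offset; returns the value and the
// offset of the first byte after it.
function decodeUVarint(buffer, offset = 0) {
  let value = 0
  let multiplier = 1
  let byte
  do {
    if (offset >= buffer.length) {
      throw new RangeError('Buffer ended in the middle of a varint')
    }
    byte = buffer[offset++]
    value += (byte & 0x7f) * multiplier
    multiplier *= 128
  } while (byte & 0x80)
  return { value, offset }
}

// fields: array of { tag: number, data: Buffer }. Written in ascending tag order.
function encodeTaggedFields(fields) {
  const sorted = [...fields].sort((a, b) => a.tag - b.tag)
  const parts = [encodeUVarint(sorted.length)]
  for (const { tag, data } of sorted) {
    parts.push(encodeUVarint(tag), encodeUVarint(data.length), data)
  }
  return Buffer.concat(parts)
}

function decodeTaggedFields(buffer, offset = 0) {
  const count = decodeUVarint(buffer, offset)
  offset = count.offset
  const fields = []
  for (let i = 0; i < count.value; i++) {
    const tag = decodeUVarint(buffer, offset)
    const size = decodeUVarint(buffer, tag.offset)
    const end = size.offset + size.value
    // A reader that does not know a tag can still skip it using the size.
    fields.push({ tag: tag.value, data: buffer.subarray(size.offset, end) })
    offset = end
  }
  return { fields, offset }
}
```

Because every field carries its own tag and size, a client that does not recognize a tag can skip over it, which is what gives these fields their optional, header-like character.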
You're so right. I started my career in that kind of work, firmware and data communications and things like that. This kind of had to dawn on me when I moved over to business software and the web years later. That in just business software, application development, you're building something to facilitate some kind of work that people do. So here's your machine that you're building and the process that underlies that machine is a mental one. It's people figuring stuff out and thinking and learning and whatever it is they do. And business is a cooperative, mental phenomenon.
People decide and act and communicate. And we do that in all kinds of fuzzy ways where there are strange exceptions that are implicit and you don't even realize, "Oh yeah. Well, no, when it's like that, you got to do it this way." And that's why code is hard, right? That kind of software is hard and you can't make it beautiful and you can't make a perfect machine because you have to have hard coded, special cases that you hate and all that kind of stuff.
Exactly.
That's kind of the frustration of business software and frankly, why it's difficult and it takes experts to do it is that you're mapping machine to mind. To put it philosophically.
Yeah. That's why it's so, so enticing about this.
It is. Protocols, it's all machine. I mean, there could be weird things. There can be stuff you don't like. I mean, there are difficult protocols. I won't even say this is dating me, because it should be obvious that I'm old enough to remember this, but I wrote EDI code once, which is an old data communications protocol that I never really wrapped my mind around. It was just so bad. I was wanting it to be XML and it wouldn't. And so there are protocols that are horrible, but it's always just machine all the way down. And I just appreciate that you're kind of saying it was refreshing to do that kind of work.
Yeah. In a way.
It still is.
But on the other hand, I mean, working on an open source project that you are kind of the steward of is also very much a human activity and very much a product driven activity. It's not all just bytes and code. A lot of it is managing or not managing, but shepherding the project in the direction you think is good, and managing contributors and making sure that people feel welcome and that they're...
Absolutely. Fundamentally social, and in bigger projects, dealing with people who don't make people feel welcome. You've got those important exceptions. So even if you have this little mechanistic solace of parsing buffers in a way that it's never ambiguous and no one will ever tell you you're wrong, when you get out to collaborating with other developers, then that becomes the social phenomenon [inaudible 00:21:07] software development is.
We've been very lucky though. We have a fantastic community of people that are super helpful and a lot of good contributors. I would say that these days, even most of the big sort of changes that happened to KafkaJS are not written by me or by the other maintainers. They're written by community contributors. And then our role is more to help them get that into production, make sure that it fits with the rest of the library and make sure that they get it all the way done.
There're tests and all that kind of thing.
Exactly.
So you're shepherding other people's contributions?
Yeah. I would say that's the majority of the work nowadays.
And that makes sense for this kind of work for this kind of software library, the economics of open source just apply in spades, right? Because anybody who uses it, well, there's a feature that you haven't implemented yet. And they want to make the investment and they do, and you help them get it going.
Yeah. That's why we had transactions and Idempotency a year before librdkafka.
Oh what's up, Magnus? Okay. I like it.
Exactly. Laid down the gauntlet.
The gauntlet has been thrown. The Swede, by a guy with a Swedish last name. I like it. I like it. What else? You said this doesn't touch your work directly. You're doing other things.
Oh my gosh. I mean, there are, of course, so many teams at Klarna that are using this, that inevitably some of the time I will be helping them out with stuff. But my day-to-day is completely separate from KafkaJS stuff for better and worse. The good part of it is kind of that I get to switch my mindset to learn something else, but the bad part of it is obviously that it's really hard to make time for making these impactful changes when that's relegated mostly to nights and weekends and vacations.
Right. Right. That is the basement hacker kind of [inaudible 00:23:05].
Exactly. I think that's part of why most of my work nowadays is in shepherding and helping others and providing support, because you can do that in an hour, but you can't write transaction support in five increments of one hour.
No. You can't.
You need to actually sit down and work for five hours.
Yeah. [crosstalk 00:23:27] I was going to say maybe four increments of five hours might do it. But 20 of one, you're not going to get any of that work done, especially for transactions because that's going to be something that requires deep focus.
Yes. Exactly. So that's the downside.
If I could phrase this negatively, how far behind the current 2.7 JVM driver are you? Are you at parity? Or?
No. We're not. It's really hard to say because it's not a straight line. I would say we are maybe around two years behind if you put it like that. So the features that were sort of new two years ago in the Java client, that's probably where we're at. But then there are some parts where we're almost at parity or at parity, and then there are other parts that we haven't even looked at. But I think that at this point, KafkaJS, it's a good low level client. And when I say low level, I don't mean C, but I mean, we provide the base client, but we don't have an equivalent of, for example, Kafka Streams or-
You anticipated my next question.
...[inaudible 00:24:34] something like that. So I think that there's a huge space there for a lot of work to be done in terms of building stuff on top of KafkaJS. So something similar to Kafka Streams would be awesome. Right now, if you work with Kafka and you want to join two different topics, you can do that. But it's a lot of manual work. It doesn't feel good.
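A rough JavaScript sketch of what that manual work can look like: a key-based join that buffers one side until the matching record arrives on the other. `TwoTopicJoin` is a hypothetical helper, not part of KafkaJS, and it deliberately leaves out everything that makes real joins hard (time windows, persistence across restarts, rebalances, memory limits).

```javascript
// Hypothetical in-memory inner join between two topics, matched on message key.
class TwoTopicJoin {
  constructor(onJoined) {
    this.onJoined = onJoined
    this.pending = { left: new Map(), right: new Map() }
  }

  // side is 'left' or 'right', e.g. chosen from the topic name inside a
  // consumer's per-message callback.
  handle(side, key, value) {
    const otherSide = side === 'left' ? 'right' : 'left'
    const waiting = this.pending[otherSide]

    if (!waiting.has(key)) {
      // No partner yet: remember this record until the other side shows up.
      this.pending[side].set(key, value)
      return
    }

    const match = waiting.get(key)
    waiting.delete(key)
    this.onJoined(
      side === 'left'
        ? { key, left: value, right: match }
        : { key, left: match, right: value }
    )
  }
}

// Usage with made-up data: an order joined with its payment.
const joinedRecords = []
const orderPaymentJoin = new TwoTopicJoin(record => joinedRecords.push(record))
orderPaymentJoin.handle('left', 'order-1', { amount: 100 })
orderPaymentJoin.handle('right', 'order-1', { status: 'paid' })
console.log(joinedRecords)
```

Everything this sketch skips is state that a stream processing library would normally manage on the application's behalf.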
Yeah. No. It does not feel good. You don't want to be in that business.
No. You do not. Well, a lot of people want that; that's why we need Kafka Streams or something like it. But the same goes for things like... I've been thinking about dead letter queues. So I want that to be sort of automatic. I don't see why everyone should have to reinvent dead letter queues. That should just be: a message is unable to be processed, put it in a different queue to be reprocessed later.
At least optionally, you could have the library do that for you.
Exactly. Or a library on top that does it somehow. And tons of other things like that, that I think could be done and would be a lot of fun to do, but they require those five hour increments of four.
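One way such a "library on top" could look, sketched in JavaScript. `withDeadLetterQueue`, the header names, and the topic name are all hypothetical; the only assumption about the producer is that it has a `send({ topic, messages })` method, which mirrors the shape of the KafkaJS producer.

```javascript
// Hypothetical wrapper: runs the handler and, if it throws, forwards the
// original message to a dead letter topic instead of crashing the consumer.
function withDeadLetterQueue({ handler, producer, deadLetterTopic }) {
  return async ({ topic, partition, message }) => {
    try {
      await handler({ topic, partition, message })
    } catch (error) {
      await producer.send({
        topic: deadLetterTopic,
        messages: [
          {
            key: message.key,
            value: message.value,
            // Illustrative header names, so the failure can be traced later.
            headers: {
              'x-original-topic': topic,
              'x-original-partition': String(partition),
              'x-error-message': String(error.message),
            },
          },
        ],
      })
    }
  }
}
```

The returned function has the same signature as the handler it wraps, so it could be passed wherever a per-message callback is expected, and something else can consume the dead letter topic to reprocess failures later.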
They do. Kafka Streams JS would be a significant investment. That's kind of the bummer, it's not like you build Kafka Streams and there's this shim that binds it to a language runtime. It's just not like that. You rewrite the whole thing and that's big. Now, eventually, Connect. I feel okay about that because that's just infrastructure and it's got a REST API. A little wrapper for that might be nice, but you wouldn't need much. I mean, people do REST APIs in JavaScript, it's proven functionality. Everybody is good at that.
I think streams is definitely the thing that we're missing the most because we do get requests about stuff like that a lot. And now it makes total sense.
It really does. And you still have ksqlDB, which is, I think, at the point where... Again, in a given application, that might not be architecturally the way you want to solve the problem, but you can deploy a ksqlDB cluster or run it in Confluent Cloud or whatever, and do that stream joining and get your joined stream. You can do your stream processing over there and not have the burden of that locally. But, I mean, it's hypocritical of me to say, "Therefore, you don't need one," because in the Java world, we're quite pleased to have Kafka Streams, thank you very much, for Java-based microservices. People use that thing. So they get that. So this is Tommy's call to action for a team to assemble and write Kafka Streams JS. I know it's been done in part and it hasn't been done. It's been started. So there's stuff to build on there.
Yeah. There is a library called Kafka Streams in Node.js, which is a community sort of thing. But I think you really do need some corporate investment or a sponsor of some kind to really continue development on these things. Because, you know, when you build something on top of something that is so core to your application as Kafka Streams is, you need that to be supported for the next X years. You don't want to end up with a dead library.
And frankly, it's simply hard enough that it's going to require people to get paid. And that's just kind of the economics of open source. At some point, there is a company that has an interest in this thing existing and being free and they fund developers. And so, that would be fun if that happened.
It would be wonderful.
Best way for people to become aware or learn more, obviously, we're going to link to the GitHub repo. What do people need to know?
They can go to kafka.js.org. That'll take them to the website where they'll find information on getting started or more advanced topics as well. There's also a Slack community, which is linked to from the website and where you can join. We are around 500 or so KafkaJS developers in there. So feel free to join that and ask questions or get support.
Okay. Excellent. That's fair.
It's a friendly gang.
And you also shouldn't feel shy about asking on the Confluent Community Forum, forum.confluent.io. Probably not peopled by a large number of JavaScript experts yet, but either of those places. The Confluent Community Forum needs to be a friendly place for JavaScript questions. You are welcome and it's not Java and Python only. So don't feel like it is.
Great.
My guest today has been Tommy Brunn. Tommy, thanks so much for being a part of Streaming Audio.
Thank you for having me.
And there you have it. Hey, you know what you get for listening to the end? Some free Confluent Cloud. Use the promo code 60PDCAST that's 60PDCAST, to get an additional $60 of free Confluent Cloud usage. Be sure to activate it by December 31st, 2021, and use it within 90 days after activation. Any unused promo value after the expiration date is forfeit and there are a limited number of codes available. So don't miss out. Anyway, as always, I hope this podcast was useful to you. If you want to discuss it or ask a question, you can always reach out to me on Twitter @tlberglund, that's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on a YouTube video or reach out on Community Slack or on the Community Forum.
There are sign-up links for those things in the show notes if you'd like to sign up. And while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. And if you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover it, especially if it's a five-star review. And we think that's a good thing. So thanks for your support, and we'll see you next time.
At Klarna, Lead Engineer Tommy Brunn is building a runtime platform for developers. But outside of his professional role, he is also one of the authors of the JavaScript client for Apache Kafka® called KafkaJS, which has grown from being a niche open source project to the most downloaded Kafka client for Node.js since 2018.
Using Kafka in Node.js has previously meant relying on community-contributed bindings to librdkafka, which required you to spend more of your time debugging failed builds than working on your application. With the original authors moving away from supporting the bindings, and the community only partially picking up the slack, using Kafka on Node.js was a painful proposition.
Kafka is a core part of Klarna’s microservice architecture, with hundreds of services using it to communicate among themselves. In 2017, as their engineering team was building the ecosystem of Node.js services powering the Klarna app, it was clear that the experience of working with any of the available Kafka clients was not good enough, so they decided to do something similar to what had been done for the Erlang client, Brod, and build their own. Rather than wrapping librdkafka, their client is a complete reimplementation in native JavaScript, allowing for a far superior user experience at the cost of being a lot more work to implement. Towards the end of 2017, KafkaJS 0.1.0 was released.
Tommy has also used KafkaJS to build several Kafka-powered services at Klarna, as well as worked on supporting libraries such as integrations with Confluent Schema Registry and Zstandard compression.
Since KafkaJS is written entirely in JavaScript, there is no build step required. It will work 100% of the time in any version of Node.js and evolve together with the platform with no effort required from the end user. It also unlocks some creative use cases. For example, Klarna once did an experiment where they got it to run in a browser. KafkaJS will also run on any platform that’s supported by Node.js, such as ARM. Klarna’s “no dependencies” policy also means that the deployment footprint is small, which makes it a perfect fit for serverless environments.