What is the Edge, and can you run Kafka on it? Well, the answer to the second question is yes. The answer to the first question, maybe you should just listen to Kai Waehner and I talk about that on today's episode of Streaming Audio, a podcast about Kafka, Confluent, and the cloud.
Hello and welcome to another episode of Streaming Audio. I am, as ever, your host, Tim Berglund. And I'm joined today by returning guest, Kai Waehner. Kai, welcome back to Streaming Audio.
Thanks for inviting me again, Tim. Glad to be back.
You got it. Always happy to have you. For our new listeners, tell us what it is you do here.
Yeah. Actually, I've worked for Confluent for four years now. So a long time. And I've worked with customers across the globe, and for that, I see a lot of different use cases and architectures, and that's also what I share in blogs and on Kafka Summit talks, or here in the podcast. So I'm glad to be back here.
Cool. I want to talk to you about Edge today, because that's the thing you've been writing about a little bit recently, and it's sort of been on your mind. And I always like to keep tabs on what you're thinking about. So first of all, a definition. I mean, it's not exactly a new term, but it's one of those terms that feels a little fuzzy. So for people who aren't familiar with Edge, what does it even mean?
Yeah, exactly. So it's not a new term, but it's used more and more. And this is also why we are working more and more on add-ons. So it's really important to define in the beginning. And for the definition of this conversation today, I define something as at the edge if it is really outside a data center. So all the processing logic and software is not running in a data center or in the cloud, but really closer to the edge, which can be an oil and gas field or a smart factory or something like that. And also important, as we talk about Kafka here, I really mean that we also deploy a Kafka cluster at the edge. This is the important piece here. We are not just connecting from the edge to the cloud, but really deploying Kafka completely at the edge. And this is actually something that is coming up more and more with our customers, and that's why I'm excited to talk about this more today.
Cool, cool. And I guess not in the data center and, in some sense, infrastructure, so this does not include my laptop. This does not include my phone. And we don't think of those as edge devices.
Yes, exactly. So actually, at the edge, the limiting factor is really the hardware. So this is really still low footprint and low touch. So you're talking about hardware with something like four gigabytes of RAM or maybe eight gigabytes of RAM. It's still enough so that you have a small kind of server thing that you can run Kafka on, because otherwise the server side would not work. But it's really very limited in how many things you can deploy there. But then on the other side is also the big advantage, why more and more people use Kafka here because, as most of the audience might know, Kafka is not just a messaging [inaudible 00:02:56], but also for the storage and for the integration and processing. And so you can do many things with one solution at the edge. And that's one of the key reasons why we see so many Kafka use cases at the edge now.
Yeah. It's interesting that this has become a thing recently, that edge is a category. Because I'm thinking back to the very beginning of my career as a software developer. In fact, I'll just tell this. This sounds like I should be in a rocking chair with a blanket over my legs, telling the story of back in the day. But the first business trip I ever took was, I worked for a company in central Florida at the time, and I went to Dumas, Texas, in the Texas Panhandle, about an hour away from Amarillo, because there was a refinery there. That refinery was a customer. I worked for a satellite communication startup. And the whole point of the deployment was to connect SCADA devices, which were PLCs, Programmable Logic Controllers, to a central network.
Now, of course this was 1994, there was an internet, but you didn't do stuff like this over the internet. There were other data communications networks that you'd use. But that was an edge device. Right? And so this is not a new idea. There have always been computers out places. Not always, for a long time, connected over networks. Edge kind of formalizes that idea, right?
Yes, absolutely. I mean, the difference from 10, 20 years ago is that there is not just a little bit of local processing, like in a factory, but it's really about doing much, much more with the data. And it's a similar story to what we see from the tech companies from Silicon Valley: you'll get much more value out of that if you can collect the data and use it more and more. Like replicating some of the data in the cloud to do analytics and machine learning and train models, but then deploy that back to the edge to do real-time processing there. And hence, we really see more and more of this connectivity. And actually when you think back 10 years, or even today in most of the industrial deployments, the things are not connected to the internet. And this is now where it's changing. So you're connecting more and more things to each other, and some of them still only at the edge, but you need to do more processing of that to get more value out of this.
And this is the key reason why we see so much more compute power needed at the edge. So it's not really new, the term, but it's really new of what you can do with that today. And that's the difference.
Because it's gotten much cheaper to deploy that computing power, that compute at low power consumption, and just low capital cost [crosstalk 00:05:31].
Yes. And because, I mean, when you go to the industrial environment, the machines and sensors and PLCs, they provide more data they didn't provide in the past. Or on the other side, if you go more to the consumer business, like we are working with retail stores, with restaurants, here it's simply about building new services which were not possible in the past. Like when you have, for example, upselling or cross-selling, and these kinds of things are happening now with so-called edge analytics. And the key reason why you want to do this at the edge is either low latency, so it doesn't work well if you have to replicate all the data to the cloud first, or it's bad internet connectivity. That's like what we hear from all our retail customers. They are in malls and the Wi-Fi is pretty bad, and you cannot always [inaudible 00:06:11] communication to the cloud before you want to do a real-time recommendation while the customer's walking through the store.
And these are the reasons why this edge comes up more and more to do really the processing locally. And the third part, in addition to latency and also security, is that simply, as it gets more and more data, it gets too costly to do all of that in the cloud, because these sensors produce so much data. And at least people do simple logic like filtering the data, because in the cloud you maybe only want to consume the relevant data, like alerts or changes of information. But sensors at the edge continuously process data. And you don't want to replicate all of that to the cloud. So cost is also a big factor here why you want to at least pre-filter and pre-process it at the edge.
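The pre-filtering idea described here can be illustrated with a small sketch. It is not Kafka code and not taken from any customer deployment; the `Reading` type, the `forward_relevant` function, and the threshold values are all hypothetical names chosen for illustration. It assumes a simple rule: forward a reading only if it is new for that sensor, above an alert threshold, or noticeably different from the last value that was forwarded.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    """A single sensor measurement (hypothetical shape, for illustration)."""
    sensor_id: str
    value: float


def forward_relevant(readings: Iterable[Reading],
                     deadband: float = 0.5,
                     alert_above: float = 90.0) -> Iterator[Reading]:
    """Yield only the readings worth sending upstream.

    A reading is forwarded if it is the first one seen for its sensor,
    if its value is above `alert_above`, or if it differs from the last
    forwarded value for that sensor by more than `deadband`.
    """
    last_sent: Dict[str, float] = {}
    for r in readings:
        previous = last_sent.get(r.sensor_id)
        is_alert = r.value > alert_above
        changed = previous is None or abs(r.value - previous) > deadband
        if is_alert or changed:
            last_sent[r.sensor_id] = r.value
            yield r
```

In a real edge deployment this kind of rule would more likely live in a stream processing application next to the broker, with only the filtered output replicated to the cloud; the sketch only shows the filtering logic itself.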
Got you. Because you got network ingress. I mean, that is, as they say, how they get you-
... in the cloud. And storage costs you're going to have there. So okay, that makes a lot of sense. And yeah, thinking back, so this is 25 years ago, these Allen-Bradley PLCs, I never actually did anything. I never did any so-called ladder logic programming on them. Even then, in the early 90s, mid-90s, those were antiquated. I mean, they looked like an old technology then. And these are eight-bit microcontrollers in there. It's all they can do to read an analog-to-digital, a few [inaudible 00:07:34] and spit stuff out a serial port, that was their job. But now we've got all the compute in places where you can suddenly do meaningful things. Even meaningful things like run Kafka, which is a little bit mind blowing.
Exactly. And also the other part is, even today, this still exists, right? That's also the difference from software development like we know it. And contrary to the industrial IoT [inaudible 00:07:56], these machines run for 30, 40 years and they don't change it quickly because it's not cost efficient. And hence, one reason why Kafka is running at the edge now, it's not just for processing data, but really also for the integration part and collecting all the data. Today most manufacturers don't even collect the data because they have no technical capabilities to do that. So the first use case in the industrial IoT is typically put Kafka there, collect all the data and create some dashboards. That's something which is obvious on a web application. You can do that easily. But at the edge, it's not as easy as that. And therefore this is the first step. And afterwards, then you can do the fancy things like predictive analytics or image recognition on the production line, and then go to these more advanced use cases afterwards.
I want to see a Kafka Connect source connector for some RS-232 protocol now. I just feel like I need that to connect myself with my past. That would just be, I think, beautiful. Would bring a tear to my eye.
So you've talked about industrial things, but actually walk us through some use cases. What do you see [crosstalk 00:08:53]?
Yeah sure. Absolutely. So I think we should focus on two different kinds here really, because edge and IoT, that's both industrial IoT and consumer IoT. And so I want to cover one for both so that everybody can see what's the advantages. And let's start with the industrial IoT. So this is really where it's about manufacturing, and this is about these PLCs and proprietary protocols and legacy technologies. And one customer we have here is WPX Energy. So this is from the oil and gas industry. And what they are doing, they are rolling out Kafka at the edge on the oil and gas field. And how that works is that they actually use hardware which is explicitly built for these kind of edge use cases where it can also work if the environment is not stable and doesn't work well. And they connect Kafka to the sensors at the edge, and collect the data there.
And then they use it to do pre-processing, but also for deploying business logic there, to act when something is happening with the sensor data. Like to spot anomalies, for example. So these are the common use cases where you can process the data at the edge, like in the oil field, without communicating with the cloud all the time, because you'll have bad networks anyway. And at the edge in general then, integration is part of the problem here. So partly, if you're lucky you can do this with open standards, like MQTT. If you have more modern PLCs they also support open standards. Or OPC UA is used a lot. But in the real world, and often together with Kafka, and also to be very clear here, we are working often with other partners which provide the last mile integration with the OT world. Because Kafka, we have also connectors, for example, like PLC4X, which is an open-source framework to connect directly to PLCs like [inaudible 00:10:33] or [inaudible 00:10:34]. But for the production usage then, typically you go to another vendor which does this for 20 years.
And so here at WPX Energy really, these boxes are running in the field. And the interesting part is also, we know Kafka only as a distributed system, right? So it's at least three brokers and highly available. And that's how you normally use it. At the edge, we see many deployments where actually it's only deployed as a single broker, because resiliency is often not the most super critical thing. Because if you want to deploy 100s or even 1000s of these edge things, then it gets very costly. And often it's good enough to have just one of these brokers there. And here still, no matter if it's three or one broker, the huge advantage of Kafka is not just about collecting data with Kafka Connect, but then you can pre-process the data with Kafka Streams or [inaudible 00:11:22], for example. But then the biggest value which I see is that Kafka is also a storage system.
And so it also handles the back pressure if there is no connectivity to replicate information to the cloud, for example. And this is the huge advantage here because, with one simplified architecture, with one product in the end, you can do a lot of things where otherwise you would have to combine many different products. And so this, I think, is really a great example for deploying Kafka really in the field at the edge. And then when we go closer to the factories and industrial IoT it's very similar from the architecture because the architecture for Kafka is the same everywhere. You just need to connect to different sensors or devices and process the data differently. But the main idea is always the same. And that's the huge benefit of using Kafka here.
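To make the storage and back-pressure point concrete, here is a minimal sketch of the store-and-forward pattern. It does not use any Kafka API; `EdgeLog` and the `send` callback are hypothetical names invented for this illustration. The assumption is that events are kept in an ordered local log, and a forwarder tracks how far it has replicated, so nothing is lost while the uplink is down.

```python
from typing import Callable, List


class EdgeLog:
    """Append-only in-memory log with a separate 'replicated up to' offset.

    This mimics, in a very reduced form, how a durable log lets producers
    keep writing while the uplink is down: nothing is dropped, and the
    forwarder resumes from the last offset it managed to send.
    """

    def __init__(self) -> None:
        self._events: List[str] = []
        self.replicated_offset = 0  # offset of the next event to send upstream

    def append(self, event: str) -> int:
        """Store an event locally and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def forward(self, send: Callable[[str], bool]) -> int:
        """Try to send pending events in order; stop at the first failure.

        `send` is a hypothetical uplink function that returns True on
        success and False when the link is down. Returns how many events
        were forwarded in this call.
        """
        sent = 0
        while self.replicated_offset < len(self._events):
            if not send(self._events[self.replicated_offset]):
                break  # link is down: keep the event and retry later
            self.replicated_offset += 1
            sent += 1
        return sent
```

A real broker persists the log to disk and applies retention limits, which this in-memory sketch leaves out. The design point it shows is only that local writes and upstream replication are decoupled by an offset.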
Boy, just all kinds of pictures in my head right now of how this might look. What kind of hardware, and by the way, I want to come back to the single node thing in a minute. So don't let me forget that. But what kind of hardware is typical? I mean, I imagine there's a broad spectrum of typical, but-
Yes. So in most cases, depending on who we're working with, but in industrial IoT, most companies already have partners which provide this hardware, right? So it's not us. Confluent is just the software. Either the enterprise already has partners which ship hardware to the edge which works there in unstable conditions, or if it's too warm or something like that. Or in other cases we also work with partners. So one example I use a lot is Hivecell. Hivecell is a company also from Silicon Valley. And they produce hardware boxes. They are like a normal computer, but they were built for the edge. So they only have two interfaces. One is power and the other one is LAN cable. Or you can also connect via Wi-Fi. And so this is all they have. And then you ship it there because there are typically no IT experts at the edge.
And if something breaks, then you can replace it with another one of these boxes. And so this is really an important part also. So edge is also very different from an operations perspective. So it's not just about these boxes and Kafka and some other tools on that. But it's, for example, also how do you handle the fleet management of all these boxes? How do you monitor them? And these are the questions typically which are also provided by the hardware vendor or the solution of the hardware vendor, so that you can manage these 100s of boxes running everywhere. So typically it's a specific hardware for the edge which you deploy. From a Kafka perspective, in the end it's running on either bare metal Linux or it's running even on Kubernetes. Sometimes today there is also more these kind of edge Kubernetes versions. So some are called like not K8s, but this is called K3s, so it's a smaller, lightweight version built for the edge.
And so there is plenty of these kind of software solutions now coming in a similar way for the edge. And the same is happening actually also now with Kafka, you mentioned that already with the single broker stuff. So we are also preparing now more and more for that because, as I said, often you don't need the high resiliency and availability, and then you can just deploy one broker there, and also make this easier and more lightweight at the edge.
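As a rough illustration of what "just deploy one broker" implies for configuration, the sketch below lists replication-related broker settings that cannot exceed the number of brokers. The property names are real Apache Kafka broker settings; the helper functions and the choice of which settings to list are assumptions made for this example, not a complete or recommended edge configuration.

```python
from typing import Dict, List

# Replication-related broker settings. On a multi-broker cluster these are
# typically greater than 1; with a single broker a value above 1 cannot be
# satisfied, so this sketch sets each of them to 1.
SINGLE_BROKER_OVERRIDES: Dict[str, int] = {
    "default.replication.factor": 1,
    "min.insync.replicas": 1,
    "offsets.topic.replication.factor": 1,
    "transaction.state.log.replication.factor": 1,
    "transaction.state.log.min.isr": 1,
}


def settings_exceeding_brokers(settings: Dict[str, int],
                               broker_count: int) -> List[str]:
    """Return the names of settings whose value exceeds the broker count."""
    return [key for key, value in sorted(settings.items())
            if value > broker_count]


def render_properties(settings: Dict[str, int]) -> str:
    """Render the settings as lines in Java properties format."""
    return "\n".join(f"{key}={value}"
                     for key, value in sorted(settings.items()))
```

Calling `render_properties(SINGLE_BROKER_OVERRIDES)` produces lines that could be appended to a broker's properties file. The trade-off is the one discussed in the conversation: with a replication factor of 1, losing the box means losing whatever had not yet been replicated elsewhere.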
So this ties into KIP-500 and what that lets us do with the sort of light version of Kafka?
Yes, absolutely. So with that, of course, the operations gets easier, and also the overhead. You need less memory. But also things like creating partitions and all these things. It's getting much, much better after KIP-500. So I've recently seen a first demo of running Kafka with [inaudible 00:14:44], and that's of course the perfect solution for the edge, because it needs less hardware, it scales even better, and you can run it easily. So that's really perfect for the edge [crosstalk 00:14:54] something like Kafka light.
A single JVM process, then-
... running on one of these little extended temperature range, sealed computers that you drop into an oil field or something like that.
Exactly. That's what it is.
So I get two images in my mind. One is an oil field, and the other is a factory. That's the edge in Tim Berglund's head. And I realize there's more to it than that. But hey, oil fields work. Let's talk about fleet management a little bit. I mean, I know that's not a Kafka thing, I'm just interested in how that goes. Because if you've got, I mean, you said like there's hundreds or thousands of things out there. You've got PLCs or some SCADA application where you're a power utility or a pipeline or something like that. You've already got an extended physical plant. There is some thing out in the world that is large, and you already keep tabs on it. I mean, it's already necessary for you to get data about what's going on. And there's some network operations center type room where all that information is aggregated. That kind of has to be true, right? If there's-
Yes and no. So actually, that's what people think. But today, reality is, even most factories are not really that digital as you expected. Right? But what you just described, that's like more the small data center. Right? So there it's much easier because then you can also put Kafka in there in a new Linux server. I mean, in the typical industrial IoT world today, almost everything is Windows servers. And it's also not scalable and it's file-based integrations, or there is no Linux in typical factories today. Right? So this is not the IT world we know. These are very different technologies. So for example-
I did not know that.
And this is really why also ... and what the software people who typically use Kafka don't know, so there's a clear difference between the OT world, operational technology, and the IT world where we typically live. And so this is also different programming languages. This is different interfaces. You don't have TCP/IP in the factory, right? So this is often UDP for example. And also another challenge is that you can never, ever access a factory from the outside world for security reasons. So always you need to start a connection from the factory side. So in addition to the technologies we were talking about, you also need a few new best practices to do that. And these are really challenges also, which we work with our customers to solve. So really even if it's a factory, don't think that this is like in the IT world like a data center.
And then, when you go more [inaudible 00:17:17] to the oil and gas field, then it's even more different. Or if I talk now, maybe we now move to the consumer world-
Yeah, I was going to ask about retail.
Yeah, exactly. Because retail, restaurants, banks, that's these kind of companies we are also working with. And here typically there is no infrastructure in place in the beginning. Right? Many of these companies have, for example, a point of sale system in the retail store, but that's it very often. But now these customers need to also differentiate. They need to compete with Amazon. They need to provide customer based services and better experiences, like location-based services in real time. And therefore they realize they need to do much more edge analytics. And in the consumer space it's very different from industrial IoT. So security is still important, [inaudible 00:18:02] encryption and these things, but it's much, much easier than in the industrial world. And therefore, you can simply put one of these Hivecell boxes, for example, and put it there and connect it to your point of sale, and connect it to other sensors and APIs and [inaudible 00:18:18] or whatever.
And this is how it's typically running in like the retail stores and the restaurants. And I can give you a great example. So I'm always trying to use public examples where you can also read up more about that. And in the retail world, what I talk a lot about is what might surprise many people now is Royal Caribbean, the cruisers. So I always call this a swimming retail store. And this is actually what they are really doing. So what they are doing is deploying Kafka on each ship when they do a cruise. Right? Kafka is running there. And actually in that case, this is really mission critical though. This is a cluster of three brokers, because they cannot lose data there. Because as you know, on these cruise ships, the internet connectivity is very bad, and also very costly.
It's not good.
It's not good. And even if it's good, it's very expensive. Right? So what's happening here is Royal Caribbean has a Kafka cluster on each ship. And if you have used one of these journeys in the last years, you also know that you're more or less forced to download their mobile app, because otherwise you cannot do a reservation and you cannot go to a theater and all these things. And in the end, the story is very similar to the Disney World and all these Disney parks, right? They analyze all your data and aggregate it and use it, to get a better customer experience, of course, for the end user. But also, on the other side, they upsell and they give you better seats and they give you a coupon for a steak for a good reason, not just because they want to make you happy.
And all of these events are correlated on a ship in real time. It only works if you do it in real time. Right? And so what they integrate is on the one side the mobile apps, but also all of the backend systems on the ship. Again, this is the point of sale system, but this is also sensors, this is temperature, this is a real time inventory about the restaurant, about the seats in the cinema. And all of this is correlated with the ship because it needs to work offline. And that's, again, why Kafka is such a great tool here, because it's not just a real-time communication system like many people think about Kafka, but the big advantage is that it's also a storage system. And so you store all the events automatically, and handle the back pressure if a mobile app is offline. Because even on their ship their Wi-Fi is not perfect. Right?
And so when you're offline it's okay, because when you're online again, then you receive the next event about a coupon or about the seat reservation. And this is how Royal Caribbean is doing this completely mission critical at the edge without internet connectivity. But then, and this is now where we're closing this, so when they go back into the harbor they have a big internet connection for a few hours. And then they replicate all of this data from each cruise to the cloud. And in the cloud they have a very big Kafka cluster, where then they aggregate the data from all the ships. And then they can do their analytics in the cloud, and machine learning, and all these things to find insights about, "Hey, how can I do even more upselling for the customer?" And then they find new insights and deploy these insights back to the ship to the local business logic.
There you go. There you go. And that counts as edge because of the intermittent connectivity. I mean, I've never seen this on a cruise ship, and obviously it's been a little bit more than a year since many of us have been on a cruise ship, may that all return to normal soon. But I imagine there's something like a small data center, right? There's a room with conventional servers in it. And so that deployment inside the ship could look pretty normal on Linux, and it could be Kubernetes, could be Confluent deployment and all that kind of stuff, very conventionally. But yeah, the internet connectivity is nothing like conventional.
Exactly. So this is really why I said in the beginning it's so important to define the edge, because if you ask 10 people you get 12 answers. Right? And that's really true as well, because even if you ask me tomorrow maybe I give you a different answer. But therefore, more or less I have a checklist where I say, "Hey, these are the 10 arguments I have. And these are the things: the edge can be disconnected and offline. The edge can be autonomous. So it does work even if you have no connectivity to somewhere else, or if you have a downtime somewhere, you have business continuity, you have limited hardware." Often at the edge you don't have just one or two data centers, but you really have 100s or even 1000s of edge things or devices where you deploy Kafka. So that's really different definitions. And also then again, the other part about the edge often is just a single broker if it's not resilient. So there's many things that you can often combine a few of these characteristics, and therefore there are different kinds of edge use cases. That's really important.
And for each one of them you have different best practices and different deployment options and so on. So this is also very important to understand before you even think about deploying Kafka at the edge.
I guess this calls into sharp relief just how precisely controlled data centers are. Because sometimes they're specialized hardware, right? Like we talked about. Sometimes it's not, like a cell tower. There's a room there with computers in it. And a cruise ship, there's a room with computers. You'd recognize those computers. They don't look special to you. But in a data center there's like explicit, careful temperature and humidity controls and power continuity and guaranteed network connections that are redundant in all these ways. It's like this very precisely managed hot house that the parameters don't get to vary all that much. At the edge, all those parameters might vary. Could be environmental, could be network, could be power, could be cost and size constraints and all that stuff. So it's just not your fancy, precisely controlled data center. It's any other kind of computer out there in the world.
Exactly. And the other part is, often at the edge like we defined it, you also don't have IT experts. And that's the other big problem of that. And this brings me to my last part which I wanted to mention today here. And this is probably why so many people come to us as Confluent working on this, because running Kafka is not trivial. Right? And so I typically say Kafka is the engine of the car. And then Confluent in the cloud, it's a self-driving car because you don't do anything. And even if you have the Confluent Platform, we still provide you a lot of tools to operate that. So like the Confluent Operator or proactive support, so that we even can take a look at your edge deployments. And then also the right tools like Cluster Linking to replicate between these edge sites and the cloud and all these things.
So we have a lot of products which we provide so that you really can build a resilient edge infrastructure without worrying too much about not having IT people at the edge. And this is exactly what's the real added value of bringing this in. And then, together with all these connectors we have, to both the legacy and the modern technologies, and providing the stream processing capabilities. And all of that's still fully managed with the operator, with Self-Balancing Kafka and these kind of things, with that it's much easier to deploy Kafka at the edge. Or even in general, a data processing solution at the edge. Because again, I mean in the cloud, it's actually straightforward to also combine 10 different services. Like if you go to a cloud provider you have 10, 20 services, you combine them. Then you use some third party like Confluent, or maybe something like Flink which you run by yourself. In the cloud it's okay-ish.
But at the edge where you have no IT experts and so on, there you can do so many things just by deploying Confluent there. And that's the huge added value why we have so many conversations with customers across the industries.
My guest today has been Kai Waehner. Kai, thanks for being a part of Streaming Audio.
Thanks a lot, Tim.
And there you have it. Hey, you know what you get for listening to the end? Some free Confluent Cloud. Use the promo code 60PDCAST—that's 60PDCAST—to get an additional $60 of free Confluent Cloud usage. Be sure to activate it by December 31st, 2021, and use it within 90 days after activation. Any unused promo value after the expiration date is forfeit and there are a limited number of codes available. So don't miss out. Anyway, as always, I hope this podcast was useful to you. If you want to discuss it or ask a question, you can always reach out to me on Twitter @tlberglund, that's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on a YouTube video or reach out on Community Slack or on the Community Forum. There are sign-up links for those things in the show notes. If you'd like to sign up and while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. And if you subscribe through Apple podcasts, be sure to leave us a review there that helps other people discover it, especially if it's a five-star review. And we think that's a good thing. So thanks for your support, and we'll see you next time.
What is the internet of things (IoT), and how does it relate to event streaming and Apache Kafka®? The deployment of Kafka outside the datacenter creates many new possibilities for processing data in motion and building new business cases.
In this episode, Kai Waehner, field CTO and global technology advisor at Confluent, discusses the intersection of edge data infrastructure, IoT, and cloud services for Kafka. He also details how businesses get into the sticky situation of not accounting for solutions when data is running dangerously close to the edge. Air-gapped environments and strong security requirements are the norm in many edge deployments.
Defining the edge for your industry depends on what sector you’re in plus the amount of data and interaction involved with your customers. The edge could lie on various points of the spectrum and carry various meanings to various people. Before you can deploy Kafka to the edge, you must first define where that edge is as it relates to your connectivity needs.
Edge resiliency enables your enterprise to not only control your datacenter with ease but also preserve the data without privacy risks or data leaks. If a business does not have the personnel to handle these big IT jobs on their own or an organization simply does not have an IT department at all, this is where Kafka solutions can come in to fill the gap.
This podcast explores use cases and architectures at the edge (i.e., outside the datacenter) across industries, including manufacturing, energy, retail, restaurants, and banks. The trade-offs of edge deployments are compared to a hybrid integration with Confluent Cloud.
If there's something you want to know about Apache Kafka, Confluent or event streaming, please send us an email with your question and we'll hope to answer it on the next episode of Ask Confluent.