Staff Software Practice Lead
The Confluent Stream Lineage is a tool that allows us to see, at a glance, how our data streams fit together to form larger pipelines. It combines details from our producers, consumers, and topics to give an overall picture of the system. It also integrates with various metrics to allow us to dive deeper into any individual component. In this video, we'll introduce you to the Stream Lineage. We'll show you how to set up a simple lineage and teach you to interpret some of the data it provides.
Once our system reaches a certain scale, understanding how the pieces fit becomes challenging. It's too big of a problem for a human to solve, and even if we could, there's no guarantee we'd be operating with current data. Thankfully, the Confluent Stream Lineage is always working to solve the puzzle by mapping the data flow as it happens.
Because the Stream Lineage is always on, there's no need to enable it or configure it. From the moment we start pushing data into the first topic, the lineage is working. At any point, we can select the Stream Lineage in the Confluent Cloud Console and see how the data is flowing through the streams. Often, when working with components in Confluent Cloud, there is a link for "See in Stream Lineage". Selecting that link will take us directly to the relevant component in the lineage.
However, this is a lineage of the data. If there's no data, then the system can't build a lineage. That means if we want to see the lineage, we need to start producing data. Once data is flowing, the lineage will begin to populate. It doesn't happen instantly, so be patient.
Here we see a simple lineage containing a Producer, Topic, and Consumer. Each element of the lineage is given the appropriate labels. The client ID for the producer and consumer, as well as the name of any topics, will also appear. It's important that we name our components wisely.
Each element is connected by arrows representing the flow of data. However, the arrows are different sizes. The size represents the relative number of messages flowing through each stage in the pipeline. The difference in size suggests there may be an imbalance. This may not be anything to worry about. The imbalance may be temporary as our system reacts to changes in load, or we may have operations that alter the number of messages at a particular stage. But sometimes it can signal a real problem. In this case, our consumer should process every message. This is a message in, message out situation.
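As a rough illustration of the message in, message out expectation, the sketch below compares the number of messages entering and leaving a stage and flags a gap larger than a tolerance. The function name, the sample counts, and the 10% tolerance are assumptions made for this example. They are not part of Stream Lineage, which shows the imbalance visually through the size of the arrows.

```python
def is_imbalanced(messages_in: int, messages_out: int, tolerance: float = 0.10) -> bool:
    """Return True when output differs from input by more than `tolerance`.

    Hypothetical helper: the counts would come from whatever metrics we have
    on hand for the stage we are inspecting.
    """
    if messages_in == 0:
        # Nothing came in, so any output at all is unexpected.
        return messages_out != 0
    relative_gap = abs(messages_in - messages_out) / messages_in
    return relative_gap > tolerance


# Example: 600 messages produced, 513 consumed in the same window.
print(is_imbalanced(600, 513))
```

A stage that filters or aggregates messages would need a different expectation than one-in, one-out, so a check like this only makes sense where every message should pass through.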
The fact that there is an imbalance warrants a deeper look. In situations like this, it's often a good idea to look at the consumer lag. Thankfully, we can jump straight to the consumer by clicking it. From there, we can select the "Consumers" tab to see the lag. Our producer is creating an average of one message per second. The recorded lag of 87 messages implies our consumer is about 87 seconds behind. That's a lot. This suggests we have found a bottleneck in our system. From here, we'd want to investigate deeper and either try to improve the efficiency of the consumer or scale out with more instances.
Here we see an expanded pipeline with the bottleneck removed. We can see that our payment service is now pushing data to new topics: PaymentSucceeded and PaymentFailed. Our lineage shows this as a branch in the flow. We can have both fan-in branches and fan-out branches. In this case, we are seeing a fan-out branch.
Our consumer has now been relabeled as a custom app because it is both a producer and a consumer. Once again we see arrows of different sizes, in this case between the two new topics. However, we aren't going to worry about this imbalance. We expect our system to produce more successes than failures, so the imbalance is what we want. Of course, if the imbalance suddenly reversed so that we saw more failures than successes, it would be unexpected and we'd want to look into it.
Clicking on a topic gives us a variety of useful information, including the schema. This can help us understand how information flows from one topic to the next. For example, if we were auditing the use of personal information in our system, we might look at the order created topic, select the schema, and see that it contains the tag PII for Personally Identifiable Information. However, downstream schemas don't contain that tag. Therefore, in this case, our PII is not being propagated downstream.
Each element in the lineage contains a variety of metrics.
These metrics include the number of producers and consumers, throughput, and consumer lag. This information is critical to understanding the system. It's worth taking some time to become familiar with what is available.
One powerful feature of the lineage is the ability to rewind and see what it looked like in the past. Point-in-time lineage allows us to customize the time window we are viewing. By default, the lineage shows data for the last 10 minutes, but with point-in-time lineage, we can adjust that to a variety of different periods going back minutes, hours, or even days. We can use this to look back at the lineage and see how it has changed over time. For example, if we notice that one of the streams is unbalanced, we can rewind the lineage to see if that is normal or a new development.
Of course, all of this assumes we can locate the portion of the lineage we are interested in. The lineage search feature allows us to quickly locate specific elements and jump directly to them. This can be incredibly valuable in a large lineage with many moving parts. For example, in a system with hundreds, or even thousands, of producers, how would you find the specific one you were looking for? With lineage search enabled, it becomes trivial.
The human brain is capable of finding and recognizing visual cues, but we need to give it data to work on. Taking a few moments out of every day to look at the Stream Lineage can build a baseline in our minds. The next time we open the lineage, if something is abnormal, we may be able to recognize a change in the pattern. When we see these types of unexpected changes, it's well worth our time to investigate. We may have just discovered an unreported issue, or even a security concern. Using the lineage to develop this kind of deeper understanding can help us avoid future data governance issues.
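The consumer lag reasoning from earlier in the video (a lag of 87 messages with a producer averaging one message per second) comes down to a simple division. The sketch below makes that arithmetic explicit. The function name and parameters are hypothetical, and the result is only an estimate because it assumes the production rate stays steady.

```python
def estimated_lag_seconds(lag_messages: int, produce_rate_per_second: float) -> float:
    """Estimate how far behind a consumer is, in seconds.

    Hypothetical helper: assumes a steady production rate, so the result is
    an approximation rather than a measured delay.
    """
    if produce_rate_per_second <= 0:
        raise ValueError("produce_rate_per_second must be positive")
    return lag_messages / produce_rate_per_second


# The numbers from the video: 87 messages of lag at one message per second.
print(estimated_lag_seconds(87, 1.0))  # 87.0
```

The same lag of 87 messages would matter far less for a producer writing hundreds of messages per second, which is why lag is best read alongside throughput.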
If you aren't already on Confluent Developer, head there now using the link in the video description to access the rest of this course and its hands-on exercises.