This video recaps the highlights of this course and includes pointers to where you can find us and to resources for learning more.
Hey, I'm David from Confluent, and we've reached the end of this course. We've been on a journey together, so let's take a moment to review where we've been.

We started by looking at the role of a stream processor in providing rich, responsive experiences for customers. If you start thinking about all of the things that happen in the course of operating a business, you realize that there are many potential sources of interesting event data, and many potential use cases for that data, whether it be to provide customers with better experiences or to make backend operations more efficient. To realize this potential, we need a framework that takes away the challenges of implementing a scalable, fault-tolerant distributed system and lets us focus on the business logic that defines our use cases.

We saw that SQL is a surprisingly good fit for implementing stream processing. Data transformations, filters, aggregations, and joins are easily expressed in SQL, and they are sufficient to meet the needs of many applications.

We saw that some data processing operations, such as counting, need to keep some state. I described how Flink keeps this state local, both to make the state highly scalable and for good performance. To ensure that the state survives the failure of any of the Flink processing nodes, regular snapshots are written asynchronously to a remote, distributed file system. We had a look at how these snapshots are organized, with each stateful operator writing into the snapshot a copy of its own local state. This is done in a coordinated manner, so that the entire snapshot is self-consistent.

We also talked about the role of watermarks, which is to enable time-based operations, like windowing, to produce their results when the time is right. Otherwise, when a stream is out of order, it can be difficult to know how long to wait for events that may, or may not, still arrive.

In this course we've only just scratched the surface of what Flink SQL is capable of. Here are some of the more powerful features that we didn't have time for. Joins are incredibly useful, especially for data enrichment. The support in Flink SQL for working with change data capture streams lets you easily enrich your real-time event streams with data from relational databases. And implementing pattern matching use cases with MATCH_RECOGNIZE is ridiculously fun and easy.

The Apache mailing lists are a good place to connect with other Flink users and to get help. There's also a very active Apache Flink Slack instance you can join. You'll find links to these resources on the Apache Flink website.

At Confluent we're always creating more Flink-related content, so stop by the Confluent Developer website to learn more. In particular, an excellent way to continue learning about Apache Flink is to check out the course on Building Apache Flink Applications in Java, which you'll find alongside this Apache Flink 101 course on Confluent Developer. We also have some great tutorials showing how to handle specific use cases with Flink, such as one on finding the min and max in an event stream. The Confluent blog also has excellent content on a wide range of topics related to stream processing, such as a blog post that takes a lighthearted approach to explaining the differences between at-least-once, at-most-once, and exactly-once guarantees.

That wraps up this course on Apache Flink. Thank you for your interest.
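To make the recap concrete, here is a minimal Flink SQL sketch, with hypothetical table and column names, that ties these ideas together: a watermark that bounds how long Flink waits for out-of-order events, and a windowed count whose running state Flink keeps locally and snapshots during checkpoints.

    -- A source table with an event-time attribute and a watermark.
    -- The watermark tells Flink how long to wait for out-of-order events.
    CREATE TABLE orders (
      order_id STRING,
      customer_id STRING,
      amount DOUBLE,
      order_time TIMESTAMP(3),
      WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
    ) WITH (
      'connector' = 'datagen'  -- Flink's built-in random data generator, handy for experiments
    );

    -- A per-customer count over one-minute tumbling windows. Flink keeps the
    -- running counts as local operator state and emits each window's result
    -- once the watermark passes the end of that window.
    SELECT customer_id, window_start, window_end, COUNT(*) AS order_count
    FROM TABLE(
      TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' MINUTE))
    GROUP BY customer_id, window_start, window_end;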
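The enrichment pattern mentioned in the recap can be sketched as a temporal join. Here, customers is assumed to be a versioned table maintained by a change data capture connector (for example, Debezium) with customer_id as its primary key; the names are again illustrative.

    -- Enrich each order with the version of the customer row that was
    -- current at the order's event time.
    SELECT o.order_id, o.order_time, c.customer_name, c.tier
    FROM orders AS o
      JOIN customers FOR SYSTEM_TIME AS OF o.order_time AS c
        ON o.customer_id = c.customer_id;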
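And as a small taste of pattern matching, here is a MATCH_RECOGNIZE sketch, built on the same assumed orders table, that finds runs of orders with strictly increasing amounts for each customer.

    -- For each customer, match a run of orders where every order after
    -- the first has a larger amount than the one before it.
    SELECT *
    FROM orders
      MATCH_RECOGNIZE (
        PARTITION BY customer_id
        ORDER BY order_time
        MEASURES
          FIRST(A.order_time) AS run_start,
          LAST(B.order_time) AS run_end,
          LAST(B.amount) AS peak_amount
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (A B+)
        DEFINE
          B AS B.amount > PREV(B.amount)
      );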
If you aren't already on Confluent Developer, head there now using the link in the video description to access other courses, hands-on exercises, and many other resources for continuing your learning journey.