June 29, 2021 | Episode 165

Data-Driven Digitalization with Apache Kafka in the Food Industry at BAADER


Coming out of university, Patrick Neff (Data Scientist, BAADER) was used to working with “perfect” example datasets. He soon realized, however, that in the real world, data is often either unavailable or unstructured.

This compelled him to learn more about collecting data and analyzing it in a smart, automated way, and to explore the Apache Kafka® ecosystem at BAADER, a global provider of food processing machines. After he began working with Kafka in 2019, Patrick developed several microservices with Kafka Streams and used Kafka Connect in various data analytics projects.

Focused on the food value chain, Patrick’s mission is to optimize processes, specifically around transportation and processing. While consulting for one customer, he identified room for improvement around animal welfare, lost revenue, unnecessary costs, and carbon dioxide emissions. He also noticed that machines are often ready to send data into the cloud, but the proper presentation and/or analysis of that data is missing, and with it the opportunity for optimization. As a result:

  • Data is difficult to understand because units are missing
  • Data has never been analyzed
  • There is no way to compare machine or process performance for the same machine across different customers

In response to this problem, he helped develop the Transport Manager, which draws on data analytics results to present information such as a truck’s expected arrival time and its current poultry load. This leads to better planning, reduced transportation costs, and improved animal welfare. Patrick has also been working on the Asset Manager, a solution that presents IoT data to the customer in real time and in an understandable way. Both are data analytics projects that use machine learning.

Kafka topics store the data that provides these insights and reveals dependencies, for example, why trucks stop along their routes. Kafka is also a real-time platform, meaning that alerts can be sent the moment a certain event occurs, using ksqlDB or Kafka Streams, as in the sketch below.
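As a rough illustration (not BAADER’s actual implementation), here is a minimal Kafka Streams sketch of that kind of alert. The topic names, the comma-separated message format, and the stopped-truck condition are all assumptions made for the example:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class TruckStopAlerts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "truck-stop-alerts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical telemetry topic; values assumed to be "speedKmh,lat,lon" strings.
        KStream<String, String> telemetry = builder.stream("truck-telemetry");

        // Emit an alert event whenever a truck reports zero speed, i.e., it has stopped.
        telemetry
            .filter((truckId, value) -> Double.parseDouble(value.split(",")[0]) == 0.0)
            .to("truck-stop-alerts");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Downstream consumers of the alerts topic can then notify planners the moment a truck stops, rather than discovering the delay after the fact.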

By running Kafka on Confluent Cloud and building a scalable data pipeline, the BAADER team is able to break down data silos and stream live data from trucks via MQTT (a simplified version of this bridge is sketched below). They’ve even created an Android app for truck drivers, along with a desktop version that monitors the data a driver enters in the app alongside other information, such as expected time of arrival and weather. And the best part: all of it happens in real time.
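In practice, a pipeline like this would typically use a ready-made MQTT source connector for Kafka Connect, but a minimal hand-rolled sketch conveys the idea. Everything here is an assumption for illustration: the broker address, the topic layout, and the payload format. It uses the Eclipse Paho MQTT client to forward each truck’s telemetry into a Kafka topic:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.eclipse.paho.client.mqttv3.MqttClient;

import java.util.Properties;

public class TruckTelemetryBridge {
    public static void main(String[] args) throws Exception {
        // Plain string producer pointed at the Kafka cluster (address is a placeholder).
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Subscribe to per-truck MQTT topics, assumed here to look like "trucks/<id>/telemetry".
        MqttClient mqtt = new MqttClient("tcp://mqtt.example.com:1883", "truck-bridge");
        mqtt.connect();
        mqtt.subscribe("trucks/+/telemetry", (topic, message) -> {
            // Key each Kafka record by truck ID so a single truck's events stay in order.
            String truckId = topic.split("/")[1];
            producer.send(new ProducerRecord<>("truck-telemetry",
                    truckId, new String(message.getPayload())));
        });
    }
}
```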

Continue Listening

Episode 166 | July 8, 2021 | 29 min

Automated Event-Driven Architectures and Microservices with Apache Kafka and SmartBear

Automated behavior-driven testing of your event-driven microservices sounds great, right? But how do you do it? SmartBear's Alianna Inzana shares exactly that, along with some of the tooling SmartBear builds to support such efforts. In fact, SmartBear is responsible for many products that you're probably familiar with.

Episode 167 | July 15, 2021 | 25 min

Powering Real-Time Analytics with Apache Kafka and Rockset

Using large amounts of streaming data increasingly requires interactive, real-time analytics and dashboards, and this applies to any industry, including tech. Dhruba Borthakur (CTO and Co-Founder, Rockset) shares how his company uses Apache Kafka to perform complex joins, search, and aggregations on streaming data with low latency. The Kafka integrations allow his team to build a cloud-native analytics database that is a fundamental piece of enterprise infrastructure.

Episode 168 | July 22, 2021 | 29 min

Consistent, Complete Distributed Stream Processing ft. Guozhang Wang

Stream processing has become an important part of the big data landscape as a new programming paradigm for implementing real-time, data-driven applications. One of the biggest challenges for streaming systems is providing correctness guarantees for data processing in a distributed environment. Guozhang Wang (Distributed Systems Engineer, Confluent), along with other leaders in the Apache Kafka community, co-authored a paper on consistency and completeness in stream processing with Apache Kafka, shedding light on what a reimagined, modern stream processing infrastructure looks like.

Got questions?

If there's something you want to know about Apache Kafka, Confluent, or event streaming, please send us an email with your question and we hope to answer it on the next episode of Ask Confluent.

Email Us
