May 5, 2022 | Episode 213

Streaming Analytics on 50M Events Per Day with Confluent Cloud at Picnic


What are useful practices for migrating a system to Apache Kafka® and Confluent Cloud, and why use Confluent to modernize your architecture?

Dima Kalashnikov (Technical Lead, Picnic Technologies) is part of a small analytics platform team at Picnic, an online-only European grocery store that processes around 45 million customer events and 5 million internal events daily. An underlying goal at Picnic is to make decisions as data-driven as possible, so Dima's team collects events on all aspects of the company—from new stock arriving at the warehouse, to customer behavior on their websites, to statistics related to delivery trucks. Data is sent to internal systems and to a data warehouse.

Picnic recently migrated from their existing solution to Confluent Cloud for several reasons:

  • Ecosystem and community: Picnic liked the tooling present in the Kafka ecosystem. Being a small team, they can't devote extra time to building boilerplate code such as connectors for their data sources or extensive monitoring functionality. Picnic's analysts also work in SQL, so they appreciated the processing capabilities of ksqlDB. Finally, they found that help isn't hard to locate if they get stuck.
  • Monitoring: They wanted better monitoring; specifically, they found it hard to measure SLAs with their former system because they couldn't easily track the positions (lag) of consumers in their streams (see the consumer-lag sketch after this list).
  • Scaling and data retention times: Picnic is growing, so they needed to scale horizontally without worrying about manual reassignment. They also hit a wall with their previous streaming solution's data retention limits, which is a serious issue for a company that makes data-first decisions.
  • Cloud: Another consequence of being a small team is that they don't have the resources for extensive maintenance of self-hosted tooling, which makes a fully managed service attractive.
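
The monitoring pain point above comes down to consumer lag: how far each consumer group's committed offset trails the end of its partitions. Below is a minimal sketch of that check using Confluent's Python client (confluent-kafka); the broker address, topic, and group names are illustrative assumptions, not details from the episode.

```python
# Minimal consumer-lag check: compare a group's committed offsets with each
# partition's high watermark. All names here are hypothetical.
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumption: local/dev broker
    "group.id": "analytics-platform",        # hypothetical consumer group
    "enable.auto.commit": False,
})

topic = "customer-events"                    # hypothetical topic
metadata = consumer.list_topics(topic, timeout=10)
partitions = [TopicPartition(topic, p) for p in metadata.topics[topic].partitions]

for tp in consumer.committed(partitions, timeout=10):
    low, high = consumer.get_watermark_offsets(tp, timeout=10)
    # If the group has never committed, treat the whole partition as lag.
    lag = high - tp.offset if tp.offset >= 0 else high - low
    print(f"partition {tp.partition}: lag = {lag}")

consumer.close()
```

Confluent Cloud exposes lag as a built-in metric, which removes the need to run checks like this yourself—part of what better monitoring buys a small team.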

Dima's team was extremely careful and took their time with the migration. They ran a pilot system alongside the old system to make sure it met their fundamental performance goals: complete stability, zero data loss, and no performance degradation. They also wanted to validate its costs.

The pilot was successful, and a second, IoT-focused pilot is now in the works that uses Confluent Cloud and Debezium to track the robotics data emanating from their automated fulfillment center. And it's a lot of data: Dima mentions that the robots in the center generate data sets as large as their customer event streams.
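
Debezium captures change events from a database and is typically registered through the Kafka Connect REST API. As a rough illustration of what wiring up such a source can look like, here is a hedged sketch in Python; the Postgres source, hostnames, table names, and self-managed Connect endpoint are all assumptions, since the episode doesn't describe Picnic's actual setup.

```python
# Hypothetical example: register a Debezium Postgres source connector with a
# Kafka Connect REST endpoint so database changes stream into Kafka topics.
import requests

connector = {
    "name": "robotics-postgres-source",                      # hypothetical name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "robotics-db.internal",         # assumption
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "<SECRET>",
        "database.dbname": "robotics",
        "topic.prefix": "fulfillment",                        # Debezium 2.x topic naming
        "table.include.list": "public.robot_telemetry",       # hypothetical table
    },
}

resp = requests.post("http://connect.internal:8083/connectors",  # assumed endpoint
                     json=connector, timeout=10)
resp.raise_for_status()
print(resp.json())
```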

Continue Listening

Episode 214 | May 11, 2022 | 49 min

Scaling Apache Kafka Clusters on Confluent Cloud ft. Ajit Yagaty and Aashish Kohli

How much can Apache Kafka scale horizontally, and how can you automatically balance or rebalance data to ensure optimal performance? You may require the flexibility to scale or shrink your Kafka clusters based on demand. With experience engineering cluster elasticity and capacity management features for cloud-native Kafka, Ajit Yagaty (Confluent Cloud Control Plane Engineering) and Aashish Kohli (Confluent Cloud Product Management) join Kris Jenkins in this episode to explain how the architecture of Confluent Cloud supports elasticity.

Episode 215 | May 17, 2022 | 6 min

Apache Kafka 3.2 - New Features & Improvements

Apache Kafka 3.2 delivers new KIPs in three different areas of the Kafka ecosystem: Kafka Core, Kafka Streams, and Kafka Connect. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares release highlights.

Episode 216 | May 19, 2022 | 33 min

Practical Data Pipeline: Build a Plant Monitoring System with ksqlDB

Apache Kafka isn’t just for day jobs, according to Danica Fine (Senior Developer Advocate, Confluent). It can be used to make life easier at home, too! Building out a practical Apache Kafka® data pipeline is not always complicated—it can be simple and fun. For Danica, the idea of building a Kafka-based data pipeline sprouted with the need to monitor the water level of her plants at home. In this episode, she explains the architecture of her hardware-oriented project and discusses how she integrates, processes, and enriches data using ksqlDB and Kafka Connect, a Raspberry Pi running Confluent's Python client, and a Telegram bot. Apart from the script on the Raspberry Pi, the entire project was coded within Confluent Cloud.
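
As a rough idea of the Raspberry Pi side of such a pipeline, here is a minimal sketch using Confluent's Python client; the topic name, sensor function, and credential placeholders are illustrative assumptions rather than details from Danica's project.

```python
# Hypothetical sketch: a Raspberry Pi script that publishes soil-moisture
# readings to a Kafka topic in Confluent Cloud with confluent-kafka.
import json
import time

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",  # Confluent Cloud endpoint
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

def read_moisture() -> float:
    """Stand-in for the real sensor read on the Pi."""
    return 0.42

while True:
    reading = {"plant_id": "monstera-1",        # hypothetical plant id
               "moisture": read_moisture(),
               "ts": int(time.time())}
    # Topic name is an assumption, not taken from the episode.
    producer.produce("houseplant-readings",
                     key=reading["plant_id"],
                     value=json.dumps(reading))
    producer.poll(0)                            # serve delivery callbacks
    time.sleep(60)
```

Downstream, ksqlDB streams and Kafka Connect sinks can enrich these readings and drive the Telegram alerts Danica describes.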

Got questions?

If there's something you want to know about Apache Kafka, Confluent, or event streaming, please send us an email with your question and we'll try to answer it on the next episode of Ask Confluent.

