Every Flink datastream starts with a Source (or possibly more than one). This is the origin of the data. The data may be created programmatically, read from a file or a database, or come from a streaming platform such as Apache Kafka. Depending on the nature of the source, the datastream might be finite or it might be infinite. The difference matters when implementing certain types of operations on the stream: an operation that needs to see all of the data before producing a result can only finish on a finite stream. This video will introduce a few different types of sources and discuss the difference between finite and infinite data streams.
DataStream<Integer> stream = env.fromElements(1,2,3,4,5);
DataGeneratorSource<String> source =
    new DataGeneratorSource<>(
        index -> "String" + index,
        numRecords,
        RateLimiterStrategy.perSecond(1),
        Types.STRING
    );
FileSource<String> source =
    FileSource.forRecordStreamFormat(
        new TextLineInputFormat(),
        new Path("<PATH_TO_FILE>")
    )
    .build();
KafkaSource<String> source = KafkaSource.<String>builder()
    .setProperties(config)
    .setTopics("topic1", "topic2")
    .setValueOnlyDeserializer(new SimpleStringSchema())
    .build();
DataStream<String> stream = env
    .fromSource(
        source,
        WatermarkStrategy.noWatermarks(),
        "my_source"
    );
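To make the finite versus infinite distinction concrete, here is a minimal sketch in plain Java. It deliberately uses only the JDK (java.util.stream), not the Flink API, so it is an analogy rather than how Flink implements anything. The class name BoundedVsUnbounded, the method names, and the maxResults cutoff are all hypothetical and exist only for illustration. The idea it tries to show: a total over a finite input can be produced once at the end, while a total over an infinite input never "finishes" and has to be emitted incrementally.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Plain-Java analogy (not Flink code): all names here are hypothetical.
class BoundedVsUnbounded {

    // Finite input: the total is well defined once the last element has been read.
    static int sumBounded(List<Integer> elements) {
        int total = 0;
        for (int value : elements) {
            total += value;
        }
        return total;
    }

    // Infinite input: there is no last element, so a running total is emitted
    // for every element instead. maxResults exists only so this demo terminates.
    static List<Integer> runningSums(Iterator<Integer> unbounded, int maxResults) {
        List<Integer> results = new ArrayList<>();
        int total = 0;
        while (results.size() < maxResults && unbounded.hasNext()) {
            total += unbounded.next();
            results.add(total);
        }
        return results;
    }

    public static void main(String[] args) {
        // Finite: same elements as the fromElements(1,2,3,4,5) snippet.
        System.out.println(sumBounded(List.of(1, 2, 3, 4, 5))); // prints 15

        // Infinite: 1, 2, 3, ... generated lazily, never exhausted on its own.
        Iterator<Integer> naturals = Stream.iterate(1, n -> n + 1).iterator();
        System.out.println(runningSums(naturals, 5)); // prints [1, 3, 6, 10, 15]
    }
}
```

Dropping the maxResults cutoff would make the second method loop forever, which is the practical reason the finite or infinite nature of a source matters when choosing an operation.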