git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Checkpointing when reading from files?


Hi Alex,

StreamingExecutionEnvironment#readFile is a helper function to create
file reader data streaming source. It uses
ContinuousFileReaderOperator and ContinuousFileMonitoringFunction
internally.

As both file reader operator and monitoring function uses
checkpointing so is readFile [1], you can go with first approach.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.html#readFile-org.apache.flink.api.common.io.FileInputFormat-java.lang.String-org.apache.flink.streaming.api.functions.source.FileProcessingMode-long-org.apache.flink.api.common.typeinfo.TypeInformation-


--
Thanks,
Amit


On Mon, May 21, 2018 at 9:39 PM, NEKRASSOV, ALEXEI <an4828@xxxxxxx> wrote:
> I want to add checkpointing to my program that reads from a set of files in
> a directory. Without checkpointing I use readFile():
>
>
>
>               DataStream<String> text = env.readFile(
>
>                            new TextInputFormat(new Path(inputPath)),
>
>                            inputPath,
>
>                           inputProcessingMode,
>
>                           1000);
>
>
>
> Should I use ContinuousFileMonitoringFunction / ContinuousFileReaderOperator
> to add checkpointing? Or is there an easier way?
>
>
>
> How do I go from splits (that ContinuousFileMonitoringFunction provides) to
> actual strings? I’m not clear how ContinuousFileReaderOperator can be used.
>
>
>
>               DataStreamSource<TimestampedFileInputSplit> split =
> env.addSource(
>
>                            new ContinuousFileMonitoringFunction<String>(
>
>                                          new TextInputFormat(new
> Path(inputPath)),
>
>                                          inputProcessingMode,
>
>                                          1,
>
>                                          1000)
>
>               );
>
>
>
> Thanks,
> Alex