git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Checkpointing when reading from files?


I want to add checkpointing to my program that reads from a set of files in a directory. Without checkpointing I use readFile():

 

              DataStream<String> text = env.readFile(

                           new TextInputFormat(new Path(inputPath)),

                           inputPath,

                          inputProcessingMode,

                          1000);

 

Should I use ContinuousFileMonitoringFunction / ContinuousFileReaderOperator to add checkpointing? Or is there an easier way?

 

How do I go from splits (that ContinuousFileMonitoringFunction provides) to actual strings? I’m not clear how ContinuousFileReaderOperator can be used.

 

              DataStreamSource<TimestampedFileInputSplit> split = env.addSource(

                           new ContinuousFileMonitoringFunction<String>(

                                         new TextInputFormat(new Path(inputPath)),

                                         inputProcessingMode,

                                         1,

                                         1000)

              );

 

Thanks,
Alex