
CSVSplitter - Splittable DoFn

Hi All,

I noticed that Apache Beam has no support for reading CSV files (e.g. RFC 4180), or at least no native transform. There's an issue to add this support:

I've seen examples that use the Apache Commons CSV parser. I took a shot at implementing a SplittableDoFn transform. The full code and some questions are in a gist here:

I suspect it could be improved quite a bit. If anyone has time to provide feedback, I would really appreciate it.


Peter Brumblay
Fearless Technology Group, Inc.