Subject: Re: parse-zip Nutch 2.x compatibility?

Maybe this could be done with processGzippedXML() from Crawler-Commons? Is that possible?
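
For what it's worth, the decompression step itself needs nothing beyond the JDK. Below is a minimal sketch, assuming the fetched .xml.gz content is already available as a byte array. The class name GzipSitemapSketch and the method gunzip are made up for illustration and are not part of Nutch or Crawler-Commons; the resulting XML bytes would still have to be handed to whatever sitemap parser is in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not Nutch or Crawler-Commons API.
class GzipSitemapSketch {

    /** Decompress gzip-compressed bytes (e.g. a fetched .xml.gz sitemap) into raw XML bytes. */
    static byte[] gunzip(byte[] gzipped) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            // Read until the end of the compressed stream.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny gzipped "sitemap" in memory so the sketch runs without network access.
        String xml = "<urlset><url><loc>http://example.com/</loc></url></urlset>";
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(xml.getBytes(StandardCharsets.UTF_8));
        }

        byte[] plain = gunzip(compressed.toByteArray());
        System.out.println(new String(plain, StandardCharsets.UTF_8));
    }
}
```

This only covers the gunzip part; it does not replace the parse-zip plugin or address the build failure.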



On 08/01/2017 05:21 PM, Michael Chen wrote:
Dear all,

I was trying to parse .xml.gz sitemaps with Nutch 2.x, but couldn't build the parse-zip plugin. parse-ext, parse-swf and feed also failed to build. It seems to be a known issue (NUTCH-874) and is marked for version 2.5.

Is there a workaround to parse gzipped files? Is the porting of these plugins under active development?

Thank you!

