Can you please be more specific about your environment and what you have found to be out of date?
Problem resolved. The crawl script and the web documentation are out of date. The nutch script works fine.

It might be a good idea to update the sitemap-related documentation at some point... it takes quite a bit of speculation and experimentation right now...



Dear fellow Nutch developers,

I've been trying to use the Nutch 2 sitemap function to crawl and index all pages listed in the sitemap indices. It seems that integration with the CommonCrawler sitemap tools exists only in the 2.x branch. But after I got it to work with HBase 1.2.3, it didn't fetch, parse, or index the sitemap indices and sitemaps at all.

I also looked into the code a bit and everything seems to make sense, except that I couldn't trace the data flow beyond the FetchReducer. I'm testing it on Linux with the "crawl" script in /bin, so I'm not sure how I can debug this. Please let me know if there's any further information I can provide to help troubleshoot this issue. Thanks in advance!

