Subject: Release of TREC Dynamic Domain: Polar Dataset


We have released the TREC Dynamic Domain Polar dataset, which we collected in
the polar domain during 2015-16.

Researchers interested in a rich dataset collected across the scientific and
deep web can use it to mine HTML pages, PDF files, images, video, audio, and
other formats for scientific insights.

The data is described here:

It is available from the NSF Arctic Data Center here:

If you use the dataset in your work, please consider citing it:

title={TREC Dynamic Domain: Polar Science.},
author={Burgess, Annie Bryant and Mattmann, Chris and Totaro, Giuseppe and
McGibbney, Lewis John and Ramirez, Paul M},

(our TREC paper, and/or the DOI from the actual dataset).


Chris Mattmann

Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: [email protected]
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA

