Scientia Arctica: A Knowledge Archive for Discovery and Reproducible Science in the Arctic
Investigators at the University of California, Santa Barbara, the NSF-funded DataONE at the University of New Mexico, and NOAA (through the University of Maryland) will build and operate a multi-institutional knowledge archive serving diverse Arctic disciplines, including ecology, earth science, atmospheric science, oceanography, anthropology, archaeology, and social and political science. The archive will preserve and enable discovery of all products of research funded by the NSF Arctic Sciences Section, including data, metadata, software, documents, and the provenance that links these in a coherent knowledge model, using infrastructure from the DataONE federation of data repositories. This cooperative agreement will support a comprehensive archive of data from Arctic research funded by the NSF Arctic Sciences Section, which necessitates a system that can handle the complexity, heterogeneity, and volume of data generated in the Arctic. By building upon successful repository infrastructures that already handle these kinds of data, the repository will come online at project inception, be immediately useful to researchers, and undergo continuous improvement through feature expansion and refinement. Novel features in the archive will enable critical linkages between data, science, and policy. A data science fellowship program, training workshops, and a short course will provide interdisciplinary training opportunities for at least 190 graduate and undergraduate students. Selection will promote underrepresented groups with the goal of increasing the diversity of data scientists in the research community. Students returning to their home universities will help communicate data science issues and will effect a large change across science disciplines nationally. In addition, the new infrastructure and data will have a lasting impact on Arctic research and policy globally.
Arctic science has disproportionately large policy and societal importance, partly because of the impacts of climate change in the Arctic and partly because of the resource implications for the oil, gas, and other industries affected by these changes. Digital research data will serve as a foundation for new insights that impact both science and society. Data will be stored in the UC Santa Barbara KNB Data Repository (KNB, formerly the Knowledge Network for Biocomplexity), whose versioning and accessioning capabilities enable an effective archive. Because replication across administratively diverse institutions is critical to long-term preservation, data will be replicated at the KNB, the National Centers for Environmental Information (NCEI), and the Amazon cloud. DataONE researcher-facing tools will be adapted to provide convenient pathways for documenting and archiving diverse data formats as part of scientists' normal workflow. This infrastructure will be supported by an outstanding set of community services, including data discovery tools, metadata assessment and editing, data cleansing and integration, data management consulting, and user help-desk services. A data recovery team will engage the community to prioritize and rescue critical Arctic data from past NSF research that is currently inaccessible. In addition to the traditional functions of a data archive, modern cloud-based data facilities will support detailed provenance tracking of the science process, data usage and citation reporting, linkages among heterogeneous disciplines, and direct linkages between the literature, investigators, and funding programs. Usability and outreach specialists will engage an interdisciplinary Arctic Science Advisory Board and the broader polar science communities to drive continuous improvement by evaluating tools and services, gathering requirements and use cases, and prioritizing improvements to the archive's infrastructure and service offerings.
Usage data and usability studies will drive an iterative cycle of assessment and development to improve operations.