IEEE Big Data 2017: 2nd CAS workshop

Workshop Title: Computational Archival Science: digital records in the age of big data

PART OF: IEEE Big Data 2017
*** There is a 1-day registration option ***

Important dates:

*** Oct 24, 2017: Due date for full workshop paper submissions (extended by “popular demand”) ***
Nov 1, 2017: Notification of paper acceptance to authors
Nov 15, 2017: Camera-ready of accepted papers
Dec 11 – 14, 2017: IEEE Big Data 2017 conference; the workshop takes place on Wednesday, Dec 13

Paper Submission Instructions
Full papers of up to 10 pages should be submitted via the conference workshop online submission system. We also encourage submission of short papers (up to 6 pages) reporting work in progress. The submission deadline, originally 10 October 2017, has been extended to 24 October 2017. All accepted papers will be included in the proceedings published by the IEEE Computer Society Press.

Introduction to workshop:
The large-scale digitization of analog archives, the emerging diverse forms of born-digital archive, and the new ways in which researchers across disciplines (as well as the public) wish to engage with archival material are disrupting traditional archival theories and practices. Increasing quantities of ‘big archival data’ present challenges for the practitioners and researchers who work with archival material, but also offer enhanced possibilities for scholarship through the application of computational methods and tools to the archival problem space, and, more fundamentally, through the integration of ‘computational thinking’ with ‘archival thinking’.

Our working definition of Computational Archival Science (CAS) is:

Contributing to the development of the theoretical foundations of a new trans-discipline of computer and archival science

This workshop will explore the conjunction (and its consequences) of emerging methods and technologies around big data with archival practice and new forms of analysis and historical, social, scientific, and cultural research engagement with archives. We aim to identify and evaluate current trends, requirements, and potential in these areas, to examine the new questions that they can provoke, and to help determine possible research agendas for the evolution of computational archival science in the coming years. At the same time, we will address the questions and concerns scholarship is raising about the interpretation of ‘big data’ and the uses to which it is put, in particular appraising the challenges of producing quality – meaning, knowledge and value – from quantity, tracing data and analytic provenance across complex ‘big data’ platforms and knowledge production ecosystems, and addressing data privacy issues.

This is the 2nd workshop at IEEE Big Data addressing Computational Archival Science, following the 1st CAS workshop. It builds on three earlier workshops on ‘Big Humanities Data’ organized by the same chairs at the 2013-2015 conferences, and more directly on a symposium held in April 2016 at the University of Maryland.

Research topics covered:
Topics covered by the workshop include, but are not restricted to, the following:

  • Application of analytics to archival material, including text-mining, data-mining, sentiment analysis, network analysis
  • Analytics in support of archival processing, including e-discovery, identification of personal information, appraisal, arrangement and description
  • Scalable services for archives, including identification, preservation, metadata generation, integrity checking, normalization, reconciliation, linked data, entity extraction, anonymization and reduction
  • New forms of archives, including Web, social media, audiovisual archives, and blockchain
  • Cyber-infrastructures for archive-based research and for development and hosting of collections
  • Big data and archival theory and practice
  • Digital curation and preservation
  • Crowd-sourcing and archives
  • Big data and the construction of memory and identity
  • Specific big data technologies (e.g. NoSQL databases) and their applications
  • Corpora and reference collections of big archival data
  • Linked data and archives
  • Big data and provenance
  • Constructing big data research objects from archives
  • Legal and ethical issues in big data archives

Program Chairs:
Dr. Mark Hedges
Department of Digital Humanities (DDH)
King’s College London, UK

Prof. Richard Marciano
Digital Curation Innovation Center (DCIC)
College of Information Studies
University of Maryland, USA

Prof. Victoria Lemieux
School of Library, Archival and Information Studies
University of British Columbia, Canada

Program Committee Members:
The program chairs will serve on the Program Committee, as will the following:

Dr. Tobias Blanke
Department of Digital Humanities
King’s College London, UK

Dr. Maria Esteva
Data Intensive Computing
Texas Advanced Computing Center (TACC), USA

Dr. Bill Underwood
Digital Curation Innovation Center (DCIC)
College of Information Studies
University of Maryland, USA

Prof. Michael Kurtz
Digital Curation Innovation Center (DCIC)
College of Information Studies
University of Maryland, USA

Mark Conrad
Digital Curation Innovation Center (DCIC)
National Archives and Records Administration (NARA)
University of Maryland, USA

Additional program committee members will be added as required.

Invited Keynote Speakers:
We plan to invite keynote speakers and have a number of options. We also plan to have a closing panel session with invited speakers, to highlight emerging trends and issues and identify next steps.