About CAS

The “COMPUTATIONAL ARCHIVAL SCIENCE (CAS)” Portal
http://dcicblog.umd.edu/cas
Join our Google Group at: computational-archival-science@googlegroups.com
What is Computational Archival Science (CAS)? An initial working definition:

A transdisciplinary field concerned with the application of computational methods and resources to large-scale records/archives processing, analysis, storage, long-term preservation, and access, with the aim of improving efficiency, productivity and precision in support of appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival material.

Our initial CAS operational definition is:

Contributing to the development of the theoretical foundations of a new transdiscipline of computer and archival science.

Objectives:
1. Contribute to the development of the theoretical foundations of a new
trans-discipline of computer and archival science

2. Design the educational foundations and deliver training in this
emerging trans-discipline to support all industries and fields

3. Develop a virtual and physical laboratory to test and apply scientific
advances in a collaborative environment

Table of Contents:

1. CAS Workshops
2. CAS Presentations/Courses
3. CAS Publications
4. CAS Infrastructure
5. Education & Training


1. Fifteen International CAS Workshops:

2018:

  • Workshop #15: The National Archives (UK) and KCL Workshop: Computational Archival Science: Automating the Archive:
    Sep. 7 — Mark Hedges et al., The National Archives, Kew, UK.
    See: https://blog.nationalarchives.gov.uk/blog/computational-archival-science-automating-archive/
    and
    See: http://dcicblog.umd.edu/cas/9-7-2018_uk-tna-kcl_cas-workshop/

    Exploring how computational approaches can be used to support archival practice in the creation and preservation of reliable and authentic records and archives, but also taking into account users of archives, how access and interaction can be supported and enhanced.
    As secondary objectives:
    – identify and evaluate current trends, requirements, potential, and risks in the field, and examine the consequences and questions that may arise
    – determine possible research agendas and collaborations for the evolution of the field in the coming years
    – establish a community of practice for developing collaborative projects, and liaising with the wider international community in the field.

  • Workshop #14: Annual Society of American Archivists Conference (SAA2018), Washington D.C. — “Community Engagement Workshop: Integrating Archival Education with Technology and Research”:
    Aug. 15 — See flyer at: http://dcicblog.umd.edu/cas/wp-content/uploads/sites/13/2018/12/SAA_workshop_flyer18.pdf
    SAA-Sponsored workshop organized by Harvard Library (Daina Bouquin) and the DCIC (M. Kurtz, B. Underwood, K. Fenlon, and R. Marciano)

    To explore the possibility of collaboration among archival educators to share techniques, strategies, and tools to develop and enhance the skills of students in academic and professional education programs. The DCIC wishes to share some of its capabilities, learn from colleagues in the field, and foster a discussion on opportunities for collaboration in digital curation and computational treatments of archival collections in particular. We believe that meeting off-site at a digital curation lab, and allowing time to present and discuss, would be a beneficial way to add value to the interests of archival educators.

  • Workshop #13: Annual Association of Canadian Archivists (ACA2018), Edmonton, Canada — Plenary Session: “New Modalities of Archival Exploration”
    Jun. 7, 2018 — Chair: Luciana Duranti, UBC. Speakers:

    • Mark Hedges, Senior Lecturer, Digital Humanities, King’s College London
    • Richard Marciano, Professor in the College of Information, University of Maryland
    • Ian Milligan, Associate Professor of History, University of Waterloo


    The last several years have seen archival fonds emerge as “big” data corpora. This process of “data-fication” of the archives has given rise to new modalities of archival exploration using data science techniques, such as data mining, feature extraction, machine-learning based clustering and classification, and network analytics. The speakers of this panel discuss their experience with new methods of archival exploration and some of the associated theoretical, methodological, and practical challenges for archivists and researchers alike.

  • Workshop #12:
    Workshop on Cyberinfrastructure and Machine Learning for Digital Libraries and Archives, in Conjunction with the Joint Conference on Digital Libraries (JCDL2018)

    Jun. 3, 2018, in Fort Worth, TX. See: https://www.tacc.utexas.edu/conference/jcdl18

  • Workshop #11: CAS Planning Meeting, UMD:
    May 9, 2018 — Planning and development meeting on Computational Archival Curricula Development with participation from the iSchool, University of British Columbia, King’s College London, and Georgia Tech: CAS curriculum, Research, Conferences, Grants.
  • Workshop #10: Kyushu University, Japan, Invitational workshop on CAS, Jan. 12 – Jan. 16, Maria Esteva, Richard Marciano. See: http://dcic.umd.edu/symposium-computational-archival-science-kyushu-university-japan-jan-12-16-2018/

    • “Articulating Computational Archival Science (CAS): Background, Current State, and Professional and Educational Implications” (Marciano, Esteva)
    • “The Scope of Computational Archival Science (CAS): Methods, Resources and Interdisciplinary Approaches” (Marciano, Esteva)
    • “World War II Japanese-American Internment Camp Project” (Marciano)
    • “Anatomy of Big Archives Visualization” (Esteva).

2017:

  • Workshop #9: Harvard Library Computational Archival Science Unconference 2017, Dec. 14, 2017 — LINK: https://projects.iq.harvard.edu/hlcas2017
    • “Diving into Computational Archival Science”, May 8, 2018, Jane Kelly, SAA Electronic Records Section Blog

  • Workshop #7: Lifecycle Management and Digital Preservation Using Blockchain Technology, DLM Forum/ARMA Triennial Conference, Brighton, UK, September 13-15, 2017 (organizer and presenter: Vicki Lemieux).
  • Workshop #6: Privacy, Security, Trust and Blockchain Technology, IEEE 26th International Conference on Computer Communications and Networks (ICCCN), Vancouver, BC, July 31-August 3, 2017 (organizer and presenter: Vicki Lemieux).
  • Workshop #5: CAS Planning Meeting, UMD — May 9-10, 2017
    Planning and development meeting with participation from the iSchool, University of British Columbia, Texas Supercomputing Center, and Georgia Tech: CAS curriculum, Research, Conferences, Grants.

2016:

  • Workshop #3: IEEE Big Data 2016 “1st CAS Workshop”, Washington D.C. — Dec. 6, 2016
    Keynote talk and 10 presentations (Belgium, Germany, UK, Canada, USA), Panel, Breakout and Poster sessions. Participants from universities, government agencies, and companies.
    LINK: http://dcicblog.umd.edu/cas/ieee_big_data_2016_cas-workshop/

  • Workshop #1: Finding New Knowledge: Archival Records in the Age of Big Data, UMD — Apr. 26-28, 2016
    LINK: http://dcicblog.umd.edu/cas/dcickcl-invited-cas-symposium-apr-2016/

    • A KCL / UMD symposium to explore and define the possibilities of CAS, with 52 participants including:
      • federal representatives (White House, NSF, NEH, IMLS, NIH, NARA)
      • researchers (iSchool, CS, Journalism, Libraries, Humanities)
      • students (doctoral, master’s, and high-school)
      • cultural institutions (Smithsonian, National Gallery, US Holocaust Memorial Museum)
      • consortia
    • Objectives and Scope:
      • Address the challenges of big data for digital curation, with a focus on archival records, cultural materials, and humanities research.
      • Explore the conjunction of emerging digital methods and technologies around big data and their consequences for generating new forms of analysis and historical research engagement with archival material.
      • Identify and evaluate current trends, requirements, and potential in the field, to examine their consequences and the new questions that the field can provoke.
      • Determine possible research agendas for the evolution of the field in the coming years.
      • Establish a community of practice going forward to develop research agendas and collaborative projects.



2. CAS Presentations:

2018:

  • Future Technologies Conference, Vancouver, Canada, November 29-30, 2018, “Blockchain and Distributed Ledger as Trusted Recordkeeping Systems: An Archival Theoretic Evaluation Framework” (presenter: Victoria Lemieux).
  • Plenary Session at ACA2018: Association of Canadian Archivists Annual Conference on Truths, Trust, and Technology, Edmonton, AB, Canada, June 7, 2018
    • “New Modalities of Archival Exploration”
      Speakers: Mark Hedges, KCL; Richard Marciano, UMD; and Ian Milligan, U. Waterloo
      Description: The last several years have seen archival fonds emerge as “big” data corpora. This process of “data-fication” of the archives has given rise to new modalities of archival exploration using data science techniques, such as data mining, feature extraction, machine-learning based clustering and classification, and network analytics. The speakers of this panel discuss their experience with new methods of archival exploration and some of the associated theoretical, methodological, and practical challenges for archivists and researchers alike.
  • iConference 2018: Sheffield, UK. Paper accepted [Myeong Lee et al.], “Toward Identifying Values and Tensions in Designing a Historically-Sensitive Data Platform: A Case-Study on Urban Renewal”

2017:

  • Executive Women’s Forum on Information Security, Risk Management & Privacy, Scottsdale, Arizona, October 24-26 (presenter: Vicki Lemieux).
    • Blockchain Security: An Overview
  • Mid-Atlantic Regional Archives Conference (MARAC), Buffalo, October 26-27
    • IRP2 panel (Michael Kurtz et al)
  • Launch of the Maryland State Archives & UMD iSchool’s “Legacy of Slavery Program Collaboration” Buffalo, October 9, 2017
    • Michael Kurtz et al, at the MSA in Annapolis

  • Summer 2017 Blockathon University of British Columbia: August 3, 2017
    • UBC’s “Blockathon” for Social Good Research, Vicki Lemieux. Open to local and global community members, focused on applying decentralized protocols to improve real-world research processes. See: http://blockchainubc.ca/2017/05/30/blockathon/
  • ISGC2017 (International Symposium on Grids and Clouds 2017), Academia Sinica, Taipei, Taiwan: March 8, 2017
    • “The Emergence of Computational Archival Science (CAS)”, Closing keynote
  • IMLS “Always Already Computational” project, UC Santa Barbara: Mar. 2, 2017
    • A three-day workshop on using library collections as data
    • “On the Computational Turn in Libraries and Archives”, Invited talk and position paper

  • Maryland State Archives (MSA): Jun. 15, 2017
    • Joint planning on developing computational treatments for the Legacy of Slavery archives

2016:

  • CNI Fall 2016 (Coalition for Networked Information), Washington, D.C.: Dec. 13, 2016
    • “DRAS-TIC Measures: Digital Repository at Scale that Invites Computation (To Improve Collections)”, Talk
  • MARAC 2016 (Mid-Atlantic Regional Archives Conference), Annapolis, MD: Nov. 4, 2016
    • “Practical Digital Curation Skills for Archivists in the 21st Century”, Invited talk
  • Digital Preservation 2016, Milwaukee, WI: Nov. 10, 2016
    • “Designing Scalable Cyberinfrastructure for Metadata Extraction in Billion-Object Archives”, Talk & paper
  • iPres 2016 (13th International Conference on Digital Preservation), Bern, SWI: Oct. 4, 2016
    • “Designing Scalable Cyberinfrastructure for Metadata Extraction in Billion-Record Archives”, Talk & paper
  • SAA 2016 (Society of American Archivists annual conference) – Top voted-in pop-up session, Atlanta, GA: Aug. 5, 2016
    • “Archival Records in the Age of Big Data”, SAA Member selected session
  • NAGARA 2016 (National Association of Government Archives & Records Administrators annual conf.), Lansing, MI: Jul. 15, 2016
    • “New Developments in Electronic Records”, Invited session
  • LOC Saving the Web 2016 (Library of Congress), Washington D.C.: Jun. 16, 2016
    • “Preserving the Web in the Age of Big Data”, Invited talk
  • Archiving 2016, Washington D.C.: Apr. 21, 2016
    • “Revealing Hidden Archival Patterns”, Talk & paper


3. CAS Publications:

2018:

  • “The Enhanced ‘International Research Portal for Records Related to Nazi-Era Cultural Property’ Project (IRP2): A Continuing Case Study,” by Michael Kurtz, Greg Jansen, and Richard Marciano.
  • “Mapping Inequality: ‘Big Data’ Meets Social History in the Story of Redlining,” in The Routledge Companion to Spatial History, Richard Marciano et al. Eds: Ian Gregory, Don Debats, Don Lafreniere.

2017:

  • Chapter in Future Archives, Victoria Lemieux
  • Towards Automated Quality Curation of Video Collections from a Realistic Perspective, University of Texas at Austin, Todd Goodall, Maria Esteva, Sandra Sweat, Alan C. Bovik
  • “Evaluating the Use of Blockchain in Land Transactions: An Archival Science Perspective,” in European Property Law Journal, Victoria Lemieux (forthcoming)
  • Association of Canadian Archivists 2017 Annual Meeting: Jun. 9, 2017
    • “Disrupting Archival Education [making a case for Computational Science]”, Victoria Lemieux, Ottawa CA.
  • National Forum Position Paper: Mar. 3, 2017
    • “On the Computational Turn in Archives & Libraries and the Notions of Levels of Computational Services”, Invited talk and position paper
    • Link: http://dcicblog.umd.edu/cas/wp-content/uploads/sites/13/2016/05/AlwaysAlreadyComputationalPositionStatement.pdf
      At the UC Santa Barbara IMLS workshop on “Always Already Computational: Library Collections as Data”, the goals of which are to: (1) articulate computationally amenable library collection use cases, (2) initiate a collection of best practices that support developing, describing, and providing access to computationally amenable library collections.


4. CAS Cyberinfrastructure:

  • Launch of the DRAS-TIC software initiative
    • DCIC/UMD negotiates re-assignment and ownership of the Alloy / Indigo industry software from Archive Analytics Solutions Ltd. (AAS) on Sep. 30, 2016 (with support from UMD Offices of IT Procurement, General Counsel, and Technology Commercialization). AAS transfers ownership to UMD (after roughly $2M in investments in Indigo).
    • DCIC launches the DRAS-TIC Open Source software initiative on Oct. 4, 2016. See: http://dcic.umd.edu/10032016-introducing-open-source-platform-dras-tic/. DRAS-TIC is digital repository software to manage content at scale.
    • CNI Video: https://vimeo.com/206243022

    DRASTIC

    Digital Repository At Scale · That Invites Computation [To Improve Collections]
    ————————————-
    DRAS·TIC /ˈdrastik/
    adjective:
    likely to have a strong or far-reaching effect; radical and extreme.
    synonyms:
    extreme, serious, radical, far-reaching, impactful, momentous, substantial.
    ————————————-
  • DRAS-TIC Demos: technology demos to Federal and State Agencies, & International audiences
    • DH2017 (Digital Humanities), Montreal Canada: Aug. 7, 2017
      • “Shaping Humanities Data: Use, Reuse, and Paths Toward Computationally Amenable Cultural Heritage Collections”, Invited demo: “DRAS-TIC & Reusable Computational Processing of Large-scale Digital Humanities Collections”
    • Smithsonian Institution, UMD: Jun. 26, 2017
      • Joint planning on using CAS with big collections
    • Maryland State Archives (MSA): Jun. 15, 2017
      • Joint planning on developing computational treatments for the Legacy of Slavery archives
    • National Archives & Records Administration (NARA): 2016 & 2017
      • DRAS-TIC Demos:
        • IT Engineering Team: May 2016 & Apr. 19, 2017
        • Office of Innovation: Jul. 20, 2017


5. Education & Training

Fall 2017: As part of a graduate seminar on Computational Archival Science (CAS)