Computational Finding Aids (original 10/17/17)
Selected Blog by Claire McDonald
I had planned to continue on the theme of oral histories and how the idea of “computational finding aids” might apply. As I delved more deeply into the topic, however, I struggled with the concept of “computational finding aids,” so this week’s blog is more a record of what I have been learning than an argument. Here are some of the questions I have been asking myself: When are finding aids no longer finding aids? When are they no longer relevant? How are “computational finding aids” different from already established data structures, data dictionaries, or interfaces that help users understand and access data?
Marciano et al. (2017) cite the SAA definition of a finding aid as “a tool that facilitates discovery of information within a collection of records” (p. 9). But SAA seems wedded to the idea of a finding aid as a document that includes specific contextual content. The Notes section of the SAA glossary states, “Finding aid is a single document that places the materials in context by consolidating information about the collection, such as acquisition and processing; provenance, including administrative history or biographical note; scope of the collection, including size, subjects, media; organization and arrangement; and an inventory of the series and the folders” (https://www2.archivists.org/glossary/terms/f/finding-aid).
When you’re dealing with big data, say a million tweets, is there really a way to put the corpus in context, describe the scope of the collection, or understand the provenance? As Maemura, Becker, and Milligan (2016) found when exploring web archives, “The scale . . . confounds the traditional appraisal process” (p. 1). The authors further state, “Working with web archival material presents researchers with opportunities for developing new approaches and methods of analysis, often because existing methods do not translate to them. For example, exploring web archives does not require navigating a finding aid, and collections can be sorted and filtered in multiple ways, resulting in multiple arrangements” (p. 2).
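To make the idea of “multiple arrangements” concrete for myself, here is a minimal Python sketch. It is not taken from Maemura, Becker, and Milligan; the records, field names, and the `arrange` helper are all hypothetical, invented only to show how one flat collection can be regrouped on demand rather than fixed into a single hierarchy.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: every field and value here is invented for illustration.
records = [
    {"id": 1, "author": "alice", "date": date(2016, 3, 1), "hashtag": "archives"},
    {"id": 2, "author": "bob", "date": date(2016, 3, 1), "hashtag": "bigdata"},
    {"id": 3, "author": "alice", "date": date(2016, 4, 9), "hashtag": "bigdata"},
]


def arrange(items, key):
    """Group record ids under each distinct value of `key` (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item["id"])
    return dict(groups)


# The same collection yields a different "arrangement" for each field chosen.
by_author = arrange(records, "author")
by_hashtag = arrange(records, "hashtag")
by_date = arrange(records, "date")

# Filtering first and arranging second gives yet another view.
march_only = [r for r in records if r["date"].month == 3]
march_by_hashtag = arrange(march_only, "hashtag")
```

None of these groupings is the “original order” of the collection. Each exists only for as long as a researcher asks for it, which is part of why I wonder whether the finding aid label fits.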
If computational archival science (CAS) is to be a new trans-discipline that requires archivists to acquire new skills and think differently about archival data, is it necessary to continue using the language and applying the principles of the paper paradigm of archives? In a broad sense, the tools researchers use to understand and visualize data structures could be called finding aids, but I think using this term potentially creates more confusion and perhaps implies a level of trust or validity in data that cannot be fully assessed. Furthermore, with unstructured data, such as tweets, is it even possible to “browse the hierarchical archival tree,” as Marciano et al. suggest (p. 9)? From reading the Maemura, Becker, and Milligan paper, it seems that exploring the non-hierarchical relationships would provide richer insights and support new and different ways of interacting with and understanding archival data. My last blog, which did address oral histories, identified some computational methods (natural language processing and social network analysis) to do just this.
Sorry to post a question-filled blog, and perhaps I’m just arguing semantics, but I am trying to understand how “computational finding aids” differ from tools that already exist for exploring large data sets.
Higgins, S., Hilton, C., and Dafis, L. “Archives context and discovery: rethinking arrangement and description for the digital age.” In ICA Second Annual Conference, Girona, 2015. [This paper offers some interesting thoughts on archival description and hierarchical structure in the digital realm.]
Maemura, E., Becker, C., and Milligan, I. “Understanding computational web archives research methods using research objects.” In 2016 IEEE International Conference on Big Data (Big Data), pp. 3250-3259. IEEE, 2016.
Marciano, R., Lemieux, V., Hedges, M., Esteva, M., Underwood, W., Kurtz, M., and Conrad, M. (In press). Archival Records and Training in the Age of Big Data. Advances in Librarianship – Re-Envisioning the MLIS: Perspectives on the Future of Library and Information Science Education, eds. Sarin, L.C., Percell, J., Jaeger, P.T., & Bertot, J.C.