Report on Project Workshop (22 March 2021)

Introduction

On 22 March 2021, the Heritage Connector project hosted a two-hour online workshop. The objectives of the workshop were to:

introduce Heritage Connector affordances and possibilities to a model group of historians and curators,
introduce concerns and interests of the model group of historians and curators to the Heritage Connector team and digital humanities practitioners,
run exercises to create mutual learning.

Over fifty participants attended, including collections management professionals, curators and archivists from the Science Museum Group’s five museums, the V&A and other museums across the UK; Wikipedia professionals; academics from digital humanities, history and other disciplines; community-based historians and practitioners; and the project team.

Presentation recordings

The workshop included the following three presentations from the project team:

John Stack: Introduction to Heritage Connector Project

Download John Stack’s slides (PDF)

Kalyan Dutia and Jamie Unwin: Heritage Connector Technical Introduction

Download Kalyan Dutia and Jamie Unwin’s slides (PDF)

Dr Jane Winters: Heritage Connector and Digital Humanities

Activity 1: 100 Assistants

After a demonstration of the work-in-progress Heritage Connector software, workshop participants were invited to explore opportunities for applying the techniques outlined in their own fields of work and/or research. To shape the discussion and help frame the possibilities of using artificial intelligence (AI), knowledge graphs and entity extraction with cultural heritage collections, participants were asked to respond to the question “What large-scale repetitive tasks could be given to non-subject matter experts?” They were asked to imagine what tasks could be usefully undertaken by 100 such assistants.

Responses from the participants fell into a number of categories:

Link-building between resources e.g. between trade directories and collection catalogues, or cross-referencing between sources
Advanced discovery e.g. being able to search by alternative terms or identify references to the same people across multiple records
Surfacing related content e.g. related projects, blog posts, articles etc.
Collection management e.g. helping to clean up taxonomy terms
Drawing on existing structured data e.g. relationships in richly catalogued archival collections
Geolocation e.g. finding content related to a specific place (Note: Geolocation is the subject of one of the other Towards a National Collection Foundation Projects: Locating a National Collection, led by the British Library)
Visual search e.g. similar-looking objects, handwriting recognition, identifying copyright stamps, things depicted (Note: Visual search is the subject of one of the other Towards a National Collection Foundation Projects: Deep Discoveries, led by the National Archives)

Activity 2: Hopes and Fears

The workshop wrapped up with a second activity which invited participants to outline their hopes and fears for the technology and approaches outlined in the session. The responses are summarised below.

Hopes

Richer interlinking of collections data e.g. thematic connections, previously unknown connections, cross-collection links, etc.
Usable by audiences e.g. allowing researchers to help themselves, fitting into the ways that historians work
Improving public access e.g. new forms of access for broad audiences as well as researchers
Tools for collection managers e.g. allowing resources to be deployed more effectively, increasing understanding of the collection, cleaning and aligning terminology, etc.
Opportunity to address bias e.g. visualising absence and therefore being able to address areas of absence and bias
Potential to quantify uncertainty e.g. modelling the level of certainty in links and the knowledge implied by them

Fears

Usability is problematic e.g. the content might be difficult to access or the results opaque
Absence or unreliability of data e.g. collection datasets are incomplete, thin or patchy; and potential instability, inaccuracies or gaps in Wikidata and other sources
Inherent bias in results e.g. approach perpetuates or amplifies existing biases; AI needs to be understood as a tool and not a solution, and machine learning training data need to be understood
Work in preparing data e.g. need for human cleaning of data
Environmental sustainability e.g. creation and storage of data
Longevity e.g. reliance on research grants means the approach and data are not sustained
Scalability e.g. approach remains out of reach for small organisations, reinforcing the bias towards large, well-known collections

Acknowledgements

Thanks to Rhiannon Lewis for organising the event; Sameena Allie for technical setup and support; Jamie Unwin, Kalyan Dutia and Dr Jane Winters for presenting; Dr Tim Boon; and especially to the participants who attended the workshop.