On 22 March 2021, the Heritage Connector project hosted a two-hour online workshop. The objectives of the workshop were to:
- introduce Heritage Connector affordances and possibilities to a model group of historians and curators,
- introduce concerns and interests of the model group of historians and curators to the Heritage Connector team and digital humanities practitioners,
- run exercises to create mutual learning.
Over fifty participants attended, including collections management professionals, curators and archivists from the Science Museum Group’s five museums, the V&A and other museums across the UK; Wikipedia professionals; academics from digital humanities, history and other disciplines; community-based historians and practitioners; and the project team.
The workshop included the following three presentations from the project team:
- John Stack: Introduction to Heritage Connector Project
- Kalyan Dutia and Jamie Unwin: Heritage Connector Technical Introduction
- Dr Jane Winters: Heritage Connector and Digital Humanities
Activity 1: 100 Assistants
After a demonstration of the work-in-progress Heritage Connector software, workshop participants were invited to explore how the techniques outlined might apply to their fields of work and/or research. To shape the discussion and help frame the possibilities of using artificial intelligence (AI), knowledge graphs and entity extraction with cultural heritage collections, participants were asked to respond to the question “What large-scale repetitive tasks could be given to non-subject matter experts?” They were asked to imagine what tasks could be usefully undertaken by 100 such assistants.
Responses from the participants fell into a number of categories:
- Link-building between resources e.g. between trade directories and collection catalogues, or cross-referencing between sources
- Advanced discovery e.g. being able to search by alternative terms or identify references to the same people across multiple records
- Surfacing related content e.g. related projects, blog posts, articles etc.
- Collection management e.g. helping to clean up taxonomy terms
- Drawing on existing structured data e.g. relationships in richly catalogued archival collections
- Geolocation e.g. finding content related to a specific place (Note: Geolocation is the subject of one of the other Towards a National Collection Foundation Projects: Locating a National Collection led by the British Library)
- Visual search e.g. similar-looking objects, handwriting recognition, identifying copyright stamps, things depicted (Note: Visual search is the subject of one of the other Towards a National Collection Foundation Projects: Deep Discoveries led by the National Archives)
Activity 2: Hopes and Fears
The workshop wrapped up with a second activity which invited participants to outline their hopes and fears for the technology and approaches outlined in the session. The responses are summarised below.
Hopes
- Richer interlinking of collections data e.g. thematic connections, previously unknown connections, cross-collection links, etc.
- Usable by its audience e.g. allowing researchers to help themselves, fitting into the ways that historians work
- Improving public access e.g. new forms of access for broad audiences as well as researchers
- Tools for collection managers e.g. allowing resources to be deployed more effectively, increasing understanding of the collection, cleaning and aligning terminology, etc.
- Opportunity to address bias e.g. visualising absence and therefore being able to address areas of absence and bias
- Potential to quantify uncertainty e.g. modelling level of certainty in links and knowledge implied by them
Fears
- Usability is problematic e.g. accessing the content might be difficult or the results opaque
- Absence or unreliability of data e.g. collection datasets are incomplete, thin or patchy; and potential instability, inaccuracies or gaps in Wikidata and other sources
- Inherent bias in results e.g. approach perpetuates or amplifies existing biases; AI needs to be understood as a tool and not a solution, and machine learning training data need to be understood
- Work in preparing data e.g. need for human cleaning of data
- Environmental sustainability e.g. creation and storage of data
- Longevity e.g. reliance on research grants means the approach and data are not sustained
- Scalability e.g. approach remains out of reach for small organisations, reinforcing the bias towards large, well-known collections
Thanks to Rhiannon Lewis for organising the event; Sameena Allie for technical setup and support; Jamie Unwin, Kalyan Dutia and Dr Jane Winters for presenting; Dr Tim Boon; and especially to the participants who attended the workshop.