Workshop summary – Two birds, one stone: Bridging cultural heritage collections with crowds and niches

On Monday the 31st of October, the workshop “Two birds, one stone: Bridging cultural heritage collections with crowds and niches” was held at the Netherlands Institute for Sound and Vision. The workshop consisted of two parts: presentations and a practical session. In the first part, speakers presented their institutions’ experiences with crowdsourcing; in the practical session, participants tested the systems that had been presented.

The first presentation, “What’s That? Video Tagging Games for Audiovisual Heritage Collections”, was given by Maarten Brinkerink from the Netherlands Institute for Sound and Vision. Maarten stressed the importance of enriching the institute’s holdings, one of Europe’s vast audiovisual collections, not only with professional annotations but also with contributions from the crowd. One such initiative is the crowdsourcing game “Waisda?”, in which users annotate videos online with the goal of reaching consensus between players. Slides and talk.

Next, Sander Pieterse from Naturalis and the Xeno-canto Foundation for Nature Sounds presented “Every Feather and Song: Crowdsourcing and Co-curation from a Natural History Perspective”. He emphasized crowdsourcing as a significant tool for “building a big collection together” and for enriching existing natural history collections. He also showed how crowdsourcing can connect amateurs and professionals and build a sense of connectedness within the communities it addresses. Slides and talk.

Saskia Scheltjens from the Rijksmuseum Amsterdam presented “Accurator: Consolidation and Integration of Annotations”, covering the Accurator system, a project carried out in collaboration with VU University Amsterdam. Accurator is used for annotating artworks in the Rijksmuseum. Despite its usefulness, the results have yet to be integrated into the museum’s collections because of the restructuring currently taking place within the Rijksmuseum. Slides and talk.

The last presentation, “DigiBird: on the fly collection integration using crowdsourcing”, was given by Chris Dijkshoorn from VU University Amsterdam. He presented the results of DigiBird, a project that reinforces crowdsourcing initiatives and integrates four distinct nature-related collections. He described how crowdsourcing is becoming a valuable approach to collecting data, but still faces challenges regarding sustainability and the use of results. Slides and talk.

The presentations were followed by a practical session in which the participants tried out the crowdsourcing systems presented, Accurator and Waisda?, together with the DigiBird platform. Within the scope of the DigiBird project, an instance of the Waisda? game was created with a selection of videos that contain birds, and a selection of artworks from the bird domain was made available in the Accurator system. On the DigiBird platform, the participants could see the on-the-fly integration of results from the crowdsourcing systems presented, results from platforms like Xeno-canto and Naturalis, general statistics of the integrated platforms, and real-time updates of annotations for artworks and videos containing birds.

Two birds, one stone: Bridging cultural heritage collections with crowds and niches

Crowdsourcing is a valuable source of data and metadata in cultural heritage. In this workshop speakers from Naturalis, the Netherlands Institute for Sound and Vision and the Rijksmuseum will provide their insights on crowdsourcing by discussing the following points:

  • Relevance: Why is crowdsourcing relevant to my institution?
  • Tooling: Which tools do we use?
  • Lessons learned: What are the lessons we learned?

Sustainability is one of the problems encountered by many crowdsourcing projects. In the fourth talk we discuss how we approached this problem in the DigiBird project, which integrates multiple initiatives and shows how results can be leveraged for collection integration.

Date & venue

14:00-17:00 Monday 31st of October, at Sound and Vision, Media Parkboulevard 1, Hilversum

Program

Session 1

14:00-14:20 What’s That? Video Tagging Games for Audiovisual Heritage Collections
Maarten Brinkerink, Netherlands Institute for Sound and Vision
14:20-14:40 Every Feather and Song: Crowdsourcing and Co-curation from a Natural History Perspective
Sander Pieterse, Naturalis
14:40-15:00 Break

Session 2

15:00-15:20 Accurator: Consolidation and Integration of Annotations
Saskia Scheltjens, Rijksmuseum Amsterdam
15:20-15:40 DigiBird: on the Fly Collection Integration using Crowdsourcing
Chris Dijkshoorn, VU University Amsterdam

Session 3

15:40-17:00 Practical session: try out the crowdsourcing systems

Registration

Participation in the workshop is free of charge. Places are limited, which is why we invite you to register.
Register here

For questions regarding the workshop, you can send a mail to cristina.bucur@student.vu.nl. More information about the DigiBird project can be found at http://www.digibird.org/.

This workshop is supported by the Dutch national program COMMIT/.

Querylog-based Assessment of Retrievability Bias in Delpher

On March 17, we were invited by the National Library of the Netherlands to present the results of our study on retrievability bias in the Dutch historic newspaper archive.
The research was conducted in collaboration with the WebART project and will be presented at the Joint Conference on Digital Libraries (JCDL) 2016 in Newark, USA, in June 2016.

Summary of the talk:

Search engines are not “objective” pieces of technology, and bias in Delpher’s search engine may or may not harm user access to certain types of documents in the collection. In the worst case, systematic favoritism for a certain type can render other parts of the collection invisible to users. This potential bias can be evaluated by measuring the “retrievability” of all documents in a collection. We explain the ideas underlying the retrievability metric, and how we measured it on the KB Newspaper collection. We describe and quantify the retrievability bias imposed on the newspaper collection by three different commonly used Information Retrieval models. For this, we investigated how document features such as length, type, or publication date influence the retrievability.
We also investigate the effectiveness of the retrievability measure, featuring two characteristics that set our experiments apart from previous studies: (1) the newspaper collection contains noise originating from OCR processing, and historical spelling and use of language; and (2) rather than the simulated queries used in other studies, we use real user query logs including click data. We show how simulated queries differ from real user queries regarding term frequency and prevalence of named entities, and how this affects the results of a retrieval task.
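
To make the idea concrete, here is a minimal sketch of one common way to compute retrievability. It assumes the cutoff-based variant, in which a document scores one point for every query that returns it within the top ranks, and it summarizes the resulting inequality with a Gini coefficient. Whether this matches the exact setup of the study is an assumption, and the queries, document ids, cutoff and function names are hypothetical toy values.

```python
from collections import Counter


def retrievability(ranked_results, cutoff):
    """Count, per document, how many queries return it within the top `cutoff` ranks."""
    scores = Counter()
    for ranking in ranked_results.values():
        for doc_id in ranking[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of the scores: 0.0 means every document is equally retrievable."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical toy data: three queries and the ranked document ids each one returns.
collection = ["d1", "d2", "d3", "d4"]
ranked_results = {
    "amsterdam": ["d1", "d2", "d3"],
    "rotterdam": ["d1", "d3", "d4"],
    "utrecht": ["d1", "d2"],
}

scores = retrievability(ranked_results, cutoff=2)
# Documents that are never retrieved within the cutoff get a score of 0.
r_values = [scores.get(doc_id, 0) for doc_id in collection]
bias = gini(r_values)
print(r_values, round(bias, 3))
```

In this toy run, d1 is returned within the cutoff for every query while d4 never is, which is the kind of imbalance the Gini coefficient condenses into a single number.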

Stitch by Stitch: Annotating Fashion at the Rijksmuseum

Rijksmuseum – Modemuze – COMMIT/ SealincMedia – Wikimedia Nederland

Saturday 23rd April 2016 – Cuypers Library, the Rijksmuseum

Fashion heritage, collected over centuries, can be found everywhere in museums: costumes, accessories, paintings, prints and photographs. But while some clothes and accessories are easily found and identified, others are obscure and require a trained eye to describe. What are we looking at? What kind of sleeve is this? Which materials and techniques have been used? More specific descriptions of the images facilitate better use of digital collections and enable users to wander through them in detail.
The Rijksmuseum and Modemuze are looking for specialists and enthusiasts with a passion for fashion and costume to join an expedition through their digital collections.

Modemuze is an online platform and network of 11 Dutch museums, including the Rijksmuseum, with fashion and costume collections: Amsterdam Museum, Centraal Museum Utrecht, Fries Museum Leeuwarden, Gemeentemuseum Den Haag, Museum Rotterdam, Paleis Het Loo, Rijksmuseum, Tassenmuseum Hendrikje, TextielMuseum, Theatercollectie Bijzondere Collecties UvA, Tropenmuseum, Afrika Museum, Museum Volkenkunde.

Annotating the collections

Researchers from VU University Amsterdam, Delft University of Technology, the Centre for Mathematics and Informatics, and the Rijksmuseum (in the context of the COMMIT/ SEALINCMedia project) have developed Accurator: an online tool that improves the process of annotating digital collection objects, for example by helping users find relevant objects to annotate and annotate specific parts of an object. Following ‘Birdwatching in the Rijksmuseum’, this time the Accurator tool will be used to describe fashion-related objects from the Modemuze and Rijksmuseum collections.

Participants in the fashion annotation event are also invited to record their findings in Wikipedia, Wikimedia Commons and Wikidata, Wikipedia’s open database. Wikipedia volunteers as well as staff from the Rijksmuseum and Modemuze will be present for support throughout the day.

Program

9:30 – 10:00 Registration and Coffee
10:00 – 10:10 Introduction of the Accurator tool
10:10 – 12:00 Annotating fashion in the digital collections (using Accurator)
12:00 – 12:30 Discussion on the use and future of fashion annotation

Participation in the event is free, but registration is required. To register, please send an email to: accurator@rijksmuseum.nl with your name and your interest in fashion. (We will take your subject preferences into account when setting up the Accurator tool.) If you have any questions regarding the event, please feel free to email them to this address.

IMPORTANT: The event will take place in the Cuypers Library of the Rijksmuseum. The following guidelines need to be taken into account:

  • On the 23rd of April you can report at the Rijksmuseum desk in the Atrium. Please bring your confirmation of registration. Without this, entry can be denied.
  • Bring your own laptop. There are strict safety guidelines in the Library, which limit the use of laptop power supplies. Please make sure the battery is fully charged so that it lasts for 3 hours without further charging.

www.accurator.nl (general info)
annotate.accurator.nl (annotation tool)
www.modemuze.nl
www.rijksmuseum.nl
wm.cs.vu.nl (Web & Media group at VU)

DigiBird kickoff meeting

On the 5th of February 2016 the kickoff meeting for the COMMIT/ valorisation project DigiBird took place. The meeting was hosted by the Netherlands Institute for Sound and Vision (Nederlands Instituut voor Beeld en Geluid). During the meeting, the people who will work on the project were introduced, together with the partners involved.

The DigiBird project builds on the results of the SEALINCMedia project, aiming to use crowdsourcing results to integrate three different media types: images, sounds and videos – all related to birds. The various datasets that belong to these different media types are provided by the partners involved in the project. Most of these platforms already use crowdsourcing as a means of annotating the bird media, but there is no single point of access for all of them and no means of crossover access. Thus, the goal of DigiBird is to achieve this integration by creating cross-links between collections and designing user-friendly interfaces. These will not only help to enable access to the various bird collections, but will also motivate people to contribute more knowledge by means of annotations.

The people who will work on developing this project are Chris Dijkshoorn, a PhD student, and Cristina-Iulia Bucur, a student assistant, both affiliated with VU University Amsterdam.

The partners involved in DigiBird are:

During the meeting, a hands-on breakout session took place. In this session, participants from the various partners built scenarios to sketch their own view of how the interfaces could look and how user interaction could be handled.

Impact Analysis of OCR Quality on Research Tasks in Digital Archives

We presented our paper on “Impact Analysis of OCR Quality on Research Tasks in Digital Archives” at this year’s International Conference on Theory and Practice of Digital Libraries (TPDL2015).

We describe how humanities scholars currently use digital archives and the challenges they face in adapting their research methods compared to using a physical archive. The required shift in research methods comes at the cost of working with digitally processed historical documents. A major concern for the scholars is therefore the question of how much trust they can place in analyses based on noisy representations of source texts.

Based on interviews with humanities scholars and a literature study, we classify scholarly research tasks according to their susceptibility to errors originating from OCR-induced biases. Search results for “Amsterdam”, for example, are likely to be influenced by the confusion of the letters “s” and “f”, especially for material that was created before 1800, when the “long s” was still used.
We investigated what kind of data would be required to reduce the impact of such errors, and whether or not it is available in the archive.
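
As an illustration of the “s”/“f” confusion described above, the sketch below generates the spellings a search term could take if any “s” had been read as “f” by the OCR software. It is a hypothetical example rather than the method used in the paper: the function name is invented, and the code naively treats every “s” as a possible confusion without considering where the “long s” was actually printed.

```python
from itertools import product


def long_s_variants(term):
    """Return all spellings of `term` in which each 's' is either kept or replaced by 'f'."""
    # For every character, list the forms it may take in the OCR output.
    options = [("s", "f") if ch == "s" else (ch,) for ch in term]
    return ["".join(chars) for chars in product(*options)]


variants = long_s_variants("amsterdam")
print(variants)
```

Searching for all variants instead of only the original spelling is one simple way to see how many additional documents the confusion may be hiding. The number of variants doubles with every “s” in the term, so this only stays practical for short query terms.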

We describe our study of example research tasks performed on the digital newspaper archive of the National Library of the Netherlands. In this study, we tried to reduce the uncertainty of the results as much as possible using the data publicly available in the archive.

We conclude that the current state of knowledge, on the scholars’ side as well as on the tool makers’ and data providers’ side, is insufficient and needs to be improved.

Birdwatching Rijksmuseum

Rijksmuseum – Naturalis Biodiversity Center – Wikimedia Nederland & COMMIT SealincMedia present a unique birdwatching event

Birds are everywhere: in your own garden, in nature, and also in art. Among the Rijksmuseum’s 1.2 million collection objects are many prints, paintings and artefacts that depict bird species. Among the 37 million objects in the Naturalis collection are many birds from all over the world that have been collected in the last 200 years, as well as historical drawings of plants and animals in which many birds are depicted.

Wikimedia Commons

Some of the depicted birds are easily identified. Others require a trained eye to determine which species the artist has pictured. The Rijksmuseum and Naturalis are looking for experienced bird watchers and other avian enthusiasts to join an expedition through their digital collections and help the museums identify bird species in works of art.

The Rijksmuseum and Naturalis are currently in the process of donating large parts of their digitized collections of bird images to Wikimedia Commons, Wikipedia’s open multimedia library. Participants in the birdwatching day are challenged to collaboratively identify as many of the bird species depicted in these images as possible and to record them in Wikimedia Commons and in Wikidata, Wikipedia’s open database.

Accurator

For this purpose, COMMIT/SealincMedia, a consortium of Dutch researchers from VU University Amsterdam, Delft University of Technology and the Centrum Wiskunde & Informatica (centre for mathematics and informatics), has developed a dedicated online tool for the Rijksmuseum. With this tool, called Accurator, common and scientific names of species depicted in artworks can be recorded in an intuitive way. Participants in the birdwatching day will use this tool to tag bird species. Wikipedia volunteers as well as curators from the Rijksmuseum and Naturalis will be present for support throughout the day.

During the birdwatching day, the Rijksmuseum, Naturalis, Wikimedia Netherlands (the organization behind the Dutch version of Wikipedia) and the COMMIT/SealincMedia researchers want to learn how we can best collect your knowledge as a bird enthusiast and apply it to enrich our art collections. We also hope to learn how we can make Accurator more user-friendly.

Can’t wait? Here’s a preview of birds in the Rijksmuseum! https://www.rijksmuseum.nl/nl/zoeken?f.publish.apiCollection=BIRDS
You can start adding information using Accurator here: http://annotate.accurator.nl

Program

10.00 Start
10.10 Introduction and presentation Erik Hinterding, birdwatcher and curator Rijksmuseum
10.30 Presentation Steven van der Mije, head Vertebrate collections Naturalis
10.50 Introduction editing Wikipedia
11.15 Introduction Accurator
11.30 Start edit-a-thon
13.30 Wrap-up by Erik Hinterding
14.00 End

14.30 Tour (optional, registration required)

Registration

Participation is free but due to the limited number of available places registration is required and can be done via https://goo.gl/3rSKNa. For questions, mail us at vogelen@rijksmuseum.nl.

Important

The event will take place in the Cuypers Library of the Rijksmuseum. The following guidelines need to be taken into account:

1. No food, drinks or smoking allowed.
2. Due to the limited number of available places, registration in advance is required. On the 4th of October you can report at the desk in the Atrium. Please bring your confirmation of registration. Without this, entry can be denied.
3. If possible, bring your own laptop. Note that there are very strict safety guidelines in the Library: you cannot use your own power supply or adaptor unless your device is less than a year old (please bring proof of purchase). Make sure the battery is fully charged.