12. Supply data for Crowd Sourced Cataloguing

The supply of bibliographic data to volunteers for the purpose of improving and enhancing records.

Description

Activity - The supply of bibliographic data to volunteers for the purpose of improving and enhancing records; this may also involve the creation of new records. Such activity may be directed (e.g. volunteers being requested to focus on specific fields, or to add new records for a specified collection) or it may be opportunistic (i.e. volunteers invited to edit and add as and when they see fit). This Use Case assumes this is being undertaken to benefit the initiating catalogue(s), though the resulting records will be designated as open data and therefore be available for wider use. It will not necessarily involve a third party coordinating organization, though there are attractions in terms of process, quality and web-scale critical mass in such services.

Services in which the collaborators are restricted to libraries (or other professional organizations) are similar in terms of open data but significantly different in other respects (see UC11).
This use case differs from UC11 because it is based on opening up the cataloguing process itself, whereas UC11 is about the benefits of open data in a closed cataloguing collaboration.
Actors - Libraries; individuals and organisations that wish to make a contribution; optionally third party coordinating services
Data involved - Most likely full (but possibly partial) bibliographic records
Data flow - Depending on the software available and the methods of version control, the activity may take place within the catalogue or outside the catalogue (both cases using a browser-based application), or it may involve the supply of records (directly to volunteers or through open access). Providing open access to the resulting ‘crowd sourced’ records may best be treated as a separate process, especially if there is an interim QA step.
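As an illustration of the interim QA step, the following is a minimal sketch in Python, assuming records are held as plain dictionaries keyed by a record identifier. The function name `diff_records`, the `FieldChange` type and the sample record are hypothetical and do not come from any particular LMS; the idea is simply to list field-level differences between the catalogue's copy and a volunteer's copy so that a cataloguer can review them before anything is accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldChange:
    """One proposed change to one field of one record (hypothetical type)."""
    record_id: str
    field: str
    old: str
    new: str


def diff_records(original, edited):
    """List field-level differences between catalogue and volunteer copies.

    Both arguments map record_id -> {field name: value}. Records that exist
    only in `edited` are treated as new, so every non-empty field is reported.
    """
    changes = []
    for record_id, edited_fields in edited.items():
        original_fields = original.get(record_id, {})
        for field in sorted(set(original_fields) | set(edited_fields)):
            old = original_fields.get(field, "")
            new = edited_fields.get(field, "")
            # Ignore differences that are only leading/trailing whitespace.
            if old.strip() != new.strip():
                changes.append(FieldChange(record_id, field, old, new))
    return changes


# Invented sample data: a volunteer has added a subject heading.
original = {"b1001": {"title": "Flora of the Fens", "subject": ""}}
edited = {"b1001": {"title": "Flora of the Fens", "subject": "Botany -- England"}}

review_queue = diff_records(original, edited)
for change in review_queue:
    print(f"{change.record_id} {change.field}: {change.old!r} -> {change.new!r}")
```

A review queue of this kind keeps acceptance of volunteer edits separate from their eventual release as open data, in line with the suggestion above that the two be treated as distinct processes.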
Does this require Open Data - If the records are supplied under an open data license, the scope for exploitation will be unambiguous and the incentive for improvement will be greater.
Current Examples - Biblios.net, Open Library

Benefits

Institution - None other than upholding the principle of enhancing scholarship through improved discovery and description.
Library Service - Potential for improved discovery and access, resulting in better use of the collection
Researchers - Improved discovery, especially in disciplines (e.g. humanities) where historic and grey materials are poorly described, if at all.
Students - Students may be less dependent on extensive description than researchers, though they could benefit from other aspects of metadata (such as enhanced links to courses or contemporary subject keywords).
Replication - Release for cataloguing (whether directed or opportunistic) requires logistical considerations above and beyond more general release, though the licensing may be the same. The logistical process may however be organized as a variation of UC11, though UC11 is collaborative and therefore cannot take place within a single ‘home’ catalogue.
Case for not doing it - The unpredictable costs of coordination and quality control may be judged to outweigh the benefits, which might alternatively be accrued through collaborative cataloguing (see UC11).

Motivation

Principles - Well-populated, high quality finding aids are highly desirable. The effort to improve them is costly and the domain knowledge is likely to be scattered. A distributed and open volunteer approach may therefore be advantageous, especially as the metadata does not itself confer competitive advantage.
Costs - There are significant potential savings in cataloguing time, though this should be weighed against the overhead of coordination and quality assurance.
Services - Improved services may result from better metadata and new services may result from wider description of the collection (assuming some libraries have significant resources incompletely described). Open data offers the freedom to pursue those opportunities.
Rationale for not doing it - Issues of quality and authority are not insignificant and need to be faced head on.

Consequences of doing it as Open Data

What will happen? - Unpredictably paced, variable quality records will require processing, the challenges of which can be addressed through a more directive approach to community engagement (see the community science work of Galaxy Zoo).
Potential Risks - (1) Loss of control over institutional data; (2) Reduction in the quality and authority of catalogue records; (3) Effort and resources required to make use of crowd-sourced data outweigh the benefits.
Potential Opportunities - (1) Development of innovative or compelling third party services based on open data; (2) Links with a broader engagement with User Generated Content attached to bibliographic records, such as tagging, ratings, and reviews.
Consequences of not doing it? - The sector needs to find some way of reducing the cost of cataloguing whilst addressing the growth in publication (i.e. of things to be catalogued). This is one option.

Rights and Licensing Issues

Rights and licensing issues - In keeping with the principles behind this activity, the license should be explicit and as open and unencumbered as possible in order to facilitate genuine reuse. See the general guidance on Licensing Issues for further detail.

Practicalities

Data exchange formatting - For editing, a web interface seems preferable. If a dataset is to be released for editing, it may be wise to release a standard set of attributes (not a full MARC record) in a format that can be handled by the user (e.g. CSV, XML). The availability of the resulting records as open data (in a range of formats) is best handled as a separate issue.
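The idea of releasing a standard set of attributes rather than a full MARC record can be sketched as follows. This is only an illustration under stated assumptions: the field list `EDITABLE_FIELDS`, the function `export_for_editing` and the sample record are hypothetical, and a real export would start from whatever the local catalogue can produce. The sketch uses only the Python standard library to write the chosen attributes to CSV and silently leaves out anything else (here an internal note).

```python
import csv
import io

# Hypothetical "standard set of attributes" released to volunteers.
EDITABLE_FIELDS = ["record_id", "title", "author", "year", "subject"]


def export_for_editing(records, fields=EDITABLE_FIELDS):
    """Return CSV text containing only the chosen fields of each record.

    Fields missing from a record are written as empty cells so that
    volunteers can see what still needs to be supplied.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


# Invented sample record; "internal_note" is not in the release set.
records = [
    {
        "record_id": "b1001",
        "title": "Flora of the Fens",
        "author": "Smith, J.",
        "year": "1893",
        "internal_note": "shelf check due",
    }
]

csv_text = export_for_editing(records)
print(csv_text)
```

Keeping the record identifier in the released set is a deliberate choice in this sketch: it gives the edited rows a key to be matched against the catalogue when they come back.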
Lifecycle implications - The lifecycle of such efforts is potentially complex. Planning of the synchronization and release of the outputs is especially important.
Hosting requirements - The data to be edited may need hosting as a download or a separate catalogue instance. The resulting published data will need to be accessible for download.
Existing systems impact - The library management system (LMS) will not necessarily support the logistics of this approach, though MARC export and import options (for example) may be sufficiently flexible.
Skills demands - Subject to the capabilities of the LMS, this should fall within the skills of a systems librarian. Depending on the approach, a cataloguing website may also be required, which will need careful workflow engineering.

Costs

Setup - The simplest implementation may be achieved using MARC export and import software, which should be part of the local LMS.
Ongoing - The resource implications of managing the operational process must be addressed.
Cost of doing nothing - No extra direct costs will be incurred by not doing it.