Ownership and rights with respect to the humble catalogue record remain complex. The collegiate and interconnected nature of the library community has led to a culture of sharing over many years; a culture of sharing that today makes it extremely difficult to reliably assign provenance to individual records or their constituent parts. The involvement of commercial data supply organisations, stakeholders such as the British Library, and non-profits such as OCLC further complicates an interconnected set of ownerships and permissions.
Libraries today exist in an environment where they, their stakeholders, and new commercial and non-commercial third parties all have an interest in exposing and repurposing bibliographic data. Whilst it is certainly feasible to enlist large groups to ‘crowd source’ new and legally unencumbered content as OpenStreetMap has done for mapping, it appears more sensible to build upon the legacy of rich bibliographic data already maintained in libraries here and overseas.
The mood outside the library community is shifting. Governments, standards bodies, web companies and our potential beneficiaries are increasingly embracing a presumption in favour of transparency, openness, sharing and reuse. Except in specific circumstances such as those around personally identifiable information, there is a growing belief that raw data should be freely available, especially where it has been collected using public funds. Adoption of this approach in specific communities can be seen through activity such as the ‘Panton Principles’ for providing open access to scientific data. Actions taken within the library community must respect these shifting attitudes, and assess risks and opportunities not only on the basis of current institutional priorities, but also in the light of the changing world of which we must remain a part.
In January 2011 a set of principles for Open Bibliographic data was proposed on Openbiblio.net. The recommendations below, while written before the Openbiblio principles were published, are broadly in line with them.
Despite the ever-present risk that some subset of a resource may not be yours to give away, and despite the (probably) vanishingly small possibility that the rightful owner is both able to prove their claim and wishes to seek legal redress, it appears increasingly difficult for public institutions to use such reasoning as justification for not opening up access to non-personal data for use and reuse.
Universities should proceed on the presumption that their bibliographic data will be made freely available for use and reuse.
Check-lists such as those offered in the recent JISC Legal guidance should be used in order to quantify risk in specific circumstances, but the default position should be for disclosure unless evidence presented in a risk assessment overwhelmingly argues otherwise.
Releasing data for use by third parties without explicit statements as to permissible usage is unhelpful, and increases friction within the system by forcing responsible reusers to contact the source in order to request explicit permissions. The absence of a license does not mean that copyright, database rights and other rights have been waived, whatever the implied intentions of the data owner.
Use of a license enables the source to unambiguously grant a set of permissions, and enables potential users to assess whether or not their intended use is permitted. Neither source nor beneficiary is required to waste effort in further negotiation, unless some special use not covered by the license is intended.
There is a wide range of licenses available in this space, with those from groups such as Creative Commons and the Open Data Commons being the best known. A growing range of variants exists, developed by specific projects or communities to cater for various perceived special requirements. In a small number of circumstances, these variants are necessary, but in all too many cases the variations are minor and unnecessary. All that the variant license ends up doing is increasing the costs to potential beneficiaries. They may, for example, accept and understand the terms of a Creative Commons Attribution License, and can therefore freely and easily consume content licensed in that way, wherever they encounter it on the web. The introduction of variant terms into a ‘Creative Commons-like license’ from a single institution may require those potential beneficiaries to pay for legal advice in order to understand the implications of the variation. Even without such a cost, the value of seeing and understanding a single license across the web is lost, as every minor variation encountered increases the likelihood that the different licenses will conflict when combined in some third party use case.
In the vast majority of circumstances, institutions should use a Creative Commons Attribution License (CC-BY) to encourage reuse of copyrightable material. For collections of factual data, the Open Data Commons Public Domain Dedication and License (ODC-PDDL) should be used.
A growing number of institutions have released data with explicit licenses. The University of Cambridge released 180,000 MARC21 records in October 2010 using ODC-PDDL, and the British Library released three million records from the British National Bibliography the following month using CC0.
Licenses such as those from Creative Commons make it possible to select from a set of permissible activities. Content is typically licensed under Creative Commons to require attribution of the source, for example. Other options include the ability to explicitly prohibit reuse for commercial purposes. Although this might appear beneficial from the perspective of the HE sector, the relationship between commercial and non-commercial use is complex. Under a non-commercial license, use of library data by a commercial organisation such as LibraryThing would clearly be prohibited. So, too, might use in a wide range of academic circumstances in which an institution might be perceived to profit commercially.
Do not use ‘Non-Commercial’ licenses, unless you really understand the universe of possibilities that you’re restricting.
Do not develop your own license that ‘looks a bit like’ Creative Commons. Every difference, every oddity, every reworded clause increases the cost of re-use.
If UK Higher Education identifies some significant problem with the existing licenses, the sector could and should engage with the existing process to change them.
Despite any normal due diligence procedure, a small risk remains that a third party will challenge an institution’s rights with respect to part or all of a dataset that it has made openly available online. By developing and displaying a suitable takedown policy, the institution provides a mechanism for complaint and a clear set of procedures to follow in resolving any dispute.
Develop and display a takedown policy alongside data being made openly available.
There do not currently appear to be sufficiently explicit examples of good practice in this area. We look forward to revising this Guide as these become more visible.
Takedown policies need to be fair and transparent. A balance needs to be struck between the rights of the (alleged) data owner and the rights of the (alleged) infringer, and the policy should be drafted in such a way as to prevent malicious takedown requests being fulfilled.