RISM Data, Big Data
Thursday, September 8, 2016
The RISM online catalog currently contains over 1,052,000 records. When you count these along with 101,000 authority files for people, 63,000 for institutions, and 32,000 for secondary literature, that’s a lot of data!
All of it is freely available as linked open data under a Creative Commons license, but what can you do with it?
Our colleague Sandra Tuppen from the British Library and RISM UK, along with Stephen Rose and Loukia Drosopoulou, included RISM data in their project “A Big Data History of Music.” Insights from the project were published last year in Early Music, and this summer an article appeared in Fontes Artis Musicae that examines the role that bibliographic datasets – the basis for their data – played in the project:
Sandra Tuppen, Stephen Rose, and Loukia Drosopoulou, “Library Catalogue Records as a Research Resource: Introducing ‘A Big Data History of Music.’” Fontes Artis Musicae 63, no. 2 (April-June 2016): 67-88. DOI: 10.1353/fam.2016.0011
The datasets used were RISM’s data on printed music (series A/I and B/I) and music manuscripts (series A/II) and data from the British Library’s electronic and print catalogs (including Early Music Online). Combining RISM’s data, which the researchers called “the most comprehensive body of information on musical sources between ca. 1500 and 1800” (p. 70), with the British Library’s own extensive holdings of music published in Britain, Ireland, and abroad resulted in a dataset of over two million records.
The researchers describe the analyses they performed on the data and show how such a large dataset opens up interesting ways of looking at music history. For example, they compared publications of Palestrina’s sacred music in Counter-Reformation cities with the number of publications in Rome and Venice. With a network graph they could also show the relationship between RISM’s genre terms and composers, as evidenced by surviving printed music before 1800.
The RISM data are now available as open data and linked open data, and the British Library made its data available through its Free Data Services page. The Fontes article describes how the data from these bibliographic records had to be cleaned up and unified. Microsoft Excel was the main tool used to manipulate the data, although some visualizations were produced with tools developed by our colleagues at RISM Switzerland.
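To give a feel for what "cleaned up and unified" can mean in practice, here is a minimal Python sketch. It is not the researchers' method (they worked mainly in Excel). The sample records, the field names `composer` and `place`, and the lookup table of place-name variants are all made up for illustration and do not reflect the actual RISM or British Library data formats.

```python
from collections import Counter

# Hypothetical sample records; the field names "composer" and "place" are
# illustrative assumptions, not the actual RISM or British Library schema.
records = [
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Venezia"},
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Venice "},
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Roma"},
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "rome"},
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Venetiis"},
]

# Hypothetical lookup table mapping variant spellings to one unified form.
PLACE_VARIANTS = {
    "venezia": "Venice",
    "venetiis": "Venice",
    "venice": "Venice",
    "roma": "Rome",
    "rome": "Rome",
}


def normalize_place(raw):
    """Trim whitespace, ignore case, and map known variants to one name.

    Unknown places are returned trimmed but otherwise unchanged.
    """
    cleaned = raw.strip()
    return PLACE_VARIANTS.get(cleaned.lower(), cleaned)


def count_by_place(rows):
    """Count records per unified place of publication."""
    return Counter(normalize_place(row["place"]) for row in rows)


counts = count_by_place(records)
for place, number in counts.most_common():
    print(place, number)
```

Once the variant spellings are unified, the five sample records collapse into two cities, which is the kind of step needed before publication counts for different places can be compared.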
Anyone is welcome to take our data out for a spin. If you do so, we’d love to hear about it!
Image: From the British Library’s open data
Category: New publications