Research Data Management
Between 2014 and 2018, I completed a PhD in the Digital Humanities with Ethnomusicology, during which I spent hundreds of hours trawling through the records of the Seán Ó Riada Collection, a special collection at the Boole Library in University College Cork. As part of the PhD, I created a series of datasets, which comprised data from the library as well as archival material at:
The Irish Newspaper Databases (Digitised newspapers available in the Irish Times and Weekly Irish Times archive and the Irish Newspaper Archive)
The National Folklore Collection at University College Dublin (UCD)
The Radio Éireann “rolls” at Raidió Teilifís Éireann (RTÉ)
The Irish Film Institute
These archival materials allowed me to gather metadata on the activities of Seán Ó Riada, and in particular his creative output (essentially every project that he was involved with during his career).
The result was a series of datasets that allowed me to create digital visualisations of the material in the Seán Ó Riada Collection and, just as importantly, of projects that he was involved in but that are not represented in the collection:
The Books Dataset
The Projects Dataset
The Ó Riada Scores Dataset
The Letters Dataset
The Events Dataset
The Finding Aid Dataset
Another particularly useful dataset that resulted from the data collection was the Seán Ó Riada Collection “Finding Aid” or “Descriptive List”. Here, the descriptions of each item in the collection were taken from a PDF file and transferred into a dataset. This made the Finding Aid machine-readable and allowed me to explore it in new ways.
In the years since these datasets were created, professionals from several fields have enquired about the material. For example, librarians at UCC have requested copies of the data to help scholars and researchers who need specific information on the contents of certain items within the collection. Media professionals have asked about my understanding of the contents overall, such as whether Seán Ó Riada wrote pieces of music while living in certain countries, to bring context to film work that documents his journey in music. Other requests have concerned the Letters Dataset, to find out who sent the most letters to Ó Riada throughout his life. This information is now easily accessible, as a computer can list the frequency of letters from specific people.
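As a rough illustration of that last kind of query, here is a minimal Python sketch that counts letters per sender. It assumes the data is a CSV file with a column named `sender`; that column name, the function name `count_senders` and the sample rows are placeholders for illustration only, not the actual structure or contents of the Letters Dataset.

```python
import csv
import io
from collections import Counter


def count_senders(csv_file, sender_column="sender"):
    """Return (sender, letter_count) pairs, most frequent first.

    `sender_column` is an assumed column name; change it to match
    whatever the real dataset uses.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(
        row[sender_column].strip()
        for row in reader
        # Skip rows where the sender is missing or blank.
        if row.get(sender_column) and row[sender_column].strip()
    )
    return counts.most_common()


# Made-up placeholder rows, standing in for the real data.
SAMPLE = """letter_id,sender,year
1,Sender A,1960
2,Sender B,1961
3,Sender A,1962
4,Sender C,1962
5,Sender A,1963
6,Sender B,1965
"""

if __name__ == "__main__":
    for sender, total in count_senders(io.StringIO(SAMPLE)):
        print(f"{sender}: {total}")
```

To run this against a real file, the `io.StringIO(SAMPLE)` argument could be swapped for an opened file, for example `open("letters.csv", newline="", encoding="utf-8")`, where the file name is again hypothetical.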
As a result of these requests, I decided to move the data from my personal web server to a trusted repository, Zenodo.org. Zenodo allows data to be described, funding body information to be linked, and a range of research data management tools to be brought together. It is particularly helpful in connecting profiles such as ORCID, a platform that can display your research outputs and link variants of your name if you use more than one.
In May 2024, the Ó Riada Scores Dataset was uploaded to Zenodo, and it has already been downloaded 25 times from just 64 views.
https://zenodo.org/records/11108406
Other datasets will follow, and I hope to include an extensive set of metadata that describes each dataset, how it could be useful to others, and what potential collaborations would mean for that data.
If some of our research data had a certain level of openness, we could see the results of our work in making data FAIR. Instead of engaging with the FAIR principles simply as a way to follow protocol, we could see them as a way to keep our research data alive. We will only gain from what we put in. If data is not made available under open licences and described in a way that other researchers can use, then we won’t see the immense value that our research data could bring after a project is completed.
More to follow!
All the best,
Patrick