2022 TIP Award Winners Explore Bias in Library Catalogs
- Post by: Daniela Sygulla
As part of our master’s program “Digital Transformation of the Information and Media Industry” at HAW Hamburg, our four-person project group, consisting of Inga Albrecht, Daniel Klein, Torge Plückhahn, and Paulina Triesch, investigated possible biases in the online library catalog of the Hamburg Carl von Ossietzky State and University Library (SUB). The practical project was carried out in cooperation with the SUB from December 2021 to February 2022.
Libraries use search engines to enable users to quickly explore the entire database. In doing so, the search engine should put all potentially relevant results in a meaningful order. Well-known web search engines, such as Google and Yahoo, claim to be neutral and objective. Library search engines do the same and even give the appearance of being more trustworthy. Library catalogs enjoy a certain level of trust from users because, as part of a neutral institution, they are also associated with neutrality.
However, a truly neutral relevance ranking, the ranking method most commonly used, is almost impossible to achieve, so the actual order of search results often deviates from the ideal. Such a deviation is called a bias. So far, little research is available on biased search results in library search engines. We therefore examined the SUB’s online library catalog, Katalogplus, for system-based biases in its ranking as part of a quantitative research project.
Katalogplus contains the holdings of several libraries in Hamburg, with a total data stock of over 100 million copies. Based on a preceding literature review, we formulated seven hypotheses for the study:
H1: There is a correlation between the gender of the authors and the ranking position.
H2: There is a relationship between publisher size and ranking position.
H3: There is a relationship between the language of a work and its ranking position.
H4: There is a relationship between base classification (the research area assigned to a work) and ranking position.
H5: There is a relationship between location and ranking position.
H6: There is a relationship between media type and ranking position.
H7: There is a correlation between the year of publication and the ranking position.
Possible biases in the relevance ranking of Katalogplus were investigated through a quantitative data analysis that applied several statistical evaluation methods.
The evaluation found biases in the Katalogplus ranking with respect to gender, language, research area, location, media type, and year of publication. Individual values of these variables were either significantly overrepresented or significantly underrepresented in the top ranking positions of Katalogplus. The study therefore shows that it is worthwhile to examine the search results produced by ranking algorithms in library catalogs. Despite the trust placed in library systems, they can, just like traditional search engines, output rankings that are biased by certain characteristics.
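To illustrate what "significantly overrepresented in the top ranking positions" can mean in practice, here is a minimal sketch of one common approach: a Pearson chi-square test of independence on a 2x2 table. This is not necessarily the method used in the project, which does not name its statistical tests here. The function name `chi_square_2x2`, the choice of language as the variable, and all counts are made up purely for illustration.

```python
import math


def chi_square_2x2(top_a, top_b, rest_a, rest_b):
    """Pearson chi-square test of independence for a 2x2 table (df = 1).

    Rows: top ranking positions vs. all remaining positions.
    Columns: two values of one variable (e.g. two languages).
    Returns the test statistic and its p-value.
    """
    observed = [[top_a, top_b], [rest_a, rest_b]]
    row_totals = [top_a + top_b, rest_a + rest_b]
    col_totals = [top_a + rest_a, top_b + rest_b]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if ranking position and the variable were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With one degree of freedom, the chi-square survival function
    # has the closed form erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, NOT data from the study:
# 100 hits in the top positions and 100 hits further down the list.
chi2, p = chi_square_2x2(top_a=60, top_b=40, rest_a=40, rest_b=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A small p-value (conventionally below 0.05) would suggest that the two values are not distributed evenly between the top positions and the rest of the result list. It says nothing about why the ranking behaves that way.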
During the subsequent discussion of the results, it became apparent that many of the biases were unexpected for the SUB. From this, we derived recommendations for action that give the developers of Katalogplus a new starting point for optimizing the library catalog. In addition, further research approaches for future investigations emerged.
Much to our delight, the practical project was one of three student research projects to receive the TIP Award 2022, which is presented by the professional journal b.i.t.online, Schweitzer Fachinformationen, and the Conference of Information and Library Science Training and Study Programs (KIBA), Section 7 of the dbv and Training Commission of the German Association for Information and Knowledge. In addition to receiving prize money, we were invited to the 8th Library Congress in Leipzig, where we presented our project to the jury and interested guests and accepted the award. Thanks to a travel grant, we were able to spend two days in Leipzig, which we enjoyed very much. We would like to say thank you for the great experience and hope that our practical project has made a small contribution to research in the library cosmos.