<?xml version="1.0" encoding="utf-8"?> <rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0" xml:base="https://lab.kb.nl/"> <channel> <title>KB LAB News</title> <link>https://lab.kb.nl/</link> <description/> <language>en</language> <item> <title>Presenting DutchDraCor: A new KB dataset for computational approaches to early modern Dutch theatre</title> <link>https://lab.kb.nl/about-us/blog/presenting-dutchdracor-new-kb-dataset-computational-approaches-early-modern-dutch</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em>From February to October 2024, I had the privilege of working as a researcher-in-residence at the National Library of the Netherlands. The aim of my residency was to prepare the early modern theatre editions available in the DBNL for computational studies of early modern Dutch drama. In this blogpost I present the first result of that effort: a corpus of 180 early modern Dutch plays, fully encoded and annotated in TEI XML according to the standards prescribed by the international theatre database developed by the Drama Corpora Project (DraCor).</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>In the last three decades, the access to digital theatre corpora has increased dramatically. Because institutions such as the Folger Shakespeare Library, the Bibliothèque nationale de France (BnF) and the National Library of the Netherlands (KB) were quick to digitise their theatre collections, digital reproductions of early modern theatre editions have been available on the web for many years. The online access to these editions has opened up Western theatrical heritage to students, scholars and other readers. 
For the Dutch context, the main collections of digital theatre editions can be found in the Digital Library for Dutch Literature (DBNL, maintained by the National Library of the Netherlands) and the Census Nederlands Toneel (Ceneton, which was developed and edited by Ton Harmsen at Leiden University). Thousands of students first discovered the richness of plays by Joost van den Vondel, G.A. Bredero or Catharina Verwers through those collections.</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>More recently, theatre scholars have recognised the additional potential of the existing repositories for comparative research into historical theatre traditions. They applied methods and concepts from other fields, such as network theory (e.g. Moretti 2011), economics (Algee-Hewitt 2017) and the history of emotions (Leemans et al. 2017), to model features of the text and structure of those plays, such as character interactions, the social composition of story worlds, or emotional expressions. Those abstractions enabled researchers to systematically compare hundreds or even thousands of plays. Including this larger scale of analysis in historical drama research is necessary to identify and describe patterns and developments across time periods and even (language) borders. A great introduction to the state of the art of this young research tradition can be found in the recently published collection <em>Computational Drama Analysis</em>&nbsp;(2024), edited by Melanie Andresen and Nils Reiter. 
For the Dutch context, my recent article on theatre society Nil Volentibus Arduum offers a good impression (Van der Deijl 2024).&nbsp;</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>The Drama Corpora Project</h3> <p>Computational approaches to historical drama benefited significantly from the initiative by Frank Fischer, Peer Trilcke and everyone else from the Drama Corpora Project (<a href="https://dracor.org/">DraCor</a>) to create a so-called ‘programmable corpus’ of theatre editions from multiple European languages (Fischer et al. 2019). All editions included in the DraCor-database are fully encoded in TEI XML, which means that all structural elements in the play are labelled separately: acts, scenes, speech turns, speaker indications, stage directions et cetera. Moreover, the text from the editions is manually annotated with additional metadata. All speech turns are disambiguated with unique character IDs and all characters received a gender label. This step enables the extraction of speech turns by character or by gender group, for example. By introducing a cross-lingual standard for the encoding and annotation of theatre editions, DraCor facilitates and standardises computational analyses of dramatic texts from various linguistic and cultural backgrounds. Moreover, through its interoperable&nbsp;API (Application Programming Interface; an entry point for computers), the database can be questioned and queried on various levels using programming languages.</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>However, until recently there was no Dutch corpus available in the DraCor infrastructure. 
Due to this omission, Dutch-language theatre traditions have been a blind spot in current debates on computational drama research (with a few exceptions: Leemans et al. 2017; Debaene et al. 2024). The rich theatre culture of the Dutch language area thus tends to be overlooked, even though the DBNL and Ceneton already contain hundreds of high-quality transcriptions of early modern theatre editions. Converting those editions to DraCor’s encoding standard would connect Dutch drama to other European theatre traditions, which have always been key to its development. As a response to this lack of a suitable Dutch theatre corpus, I collaborated with experts from the KB and with students from the University of Groningen to develop a first selection of 180 fully encoded Dutch plays. This corpus has been integrated into the DraCor infrastructure and will continue to grow in the future under the name of the Dutch Drama Corpus (<a href="https://dracor.org/dutch">DutchDraCor</a>). In this blogpost, I briefly describe the characteristics of this corpus at the moment of its second release in October 2024.&nbsp;</p> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>DutchDraCor: characteristics and statistics</h3> <p>As of October 2024, DutchDraCor contains 180 fully encoded plays, with 2,117 speaking characters or character groups (including 1,364 male and 514 female characters or character groups). The corpus includes over 2.3 million words, 60,021 speech turns and 5,750 stage directions. Of the editions, 151 were derived from the DBNL and 29 plays (the complete oeuvre of theatre society Nil Volentibus Arduum) were collected from Ceneton. 
There are 68 distinct first authors or author groups represented in the corpus, excluding translators. Some authors or groups, such as the chamber of rhetoric De Pellicaen (39 plays), Joost van den Vondel (31), Nil Volentibus Arduum (29), and Jan Harmensz Krul (9), are overrepresented in the corpus because of their large output and their overrepresentation in the source collections. The corpus contains both plays originally written in Dutch and Dutch translations of plays written in other languages (mostly French, Latin and Spanish), as translations and adaptations were an important part of the early modern literary production in the Low Countries.&nbsp;</p> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/dracor1.png?itok=fIh8_eCE" width="283" height="380" alt="DutchDracor statistics"> </div> </div> </div> </div> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Coverage and representativeness</h3> <p>Assessing the coverage of DutchDraCor is not straightforward, as it is not evident how to measure representativeness. In an ideal world, the corpus would contain a representative sample of all plays that were written, printed and staged in Dutch between 1500 and 1800. Since there are no reliable estimates of the number of plays that were once written and/or performed in the early modern Low Countries, I cannot quantify the relationship between sample and ‘population’. 
Instead, I will point to a few important biases in the corpus in terms of year of publication, genre and author gender.&nbsp;</p> <p>First of all, plays from the sixteenth and eighteenth centuries are underrepresented. The bar chart below shows the number of plays written or printed per decade, confirming that DutchDraCor is especially useful for studies of seventeenth-century drama. Future additions to DutchDraCor will need to correct for this temporal imbalance.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/dracor2_numberofplays.png?itok=hTatMRxR" width="1300" height="608" alt="Dracor number of plays per decade"> </div> </div> </div> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>Secondly, the corpus is not balanced in terms of genre. The most frequent genre labels are tragedy (‘treurspel’) and morality play (‘zinnespel’), followed by comedy (‘blijspel’) and farce (‘klucht’). 
There is also a temporal imbalance in this genre distribution, since most morality plays were written in the sixteenth century, whereas most tragedies were written and printed in the seventeenth and eighteenth centuries.&nbsp;</p> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/dracor3_genres.png?itok=81MqqbQy" width="761" height="710" alt="Dracor genres"> </div> </div> </div> </div> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>Finally, the authors represented in the corpus are not balanced in terms of gender. The vast majority of them are male: only 18 plays (10%) were written by women, namely 7 by Katharyne Lescailje, 7 by Lucretia Wilhelmina van Merken, 3 by Catharina Questiers and 1 by Catharina Verwers. Like the biases concerning period and genre, this overrepresentation of male poets is partly caused by the imbalances in the source collections (DBNL and Ceneton) and in the extant source material digitised in those collections. 
Computational approaches to early modern Dutch drama based on DutchDraCor need to take these biases into account, including the stage in the history of the editions where the bias was introduced (from writing and printing to collecting, archiving, digitisation and encoding).&nbsp;</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Where to find DutchDraCor</h3> <p>The full corpus is available through the&nbsp;<a href="https://dracor.org/dutch">DraCor infrastructure</a>, which offers various downloadable representations of the plays and their metadata. The corpus can also be downloaded in full via GitHub. DutchDraCor will continue to grow in the future, but a standalone version of the corpus described here will become available via Zenodo. Enjoy!</p> <p>In my next blog post, I will demonstrate the value of DutchDraCor by taking the gender balance in the corpus as a case study rather than a bias, questioning the position and visibility of female characters on the stages in the Low Countries.&nbsp;</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Acknowledgments</h3> <p>DutchDraCor was created with the assistance of many contributors: Teun de Vries, Hinke van Minnen, Mirthe Wubs, Mirte Triezenberg, Jasmijn van Valkenburg, Hilde Bos, Marc Bos, Melissa Nijboer, Thirza Fokkens, Jarick van der Wal, Anna Lap, Maurice Eeftink, Annechien Hussem, Jens Klein, Jan de Vries, Hidde van Deemter, Ivar Czudar, Evi Dijcks and Alie Lassche. 
Special thanks to Willem Jan Faber, Suzan Boreel and everyone from the digital scholarship team at the KB.</p> <p>For the creation of DutchDraCor, Lucas van der Deijl was supported by the National Library of the Netherlands (<a href="https://www.kb.nl/">KB</a>) during a residency from February to October 2024 and by&nbsp;<a href="https://clsinfra.io/">CLS INFRA</a> during a Transnational Access (TNA) fellowship at the University of Potsdam in April 2024.</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>References</h3> <p>Andresen, Melanie and Nils Reiter (eds.). <em>Computational Drama Analysis: Reflecting on Methods and Interpretations</em>. Berlin, Boston 2024.</p> <p>Algee-Hewitt, M. ‘Distributed Character: Quantitative Models of the English Stage, 1550–1900’. <em>New Literary History</em> 48 (2017) 4, 751-782.</p> <p>Debaene, Florian, et al. ‘Early Modern Dutch Comedies and Farces in the Spotlight: Introducing EmDComF and Its Emotion Framework’. <em>Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024</em>, edited by Rachele Sprugnoli and Marco Passarotti, ELRA and ICCL, 2024, 144–55.</p> <p>Deijl, L.A. van der. ‘Orde en rationalisme in het toneel van Nil Volentibus Arduum. Een computationele benadering van vroegmoderne verhaalmodellen’. <em>Spiegel der Letteren</em> 66 (2024) 1, 53-94.</p> <p>Fischer, Frank, et al.&nbsp;‘Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama’.&nbsp;In <em>Proceedings of DH2019: "Complexities"</em>, Utrecht University, 2019.</p> <p>Leemans, Inger, et al.&nbsp;‘Mining Embodied Emotions: A Comparative Analysis of Sentiment and Emotion in Dutch Texts, 1600-1800’. 
<em>Digital Humanities Quarterly</em> 11 (2017) 4.&nbsp;</p> <p>Moretti, Franco. ‘Network Theory, Plot Analysis’. <em>Stanford Literary Lab Pamphlets</em> 2 (2011).</p> </div> </div> </div> </description> <pubDate>Mon, 18 Nov 2024 09:38:20 +0100</pubDate> <dc:creator>celonie.rozema@kb.nl</dc:creator> <guid isPermaLink="false">396 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>Archiving the KB Lab</title> <link>https://lab.kb.nl/news/archiving-kb-lab</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em><strong>We are now archiving the KB Lab!</strong></em></p> <p>According to the&nbsp;<a href="https://www.communicatierijk.nl/vakkennis/rijkswebsites/verplichte-richtlijnen/archiefwet">Archive Act</a> of the Netherlands, government organizations should archive their websites. As the KB is an independent administrative body (“zelfstandig bestuursorgaan”, ZBO), our&nbsp;tasks fall under the scope of this Archive Act.</p> <p>Since October 8th, the KB Lab has been archived every single day. This process includes harvesting, preparing and uploading each page of the website, as well as the links between pages. This makes it possible to navigate the archive just as you would the website itself. Links to external URLs are excluded, but everything else is collected for the future in an accessible and sustainable way. The archive shows the state of the website on any given day, making it possible to compare versions over time and track changes.&nbsp;</p> <p>The KB Lab archive is publicly available at&nbsp;<a href="http://kb.webarchief.online/lab-kb-nl" title="kb.webarchief.online/lab-kb-nl">kb.webarchief.online/lab-kb-nl</a>. The link is now also permanently available in the footer, at the very bottom of the website. 
We are looking forward to seeing the KB Lab archive grow with time!</p> </div> </div> </div> </description> <pubDate>Wed, 16 Oct 2024 14:33:35 +0200</pubDate> <dc:creator>celonie.rozema@kb.nl</dc:creator> <guid isPermaLink="false">392 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>10 reasons why the National Library of the Netherlands moved its Wikimedia-related publications from SlideShare to Zenodo, and keeps them on Wikimedia Commons</title> <link>https://lab.kb.nl/about-us/blog/10-reasons-why-national-library-netherlands-moved-its-wikimedia-related-publications</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em>In this article the Wikimedia coordinator of the KB, the national library of the Netherlands, explains why he decided to migrate all KB’s Wikimedia-related publications of the last 13 years from SlideShare to Zenodo, while retaining copies of them on Wikimedia Commons.</em></p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/kb_gebouw.png?itok=l_M5KZl-" width="1024" height="695" alt="Photo of the KB building with a square in front of it."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div 
class="field__item"><p><em>Main entrance of the KB national library of the Netherlands</em>. <em>Source: KB, national library of the Netherlands</em>.</p> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>As the <a href="https://www.kb.nl/over-ons/experts/olaf-janssen">Wikimedia coordinator</a> of the KB, the <a href="https://en.wikipedia.org/wiki/Royal_Library_of_the_Netherlands" data-wp-title="Royal_Library_of_the_Netherlands" data-wp-lang="en">national library of the Netherlands</a>, I consider sharing knowledge an integral and important part of my job. I do so not only live during events, but also after the events have ended, for reference and sustainability purposes.</p> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/kb_logo.png?itok=vUVUzZLU" width="374" height="123" alt="KB|National library of the Netherlands"> </div> </div> </div> </div> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Wikimedia Commons</h3> <p>This is why I make all Wikimedia-related 
presentations, articles, reports, workshop materials, tutorials, videos and other publications available in the Dutch and English bilingual <a href="https://commons.wikimedia.org/wiki/GLAM_at_Koninklijke_Bibliotheek/Archive">KB Wikimedia presentations &amp; publications archive</a> on Wikimedia Commons. I’ve been doing so for the last 13 years. This ensures all such KB content:</p> <ul> <li>can be easily found via search engines (Google loves Wiki),</li> <li>is available for reuse in the <a href="https://meta.wikimedia.org/wiki/Complete_list_of_Wikimedia_projects">700+ Wikimedia projects</a> and communities,</li> <li>can be reused freely by the entire world because open (Creative Commons) licensing is mandatory,</li> </ul> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/image_41cd7f.png?itok=BjUG71k5" width="259" height="299" alt="Wikimedia commons logo"> </div> </div> </div> </div> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><ul> <li>can be used as linked open data, independent of language, because multilingual <a href="https://commons.wikimedia.org/wiki/Commons:Structured_data">Structured Data</a> has been added to many of these files,</li> <li>is stored in a durable way; this year, Wikimedia Commons is celebrating its 20th anniversary and future financing and governance look strong.</li> </ul> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div 
class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/KB-Wikimedia-presentations-and-publications-archive-28-August-2024.png?itok=SJUvQ0FO" width="1300" height="1041" alt="Screenshot of the Wikimedia page 'Glam at Koninklijke Bibliotheek / Archive"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot of the </em><a href="https://commons.wikimedia.org/wiki/GLAM_at_Koninklijke_Bibliotheek/Archive"><em>KB Wikimedia presentations &amp; publications archive</em></a><em> on Wikimedia Commons, dd 28 August 2024.</em></p> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>SlideShare and new needs</h3> <p>In addition to Wikimedia Commons, I also used to upload <a href="https://www.slideshare.net/OlafJanssenNL">my outputs to SlideShare</a>. When I started doing this 14 years ago, my publishing and sharing needs were relatively simple and naive, with visibility and findability being the main purposes. 
Back then, SlideShare met this need, as it is well-indexed by search engines, has a visual-first user experience, and can be embedded into LinkedIn and Twitter posts.</p> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/slideshare_logo2.png?itok=KU00wXOv" width="1190" height="362" alt="Slideshare with logo (white background, two figures holding a frame)"> </div> </div> </div> </div> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>At that time I had no need for (or awareness of) more advanced functionalities, such as</p> <ul> <li>Sharing a wider variety of file formats, beyond the typical “presentation oriented” PDF and JPG, such as more “data oriented” formats like JSON, TSV, Markdown, Jupyter notebooks or zipped OpenRefine project files.</li> <li>Publishing constituent workshop materials under one unique and persistent identifier – i.e. 
making all slides, notes, participant guides, CSV data files, Jupyter notebooks or OpenRefine files used for a typical (data) workshop accessible as a combined package via a single URL.</li> <li>Sharing large files, such as video recordings of live presentations running to hundreds of MBs, or entire datasets of hundreds of hi-res images and their descriptive metadata files.</li> <li>Adhering to <a href="https://www.go-fair.org/fair-principles/">FAIR principles</a>.</li> <li>Using REST APIs to interact with the publications programmatically.</li> <li>Using OAI-PMH connectivity to exchange publications with other repositories.</li> </ul> <p>Although Wikimedia Commons and/or SlideShare can meet some of these needs, over the years I gradually found out that they were not really capable enough to meet my ever-increasing publication needs. I want my work to be discoverable and reusable by as many people, institutions and machines as possible, now and in the future, in a durable way, with a low risk of its URLs becoming 404s or not findable at all.</p> <p>Furthermore, the KB is a scientific institution and thus part of the international science and research communities. As an employee of such an institution, I felt a need to integrate my publications more strongly into these communities. Both Wikimedia Commons and SlideShare fail to meet this need, as neither of them has a strong user base in academic publishing.</p> <p>Finally, as a European user of SlideShare, without going into all (legal) details, I felt some unease about having my content hosted on servers located in the US and falling under that jurisdiction. This prompted me to look for strictly EU-based hosting alternatives.</p> <h3>New publication strategy, including Zenodo</h3> <p>The new needs and insights I developed over these years slowly but surely urged me to change my approach to storing and sharing my content. 
As the KB has been a part of the Wikimedia community for many years, an important requirement was to keep all Wikimedia-related KB output available through Wikimedia Commons.</p> <p>However, for the KB content hosted on SlideShare, I needed to find a better place, closer to home, on EU-based servers, with a service meeting as many of my demands as possible.</p> <p>After doing some (non-exhaustive) comparison, using the <a href="https://zenodo.org/records/7946938">Generalist Repository Comparison Chart</a> among others, I picked <a href="https://en.wikipedia.org/wiki/Zenodo?wprov=wppw2">Zenodo</a> as the best match for my needs.</p> <blockquote><p>In summary, <a href="https://about.zenodo.org/">Zenodo</a> <em>is a general-purpose EU-hosted data repository built on open-source software that accepts all forms of research output, including presentations, research papers, data sets, research software, reports, and any other research related digital artefact. For each submission, a persistent </em><a href="https://en.wikipedia.org/wiki/Digital_object_identifier" data-wp-title="Digital_object_identifier" data-wp-lang="en"><em>digital object identifier</em></a><em> (DOI) is minted, which makes the stored items easily citable.</em></p> </blockquote> <p>My choice for Zenodo was also influenced by my wish to comply with KB’s earlier choice of Zenodo as its preferred repository; employee-generated publications have been stored in its <a href="https://zenodo.org/communities/kbnl/records?q=&amp;l=list&amp;p=1&amp;s=10&amp;sort=oldest">KB community</a> since 2017.</p> <p>Gradually I started moving my Wikimedia-related content from SlideShare to Zenodo. Along the way I also improved and harmonized the descriptions and other metadata associated with the files. Of course I made sure to submit all my content to the KB community as well. 
I completed this task in the summer of 2024, making all these <a href="https://zenodo.org/communities/kbnl/records?q=&amp;f=subject%3AOlaf%20Janssen&amp;l=list&amp;p=1&amp;s=10&amp;sort=newest">presentations, articles, reports, tutorials, videos and other publications</a> now accessible from Zenodo.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/zenodo_kb_28082024.png?itok=SdmCaZ28" width="762" height="958" alt="Screenshot of KB community page with this article mentioned on top."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot of KB’s Zenodo community page, dd 28 August 2024, now including all Wikimedia-related publications that were previously hosted on SlideShare. 
</em><a href="https://zenodo.org/communities/kbnl/records?q=&amp;f=subject%3AOlaf%20Janssen&amp;l=list&amp;p=1&amp;s=10&amp;sort=newest"><em>Source</em></a><em>.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>My Top 10 Zenodo features</h3> <p>Let us now look at the ten Zenodo features that are most relevant for meeting my personal needs and wishes, in comparison to SlideShare and Wikimedia Commons. I have no ambition to discuss the complete feature set Zenodo has to offer; if you are interested in that, you can check Zenodo’s detailed public <a href="https://about.zenodo.org/">About</a>, <a href="https://about.zenodo.org/principles/">Principles</a>, <a href="https://about.zenodo.org/policies/">General Policies</a>, <a href="https://about.zenodo.org/privacy-policy/">Privacy Policy</a> and <a href="https://about.zenodo.org/infrastructure/">Infrastructure</a> documents.</p> <p>The 10 advantages of Zenodo most important to me are :</p> <p><strong>1) Sharing a wide variety of file formats</strong>, beyond the typical “presentation-oriented” PDF and JPG, such as more “data-oriented” formats like JSON, TSV, Markdown, Jupyter notebooks, or zipped files. Zenodo is very flexible, as it <a href="https://help.zenodo.org/docs/deposit/about-records/">supports uploading files in any format</a>. 
Wikimedia Commons has a <a href="https://en.wikipedia.org/wiki/Help:Creation_and_usage_of_media_files?wprov=wppw2" data-wp-title="Help:Creation_and_usage_of_media_files" data-wp-lang="en">smaller set</a> of supported formats and SlideShare only allows <a href="https://support.scribd.com/hc/en-us/articles/360055259612-Supported-SlideShare-file-formats-and-sizes">PowerPoint, PDF and Word</a> files.</p> <p><em>Example:</em> The <a href="https://zenodo.org/records/7665231"><em>Workshop OpenRefine en Wikimedia Commons (19-11-2022)</em></a> hosts files in tar.gz, pdf, xlsx, txt and json formats.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/zenodo_multiple_files.png?itok=m-t6e0lq" width="903" height="528" alt="Screenshot of Zenodo showing multiple files stored there. 
"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot of the files used in ‘</em><a href="https://zenodo.org/records/7665231"><em>Workshop OpenRefine en Wikimedia Commons</em></a><em>’, hosting tar.gz, pdf, xlsx, txt and json formats.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>2) Publishing constituent workshop materials</strong> under one unique and persistent identifier. When running a typical data-oriented workshop, it is very convenient – both for the participants and for post-workshop archiving – to publish all materials (slides, notes, guide for participants, csv with data, Jupyter notebooks, Openrefine project files, etc.) as a combined package in a single location under a single durable URI or DOI. This is a real added value of Zenodo, as both SlideShare and Wikimedia Commons allow only one file per upload.</p> <p><em>Example:</em> This <a href="https://zenodo.org/records/8207914"><em>OpenRefine Introduction Workshop</em></a> (2023) packages six files (pdf, xlsx, txt, and md) into a single upload.</p> <p><strong>3) Sharing large files, </strong>with personal use cases including video recording of live presentations of hundreds of MBs, or datasets containing large numbers of hi-res images and their descriptive metadata files (1+ GB). SlideShare has a <a href="https://support.scribd.com/hc/en-us/articles/360055259612-Supported-SlideShare-file-formats-and-sizes">300MB limit</a> and Wikimedia Commons allows 100MB by default, or up to 5GB when using <a href="https://commons.wikimedia.org/wiki/Help:Chunked_upload">chunked uploading</a>. 
Zenodo has a default <a href="https://about.zenodo.org/policies/">50GB file size limit&nbsp;</a> for each publication, which is more than enough for my needs.</p> <p><em>Example:</em> Hosting the 397 MB video <a href="https://zenodo.org/records/13268860"><em>Wikidata in de praktijk bij de Koninklijke Bibliotheek, masterclass Wikidata (28-05-2021)</em></a> is well within Zenodo’s file size limits.</p> <p><strong>4) Restricting access to files</strong> and <strong>saving drafts </strong>are <a href="https://help.zenodo.org/docs/deposit/about-records/#access">features in Zenodo</a> that allow editors to work on (drafts of) records in a closed-off environment before releasing them to the public. Wikimedia Commons does not offer this, as every file there is publicly accessible, and under version control right from the start, making it unsuitable for drafting temporary or permanent non-open publications.</p> <p><em>Example:</em> I’m currently preparing a <a href="https://theplant.maastrichtuniversity.nl/event/navigating-the-world-of-wikidata-for-research-science-and-cultural-heritage-2/">Wikidata introduction workshop for Maastricht University</a> and already uploaded some draft materials for this event to Zenodo. 
When the preparations are finished, I will openly share the files with the participants via <a href="https://doi.org/10.5281/zenodo.11224632">https://doi.org/10.5281/zenodo.11224632</a> &nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/zenodo_maastricht_15okt2024.png?itok=eTuhWt3r" width="796" height="681" alt="Screenshot of Zenodo page of Olaf Janssen with 'Introduction to Wikidata for Maastricht University, theory and practice, 15 October 2024 article prominently featuring.."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot of restricted draft materials for an upcoming Wikidata workshop on 15 October 2024. Files will be accessible via </em><a href="https://doi.org/10.5281/zenodo.11224632"><em>https://doi.org/10.5281/zenodo.11224632</em></a><em> after that date.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>5) Wider licensing options</strong> compared to Wikimedia Commons and SlideShare. By default, I publish and share my content under open licenses, either CC-BY, CC-BY-SA or even CC0. 
This makes my publications <a href="https://commons.wikimedia.org/wiki/Commons:Licensing#Acceptable_licenses">suitable for Wikimedia Commons</a> as well as <a href="https://support.scribd.com/hc/en-us/articles/360055664591-Your-SlideShare-content-settings-and-licensing">for SlideShare</a>.</p> <p>However, in some cases, I would like to have a bit more control over derivative or commercial reuses of my works, for which CC-BY-ND and/or CC-BY-NC licensing are appropriate instruments. This makes these kinds of works unsuitable for sharing on Wikimedia Commons, but SlideShare would still be an option, as these more restrictive licenses are allowed there.</p> <p>Zenodo accommodates all of the above scenarios, as it allows for both very open (CC0) and less open (CC-BY-NC/ND) licenses, as well as over <a href="https://spdx.org/licenses/">600 alternative options</a>, both for presentation-oriented and data-oriented publication. So in the vast majority of cases, there is always a suitable license to choose from.</p> <p>The only exception is the ‘All rights reserved’ case. It is allowed by SlideShare, but not on either open-oriented platform. 
However, I can use this limitation of Zenodo and Commons as a stimulant to always create publications that are suitable for sharing (be it under a CC-BY-ND/NC clause) and reuse when possible (without the ND clause).</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/spdx.png?itok=CUk8yLij" width="1023" height="737" alt="Screenshot of licenses used by Zenodo"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>List of over 600 licensing options used by Zenodo, from </em><a href="https://spdx.org/licenses"><em>https://spdx.org/licenses</em></a></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>6) Adhering to FAIR principles.</strong> Zenodo’s <a href="https://about.zenodo.org/principles/">Principles page</a> explains in detail how the repository adheres to the <a href="https://www.go-fair.org/fair-principles/">FAIR principles</a> of being Findable, Accessible, Interoperable and Reusable.</p> <p>This makes Zenodo more FAIR-compliant than Wikimedia Commons. 
Although all uploads get their own unique persistent M-identifiers (so-called Concept URIs, such as <a href="https://commons.wikimedia.org/entity/M144836996">https://commons.wikimedia.org/entity/M144836996</a>), these play a less visible and significant role than their DOI equivalents in Zenodo. Multilingual findability is limited by the English-oriented <a href="https://commons.wikimedia.org/wiki/Commons:Categories">category names</a> used to group files.</p> <p>On the positive side, metadata of each upload is indexed and searchable via the Wikimedia Commons search engine immediately after publishing. Additionally, all files and their metadata can be requested for free via a set of <a href="https://www.mediawiki.org/wiki/API">REST-API</a>s without the need for a user account.</p> <p>However, interoperability is not ideal, as Commons uses metadata standards that are not necessarily compatible with ontologies in the cultural heritage and humanities domains. Commons also does not offer support for standards like IIIF, although <a href="https://commons.wikimedia.org/wiki/Commons:International_Image_Interoperability_Framework">some attempts</a> have been made in the past. All Commons files can be reused because open (Creative Commons) licenses are mandatory.</p> <p>Finally, thanks to the <a href="https://commons.wikimedia.org/wiki/Commons:Structured_data">Structured Data on Commons</a> effort, Wikidata has been integrated into Commons. This means all <a href="https://ipres2019.org/static/pdf/iPres2019_paper_135.pdf">FAIR strong points of Wikidata</a> (<em>“Wikidata fulfills the most complete degree of FAIRness”</em>) are now exposed to Commons as well.</p> <p>As for SlideShare, it is a purely presentation-oriented and visual-first environment, with no explicit attention to being FAIR. For instance, it does not use unique and persistent identifiers for its uploads, nor does it offer rich and explicit metadata for them. 
Publications are not linked to open, external vocabularies, and you will need to have an account to be able to download content and their (very basic) metadata.</p> </div> </div> </div> , <div class="image-alignment--text-image"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="row flex-start order"> <div class="col-xs-12 col-md-6"> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>7) Durability for files, governance and funding: </strong>Zenodo was launched in May 2013 and is hosted as an embedded <a href="https://home.cern/news/news/computing/cern-software-become-central-hub-eu-research">service of CERN</a>, which has existed for 70 years and currently has programs defined for the next 20+ years. Files are stored with <a href="https://about.zenodo.org/policies/">longevity</a> as a primary focus. <a href="https://about.zenodo.org/infrastructure/">Additional checks and balances</a> with respect to the infrastructure, servers, storage, security, governance, financing, and legal status all contribute to future-proofing the repository.</p> <p>This is at least on par with the sustainability of Wikimedia Commons, which has been in operation for 20 years now. It is part of the <a href="https://wikimediafoundation.org/">Wikimedia Foundation</a> portfolio, and as such, is funded and governed by the <a href="https://foundation.wikimedia.org/wiki/Home">overall policies</a> of the Wikimedia movement, with durability, transparency, openness and reliability at its core. 
Long-term funding of its projects is secured by the <a href="https://wikimediaendowment.org/">Wikimedia Endowment</a>, aiming to support Wikimedia projects in perpetuity.</p> </div> </div> </div> </div> <div class="col-xs-12 col-md-6"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/cern_logo2.png?itok=vCJpeCNp" width="460" height="439" alt="CERN logo (white background with blue lined circles)"> </div> </div> </div> </div> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>As for SlideShare, it is operated by <a href="https://www.scribd.com/home">Scribd</a>, a U.S.-based privately owned company funded by <a href="https://en.wikipedia.org/wiki/Scribd#Financials" data-wp-title="Scribd#Financials" data-wp-lang="en">venture capital</a>. Obviously, this is not an ideal situation to ensure long term accessibility of content hosted on SlideShare. This was another reason to migrate my content from there onto Zenodo.</p> <p><strong>8) Contributing to international research: </strong>The KB is a scientific organization and, as such, part of the international science and research communities. 
As an employee of such an institution, Zenodo allows me to automatically integrate my publications into scholarly aggregators, such as <a href="https://orcid.org/0000-0002-9058-9941">ORCID</a>, <a href="https://commons.datacite.org/orcid.org/0000-0002-9058-9941">DataCite</a>, <a href="https://explore.openaire.eu/search/advanced/research-outcomes?f0=authorid&amp;fv0=0000-0002-9058-9941">OpenAIRE</a>, <a href="https://www.base-search.net/Search/Results?lookfor=olaf+Janssen+koninklijke+bibliotheek&amp;type=allus&amp;page=1&amp;l=en&amp;oaboost=1&amp;refid=dcpageen">BASE</a> or <a href="https://search.fid-benelux.de/Search/Results?lookfor=%22Olaf+Janssen%22&amp;type=Author">FID Benelux</a>, making them more visible for study and research purposes, without any extra effort on my part.</p> <p>Both Wikimedia Commons and SlideShare are not well integrated into academic workflows and don’t have strong user bases in the scholarly community. By exposing my Wikimedia-related publications to this target group, I hope to raise more awareness of the benefits that Wikimedia can bring to their fields of work.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/OPENaire_OlafJanssen.png?itok=xCirh8bS" width="1023" height="858" alt="Screenshot of an OpenAIRE Explore environment"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long 
field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot of the author’s publications as listed in OpenAIRE on 22 August 2024. All records were automatically imported from Zenodo, based on the </em><a href="https://orcid.org/0000-0002-9058-9941"><em>ORCID of the author</em></a><em>. (</em><a href="https://explore.openaire.eu/search/advanced/research-outcomes?f0=authorid&amp;fv0=0000-0002-9058-9941"><em>Source of image</em></a><em>).</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>9) Fully hosted on EU-based servers:</strong> Zenodo is powered by the <a href="https://home.cern/science/computing/data-centre">CERN Data Centre</a>, physically located in France. This means it is subject to CERN’s legal status as an intergovernmental organization with immunity from the jurisdictions of the national courts of the EU Member States (<a href="https://about.zenodo.org/infrastructure/">source</a>). This effectively safeguards Zenodo from being taken down due to legal procedures initiated by hostile agents.</p> <p>Although Wikimedia Commons <a href="https://wikitech.wikimedia.org/wiki/Data_centers">runs in data centers</a> partly located in the EU (Amsterdam and Marseille), the platform legally falls under U.S. jurisdiction, as its operator, the Wikimedia Foundation, is a nonprofit public charity under U.S. law.</p> <p><strong>10) The REST API and OAI-PMH interface </strong>allow for programmatic interactions with my publications. 
In accordance with the <a href="https://collectionsasdata.github.io/">Collections as Data</a> initiative, these APIs allow me to expose my “Publications as Data”, supporting computationally-driven sharing, research and teaching.</p> <p>Zenodo’s <a href="https://developers.zenodo.org/#rest-api">REST API</a> allows publication metadata to be requested directly in JSON format (<a href="https://zenodo.org/api/records/7665231">example</a>) that can be processed further via Python and other languages. For instance, with only a few lines of code, you can request all my publications <a href="https://public-paws.wmcloud.org/19781798/zenodo_records_olafjanssen_16082024.xlsx">as an Excel file</a>.</p> <p>Via Zenodo’s <a href="https://zenodo.org/oai2d?verb=Identify">OAI-PMH interface</a>, all publications in the <a href="https://zenodo.org/communities/kbnl/records?q=&amp;l=list&amp;p=1&amp;s=10&amp;sort=newest">KB community</a> can be requested in <a href="https://zenodo.org/oai2d?verb=ListMetadataFormats">various metadata formats</a>, including the library-oriented <a href="https://zenodo.org/oai2d?verb=ListRecords&amp;metadataPrefix=oai_dc&amp;set=user-kbnl">OAI_DC</a> or <a href="https://zenodo.org/oai2d?verb=ListRecords&amp;metadataPrefix=marc21&amp;set=user-kbnl">MARC21</a> schemes. This protocol is used by the scholarly aggregators listed in point 8) to ingest KB publications into their repositories.</p> <p>As for Wikimedia Commons, the MediaWiki software used by this platform offers a set of rich and free <a href="https://www.mediawiki.org/wiki/API">REST-API</a>s allowing for a wide range of machine interactions with my publications. Furthermore, thanks to the <a href="https://commons.wikimedia.org/wiki/Commons:Structured_data">Structured Data on Commons</a> effort, all my publications will soon be available as Linked Open Data that can be queried via the <a href="https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service">Wikimedia Commons SPARQL service</a>. 
Unfortunately, Commons does not offer an OAI-PMH interface.</p> <p>SlideShare <a href="https://www.linkedin.com/blog/engineering/archive/introducing-slideshare-api-explorer">used to have</a> a REST API, but it seems to have been <a href="https://groups.google.com/g/slideshare-developers">inactive for several years</a> now.</p> <h3>Downsides of Zenodo</h3> <p>Given all these positive features, you might think there are no downsides to sharing your presentations, articles, tutorials, and other publications via Zenodo. Well, of course there are! Coming from SlideShare and Wikimedia Commons backgrounds, three drawbacks of Zenodo I’ve personally encountered are:</p> <p><strong>1) Zenodo does not have a vivid community</strong>. One of the truly unique features of Wikimedia Commons is its very vivid community. This community encompasses the <a href="https://commons.wikimedia.org/wiki/Commons:Tools">tools</a>, events, meetings, <a href="https://commons.wikimedia.org/wiki/Commons:Help_desk">help desk</a>, <a href="https://commons.wikimedia.org/wiki/Commons:Village_pump">village pump</a>, <a href="https://commons.wikimedia.org/wiki/Commons:Discussion_pages">discussions</a>, documentation, and <a href="https://commons.wikimedia.org/wiki/Commons:Community_portal">all other things</a> Wikimedians have collectively created to make Commons such a valuable <a href="https://en.wikipedia.org/wiki/Social_machine?wprov=wppw2">social machine</a>. 
Zenodo lacks such a group of external contributors advancing and supporting its development.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Wikiman%C3%ADa_d%C3%ADa_4_-_PK_-_19.jpg?itok=ixBjGLWb" width="1024" height="768" alt="Photo of a large group of smiling people at a conference. All have keycards around their necks."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Group of Wikimedians during Wikimania 2024. </em><a href="https://commons.wikimedia.org/wiki/File:Wikiman%C3%ADa_d%C3%ADa_4_-_PK_-_19.jpg"><em>Source</em></a></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>2) Zenodo is not well known among regular web users</strong>, as it is not part of the mainstream internet. Compared to SlideShare, still <a href="https://www.semrush.com/website/slideshare.net/overview/">ranked in the top 600</a> of most visited websites worldwide, Zenodo is <a href="https://www.semrush.com/website/zenodo.org">far more long-tail</a> and niche (ranked 55K+). 
This lowers the chances for my publications to be spontaneously discovered via search engines or social media by general web users.</p> <p>This disadvantage is however mitigated by my publications also being available on Wikimedia Commons, which has a much better exposure to search engines compared to Zenodo. This makes my presentations findable outside the scientific communication and publication channels.</p> <p>Furthermore, most of my content is not aimed at the general public, but rather at professional, specialized target groups, including library and digital heritage professionals, data-oriented scholars (digital humanities), and the Wikimedia communities. These types of users will probably be more likely to visit Zenodo (and Wikimedia Commons) than SlideShare when looking for this sort of content.</p> <p><strong>3) Zenodo does not offer a visual-first user experience. </strong>This is especially apparent when looking at how search results for the query “koninklijke bibliotheek” are displayed on SlideShare versus Zenodo. 
The <a href="https://www.slideshare.net/search?q=koninklijke+bibliotheek">default gallery view</a> used by SlideShare is visually more appealing than the <a href="https://zenodo.org/search?q=koninklijke%20bibliotheek&amp;l=list&amp;p=1&amp;s=10&amp;sort=bestmatch">text-oriented display</a> of Zenodo’s results.</p> <p>For the publications themselves, a <a href="https://www.slideshare.net/slideshow/ing-huygens-delpherdata20140107ojforupload/29730483?from_search=2">typical SlideShare page</a> offers a much stronger visual-first experience than a <a href="https://zenodo.org/records/12569311">typical Zenodo record</a> page, which contains much more manifest textual (metadata) elements.</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/zenodo-textual.png?itok=mZj4FifA" width="797" height="883" alt="Zenodo screenshot showing the result for searching on 'koninklijke bibliotheek'. 
The search results are all about Wikipedia and written by Olaf Janssen."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>For the query “koninklijke bibliotheek” the text-oriented </em><a href="https://zenodo.org/search?q=koninklijke%20bibliotheek&amp;l=list&amp;p=1&amp;s=10&amp;sort=bestmatch"><em>search results on Zenodo</em></a><em> do not offer the same visual-first user experience as the </em><a href="https://www.slideshare.net/search?q=koninklijke+bibliotheek"><em>same query on SlideShare</em></a><em>.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>Of course, these differences are hardly surprising because Zenodo was built as a scientific data-focused service, giving metadata the same (or even higher) priority as the object. But for users coming from the visual-first background of SlideShare, where most textual metadata is rather hidden, the differences are striking.</p> <p>Fortunately, using its REST API and some Python code, it would be very feasible to generate a thumbnail gallery view for all my publications in Zenodo, mimicking the <a href="https://commons.wikimedia.org/wiki/GLAM_at_Koninklijke_Bibliotheek/Archive">KB Wikimedia presentations &amp; publications archive</a> discussed at the start of this article.</p> <h3>Summary and conclusion</h3> <p>In this article, I describe my journey in transitioning from SlideShare to Zenodo for hosting and sharing my professional Wikimedia-related content, while continuing to maintain KB’s presence on Wikimedia Commons.</p> <p>As my publishing requirements evolved over time, the need for a more robust, versatile, and research-oriented platform became clear. 
While SlideShare initially met my basic needs for visibility and discoverability, its limitations—such as a narrow selection of file formats, lack of advanced metadata support, and concerns about long-term accessibility—prompted me to look for a better alternative.</p> <p>Zenodo emerged as the optimal choice due to its EU-based hosting, support for a wide range of file types, and strong alignment with the FAIR principles of data management. The platform’s ability to mint DOIs, manage large files, and integrate with academic repositories, along with its multi-level sustainability and machine interaction capabilities, significantly enhances the discoverability and reuse of my work within the international research community.</p> <p>Despite some drawbacks, such as the lack of a vivid community and its less visual-first user experience compared to SlideShare, Zenodo’s strengths in durability, flexibility, and scholarly integration far outweigh these concerns.</p> <p>In conclusion, Zenodo, in conjunction with Wikimedia Commons, provides a powerful and sustainable solution for managing and disseminating my professional outputs. By utilizing both platforms synergistically, I can ensure that my work remains accessible, reusable, and impactful for a diverse range of audiences, from general web users to specialized academic communities.</p> </div> </div> </div> </description> <pubDate>Mon, 09 Sep 2024 13:46:52 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">391 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>Link analysis sample set</title> <link>https://lab.kb.nl/dataset/link-analysis-sample-set</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em>In 2022 and 2023 two researchers started working with the KB web collection for the first time in the history of our web collection. 
This triggered the need for analysing the web-data within the WARC container files we hold. The result of this: datasets with hyperlinks. On this page you can find an example of the kind of dataset we used for their research. As an example, a kb.nl harvest was used.&nbsp;</em></p> <h3>Analysing hyperlinks in the KB web collection</h3> <p>I first extracted hyperlinks myself, in preparation for the arrival of Researcher in Residence <a href="https://lab.kb.nl/person/dr-karin-de-wild" data-entity-type="node" data-entity-uuid="76b60cac-3632-4e22-ba94-235767efc94d" data-entity-substitution="canonical" title="Dr. Karin de Wild ">Karin de Wild</a> back in 2022. You can read how I extracted the hyperlinks and worked with them in Gephi in the blog series ‘<a href="https://lab.kb.nl/about-us/blog/analysing-hyperlinks-kb-web-collection" data-entity-type="node" data-entity-uuid="b0fa4f08-d7b6-47c2-9a57-5889c49e886e" data-entity-substitution="canonical" title="Analysing hyperlinks in the KB web collection">Analysing hyperlinks in the KB web collection</a>’.&nbsp;</p> <p>I learned a lot from this first experiment. The main thing I discovered was that I wanted to have the tags as well as the hyperlink in my dataset. They tell a lot about the type of hyperlink you extract and are needed if you want to do a thorough analysis. I also learned that the way you crop your hyperlink is <a href="https://lab.kb.nl/about-us/blog/wait-where-did-hyves-go-link-analysis-part-4" data-entity-type="node" data-entity-uuid="01e229d2-61af-48f3-91f0-4877c4c9688b" data-entity-substitution="canonical" title="Wait... Where did Hyves go?! - Link analysis part 4">very important for the outcome of your hyperlink analysis</a>. 
I applied those lessons when <a href="https://lab.kb.nl/person/jesper-verhoef" data-entity-type="node" data-entity-uuid="3d017eef-e63a-4571-a022-028e878e7d82" data-entity-substitution="canonical" title="Jesper Verhoef">Jesper Verhoef</a> became our Researcher in Residence in 2023. Changes to our original extraction script improved the dataset.&nbsp;</p> <blockquote><p>Fast forward to the end of 2022 and&amp;nbsp;&lt;/em&gt;&lt;a href="<a href="https://www.kb.nl/en/news/kbs-xs4all-web-collection-unesco-world-heritage-list">https://www.kb.nl/en/news/kbs-xs4all-web-collection-unesco-world-heritage-list</a>"&gt;&lt;em&gt;this collection was recognized as UNESCO heritage&lt;/em&gt;&lt;/a&gt;&lt;em&gt;. But… Was Kees right?&nbsp;</p> </blockquote> <p><em>Example of how a hyperlink is embedded in a website (and therefore in a WARC file). Source: </em><a href="https://lab.kb.nl/about-us/blog/no-longer-xs4all" data-entity-type="node" data-entity-uuid="d3ec2356-bfe2-4c62-8bdd-d68fbaee0b44" data-entity-substitution="canonical" title="No longer XS4ALL"><em>No Longer XS4ALL</em></a><em>.&nbsp;</em></p> <p>&nbsp;</p> <h3>The dataset</h3> <p>For those interested, I made a sample set based on a crawl of the kb.nl website:</p> <table> <tbody> <tr> <td>Target Instance ID</td> <td>38712298</td> <td>&nbsp;</td> </tr> <tr> <td>Target Name</td> <td>Koninklijke Bibliotheek</td> <td>&nbsp;</td> </tr> <tr> <td>Schedule start</td> <td>06/07/2024 18:57:00</td> <td>&nbsp;</td> </tr> <tr> <td>URLs downloaded</td> <td>175.115</td> <td>limit: 700.000</td> </tr> <tr> <td>Data downloaded</td> <td>21.03 GB</td> <td>limit: 50 GB</td> </tr> <tr> <td>Elapsed time</td> <td>12:14:54:33</td> <td>&nbsp;</td> </tr> <tr> <td>Seeds</td> <td> <p><a href="https://inschrijven.kb.nl/">https://inschrijven.kb.nl/</a>&nbsp;</p> <p><a href="https://galerij.kb.nl/">https://galerij.kb.nl/</a></p> <p><a href="https://www.kb.nl/"><strong>https://www.kb.nl/</strong></a><strong>&nbsp;</strong></p> <p><a 
href="https://collecties.kb.nl/">https://collecties.kb.nl/</a></p> </td> <td><strong>Bold = primary seed</strong></td> </tr> <tr> <td>Excluded from harvest</td> <td> <p>.*field_categories.*</p> <p><a href="http://acroeng.adobe.com/">http://acroeng.adobe.com/</a>.*</p> <p>.*f%5B0%5D=.*</p> <p>.*\.rss.*</p> <p>^[^/]+://[^/]*(youtube).*</p> <p>^[^/]+://[^/]*(facebook).*</p> <p>^[^/]+://[^/]*(google).*</p> </td> <td>&nbsp;</td> </tr> <tr> <td><strong>Important to note</strong></td> <td><strong>Harvest finished by operator</strong></td> <td>&nbsp;</td> </tr> </tbody> </table> <p><em>Table 1. Metadata from the kb.nl harvest.</em></p> <p>This target has four seeds, but the harvest was cut short by a quality assurance officer. This means only part of the selected seeds was harvested. Because kb.nl is the primary seed, the harvester started with that seed and did not get around to the others before it was manually stopped.&nbsp;</p> <h3>Result</h3> <p>You can find the whole dataset in csv under the ‘examples’ tab. But here (table 2) is a little sneak peek. In the datasets I used you first have the ‘source’ website: the website on which the link was found. After this comes the ‘target’: the website to which the link refers. The third column is the weight: how many times the source website refers to the ‘target’ website. Lastly we have the type of URL. As mentioned above, in the beginning this was only ‘anchor’ or ‘embedded’. After updating the extraction script we now record the specific tag which surrounds the link. In the KB dataset it is mostly ‘a’ for anchor link. 
But you can also find some images and frames in there.&nbsp;</p> <table> <tbody> <tr> <td><strong>Source</strong></td> <td><strong>Target</strong></td> <td><strong>Weight</strong></td> <td><strong>Type_URL_v2</strong></td> </tr> <tr> <td><strong>kb.nl</strong></td> <td>webggc.oclc.org</td> <td>167154</td> <td>a</td> </tr> <tr> <td><strong>kb.nl</strong></td> <td>collecties.kb.nl</td> <td>166452</td> <td>a</td> </tr> <tr> <td><strong>kb.nl</strong></td> <td>youtube.com</td> <td>166288</td> <td>a</td> </tr> <tr> <td><strong>kb.nl</strong></td> <td>delpher.nl</td> <td>155586</td> <td>a</td> </tr> <tr> <td><strong>kb.nl</strong></td> <td>webwinkel.kb.nl</td> <td>152806</td> <td>a</td> </tr> </tbody> </table> <p><em>Table 2. Example of the data found inside the kb.nl_2024_links dataset.</em></p> <p><em>It is important to note that this dataset does not reflect the kb.nl website as a whole. As mentioned above the harvest was stopped manually. But even if the harvest had been completed, it would only show what we have managed to archive with Heritrix and WCT and what we managed to extract with our </em><a href="https://lab.kb.nl/about-us/blog/lets-get-some-data-link-analysis-part-1" data-entity-type="node" data-entity-uuid="4718eb88-9a98-4fac-931f-e603e088639a" data-entity-substitution="canonical" title="Let's get some data - Link analysis part 1"><em>WARC-link-extraction script</em></a><em>.</em></p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" 
src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/kb.nl_most_linked.png?itok=KJacxW0r" width="1024" height="521" alt="Link web with kb.nl referring to a number of other websites. Some lines are thicker because kb.nl refers to them more often"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Figure 1: Example of what you can do with the dataset. In this case it shows the websites kb.nl most often refers to.&nbsp;</em></p> </div> </div> </div> </description> <pubDate>Tue, 27 Aug 2024 12:41:02 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">390 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>No longer XS4ALL</title> <link>https://lab.kb.nl/about-us/blog/no-longer-xs4all</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em>In early 2019, former curator of Digital Born Collections Kees Teszelszky raised the alarm: XS4ALL websites were at significant risk of disappearing from the live web. The cause? KPN announced the end of its subsidiary company XS4ALL. Multiple colleagues within the KB (myself included) rose to the challenge and started archiving these websites under supervision of a collection specialist. Fast forward to the end of 2022 and&nbsp;</em><a href="https://www.kb.nl/en/news/kbs-xs4all-web-collection-unesco-world-heritage-list"><em>this collection was recognized as UNESCO heritage</em></a><em>. But… Was Kees right? What happened to the XS4ALL websites on the live web? Join me as I investigate how the collection quietly started disappearing from the live web.</em></p> <p>I started looking into the XS4ALL websites in June 2023. 
The reason was that a colleague pointed out that KPN had announced that&nbsp;<a href="https://www.kpn.com/service/homepages.htm">it would stop hosting ‘home.kpn.nl’ pages</a>. This reminded me of the XS4ALL domain it also hosts. As I was involved in building the collection I was interested in what had happened to it on the live web, so I started investigating.</p> <h3><em>Do we still archive it?</em></h3> <p>First I checked how the collection was built and whether it was still being archived by us. As of June 2023 the collection comprised 3.261 websites. Because the KB has a selective collection, each website has its own database entry called a Target Record. This is why counting which ones are still active is an easy process. The collection was built during 2019/2020, with most websites archived in 2021 (graph 1). After that year the active portion of the collection started to diminish a little. This is because of our QA process: when a website is no longer online the Target Record is closed and the website is no longer harvested. In 2021 and 2022, 84 websites were closed. Not that many considering the thousands we had archived. 
So far so good.&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Graph%20archived%20websites.png?itok=BzOTsMAW" width="746" height="411" alt="Line Graph showing a steep growth in XS4ALL website in the KB collection with a little dip at the end."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Graph&nbsp;1: Number of XS4ALL websites archived by the KB.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3><em>What is actually archived?</em></h3> <p>Next I had a look at the harvest results. Because we have a selective archive, where we archive one website at a time, I can also easily check the metadata of the harvest result. For instance I can check the amount of bytes harvested. This is one of the criteria we use for our web collections quality assurance (QA). Four years of performing QA has taught our team that when a website is smaller than 1 MB, it should be checked whether it is still online. In these cases it is likely that the website is either offline or it has been migrated to another domain. 
However, this parameter applies to current <strong>modern</strong> websites. XS4ALL websites are much older, mostly dating from 1993 to 2005, and therefore smaller. Sometimes the website only consists of one or two pages! This is why for the XS4ALL websites I used a threshold of 1 KB for QA. Below that size (as experience teaches us) a website is 99% likely to be offline. I plotted the results in a graph, and started getting worried…</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Graph%20harvestresults.png?itok=-v3p0z0s" width="841" height="551" alt="Bar chart showing 4 bars with a little red and a lot of green. The last bar is 60 percent red."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Graph 2: The websites which are probably offline (smaller than 1 KB) have been coloured red. The websites which are probably still online (larger than 1 KB) have been coloured green. 
This graph uses percentages of the total instead of absolute numbers, so that years can be compared.&nbsp;</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>I found the difference between 2019/2020 and 2021/2022 already pretty notable. The number of unsuccessful harvests increased significantly (in percentages). But in the first 6 months of 2023 <strong>more than</strong> <strong>half of the harvested websites failed to harvest successfully.&nbsp;</strong>In absolute numbers this is about a third of the whole collection. I verified this on the live web and found that 95% of the websites with less than 1 KB harvested were indeed gone.</p> <h3><em>Logfiles: Http(s) response code</em></h3> <p>At the end of 2023/early 2024 we received a research proposal to study our XS4ALL collection. To assist the researcher I returned to this project after a short break. Up until this moment I had only looked at metadata from our web archiving software, the&nbsp;<a href="https://webcuratortool.org/">Web Curator Tool</a>, but I felt this was not enough to help the researcher understand the quality of our XS4ALL collection. As I had determined: a lot of websites were still being harvested while they were no longer online. I wondered whether it was possible to get a better picture of the quality of each harvest.</p> <p>A colleague of mine had (at my request) previously examined a group of almost 19.000 XS4ALL homepages on the live web. Of this group almost 55% returned the well-known 404 (Page not found) response, matching my conclusions based on the metadata of 2023. This inspired me: what if we could get the status response from the WARC files (where the actual archived data is stored)? This would give us a better view of the state of a website during harvesting!</p> <p>Once again I turned to a teammate of mine and told him of my ideas. 
For practical reasons we used the logfiles (the crawl.log) of the Heritrix crawler instead of the WARC files. My colleague created a list with all harvested URLs from the XS4ALL collection based on WCT data. Next he extracted the first 20 lines of the crawl.log, where the response code was found, and matched it against the URLs from the WCT data. Now we had the response code of every harvested URL for each archived version! With these results I could once again analyse the quality of the collection.&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Graph%20responce%20codes%20v2.png?itok=rLgUm8o6" width="906" height="657" alt="Bar chart showing 3 bars with a lot of green. The fourth bar is 60 percent green, 15 percent yellow and 15 percent shades of red. The last bar is more or less 55 percent red and 45 percent green."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Graph 3: Overview of all found http response codes. 
Green marks websites that still existed; other colours mark websites that had either moved or were not reachable.&nbsp;</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3><em>Page not found…..</em></h3> <p>So again: in 2019/2020 everything was fine. Most of the websites were successfully harvested. This is important as it was the period where most websites were harvested for the first time as we were building the collection. In 2021 serious problems began to emerge: about 10% of the URLs returned a 403 – forbidden response, meaning the website was inaccessible. In 2022 the problems became more severe and diverse. Almost 40% of the websites were not harvested because of various http response codes. Besides a 403 response, websites started vanishing due to a 404 (not found) or a 302 (temporarily moved) response. The 302 response was an interesting one. Close reading of this group revealed that when the domain was an x.home.xs4all.nl/ URL, it usually redirected to a <a href="https://www.xs4all.nl/unknownuser/xs4all/x">https://www.xs4all.nl/unknownuser/xs4all/x</a> URL. 
You then got the ‘Oops, page not found’ 404 page from the main XS4ALL website.&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/XS4ALL%20unknownuser%20pagina.png?itok=UEpozPhY" width="1300" height="880" alt="Webpage in KB webarchive Wayback Machine: XS4ALL website with the text 'Oeps... pagina niet gevonden. 404 - de pagina bestaat niet meer of heeft u misschien een typefout gemaakt?&quot;"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Screenshot 1: Screenshot of an XS4ALL website from May 2022. The website redirects to a page which states that the website is no longer online. Made from the KB Wayback Machine.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>By 2023, the collection's state on the live web had deteriorated significantly. The websites on the live web which were unreachable in 2022 (because of the 302 and 403) now returned the general 404 – Not Found response code, marking the end of the websites. More than half of the websites we tried to harvest had disappeared from the live web.&nbsp;</p> <h3><em>But…. 
Why?</em></h3> <p>What could be the reason for the disappearance of these websites? Is KPN responsible? Since 2022 KPN has been integrating XS4ALL in its own infrastructure. Starting from 2023&nbsp;<a href="https://www.kpn.com/service/xs4all-overzicht.htm">XS4ALL began using KPN technology</a> and from then on creating a new XS4ALL homepage became impossible. This aligns with KPN's cessation of hosting its homepages&nbsp;(http://home.kpn.nl/ websites). Or is it because of the new competitor, “Freedom Internet”? It was founded on November 11, 2019, in response to KPN's takeover of XS4ALL, and customers might have moved there, ending their XS4ALL accounts.</p> <p>Whatever the reason: I am very happy that Kees sounded the alarm back in 2019. Because even though many websites are now disappearing from the live web, the KB still has an extensive collection of XS4ALL websites safely stored in its archive.&nbsp;</p> <p>&nbsp;</p> <h3><em>Related research</em></h3> <p><strong>When Online Content Disappears</strong></p> <p><em>38% of webpages that existed in 2013 are no longer accessible a decade later &amp; Methodology</em></p> <p>By Athena Chapekis, Samuel Bestvater, Emma Remy and Gonzalo Rivero.&nbsp;</p> <p><a href="https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/">https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-di…</a></p> <p><a href="https://www.pewresearch.org/data-labs/2024/05/17/methodology-link-rot/">https://www.pewresearch.org/data-labs/2024/05/17/methodology-link-rot/</a></p> <p>&nbsp;</p> </div> </div> </div> </description> <pubDate>Thu, 25 Jul 2024 15:24:01 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">381 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>Continuing the pop-up and movable books research </title> <link>https://lab.kb.nl/about-us/blog/continuing-pop-and-movable-books-research</link> <description> <div class="component-text"> <div 
class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>In&nbsp;<a href="https://lab.kb.nl/about-us/blog/experiencing-pop-and-movable-books-through-mixed-reality-technology">my previous blog post</a>, I introduced my research project on characterising the materials experience of pop-up and movable books, and designing digitally-mediated interactions with their digital representations. In this post, I will give an update on all activities since the last post. The best news is that our paper describing this research has since been accepted and was presented last week at the world’s largest peer-reviewed human-computer interaction conference, called CHI (pronounced ‘kai’). The full paper (describing the complete research project) can be found&nbsp;<a href="https://doi.org/10.1145/3613904.3642142">here</a>, and can be cited as follows:</p> <p><em>Willemijn S. Elkhuizen, Jeff Love, Stefano Parisi, and Elvin Karana.&nbsp;2024. On the Role of Materials Experience for Novel Interactions with Digital Representations of Historical Pop-up and Movable Books. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24). Association for Computing Machinery, New York, NY, USA, Article 619, 1–18.&nbsp;</em><a href="https://doi.org/10.1145/3613904.3642142"><em>https://doi.org/10.1145/3613904.3642142</em></a></p> <p>The paper provides many details on the case study, but here I will describe the process in a (hopefully) more accessible fashion.</p> <h3>Making sense of the data</h3> <p>The last blog left off at the end of the data collection and preliminary data analysis phase. So let’s pick up from there. After the observation and interview data was collected, three great student research assistants (and some AI transcription tooling) supported me with the transcription and coding of the data. 
For the data coding we made use of a software tool called Atlas.ti, which allows you to combine and annotate all kinds of data types (incl. video, audio, and text files).&nbsp;</p> <p>For the data coding we used a theoretical thematic analysis approach, meaning that we searched for themes in the data, guided by a pre-existing theoretical framework. In our case we made use of the <em><strong>materials experience framework</strong></em> (<a href="https://doi.org/10.1145/2702123.2702337">Giaccardi and Karana, 2015</a>), which was initially designed to be used in the context of <em><strong>material-driven design&nbsp;</strong></em>(<a href="https://www.ijdesign.org/index.php/IJDesign/article/view/1965">Karana et al, 2015</a>), but was transferred to the cultural heritage context for the first time in this project. In particular, we made use of the framework’s materials experience <strong>levels</strong>, which recognize that materials experience takes place on four different interrelated levels, namely the:&nbsp;</p> <ul> <li>Performative level, i.e. what actions the books evoke.</li> <li>Sensorial level, i.e. how the materials are sensed, for instance through vision, sound, and touch.</li> <li>Affective level, i.e. the emotions elicited by the books, such as joy, boredom, excitement.</li> <li>Interpretive level, i.e. 
the meanings ascribed to the books, such as old, fragile, and expensive.&nbsp;</li> </ul> <p>We annotated the observational videos and interview data on the different interactions, sensorial characteristics, emotions and associations that could be observed or that were mentioned by the participants.&nbsp;</p> <h3>Interactions with pop-up and movable elements</h3> <p>In the video below, you can see several examples of the diverse interactions participants had with the physical books.&nbsp;</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <video controls width="800" height="500" muted loop autoplay> <source src="https://github.com/KBNLresearch/KBNLresearch.github.io/raw/main/Video1_PerformativityExamples.mp4" type="video/mp4"> </video> </div> </div> <div class="source"> <em> <sub> <div class="component-text"> <div class="text__content"> </div> </div> </sub> </em> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>Once we had identified all the different interactions, they were grouped per type of interactive element, such as pop-up structures, pull tabs, or rotation wheels. See figure 1 for some examples of the different interactions with pop-up elements. In this diagram we also added when specific sequences of actions could be observed (e.g. repetitive actions), and whenever actions had notable temporal characteristics (e.g. 
taking place fast or slow).</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Image1_PerformativityBooks.png?itok=6RURIE5v" width="1300" height="1214" alt="Showing people interact with a pop-up book. Lot's of different small images."> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Figure&nbsp;1: Interactions with pop-up elements in the books (complete image can be viewed in the paper).</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>Similarly, we identified and grouped the different interactions that could be observed with the pop-up and movable books in VR (see figure 2). These interactions together can be denoted as the performativity of the books (i.e. 
one of the four materials experience levels).</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Image2_PerformativityVRBooks.jpg?itok=HCQ8rQ0_" width="1300" height="716" alt="Lot of different little pictures showing a woman interacting with a virtual reality pop-up book. "> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Figure 2: interactions with the virtual pop-up and movable books</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Paper qualities exploited in pop-up and movable books</h3> <p>Next, we tried to understand how all these identified characteristics relate to the different <strong>material</strong> <strong>qualities</strong> of pop-up and movable books. 
Through an iterative process we arrived at five key qualities of paper exploited in pop-up and movable books, namely:</p> <ul> <li><strong>Fold-ability</strong>, related to the ability of paper to fold from a flat page into 3D structures, which is associated with repetitive actions, associations such as it being ‘like an explosion’, and emotions such as amusement, admiration, but also boredom.</li> <li><strong>Slide-ability</strong>, related to the ability of paper parts to slide in front of and behind each other, also eliciting&nbsp;repetitive actions, being described as ‘toylike’, ‘playful’, ‘nostalgic’, and ‘intuitive’, and eliciting emotions like amusement.</li> <li><strong>Tear-ability</strong>, related to the ability of paper to damage (easily), associated with&nbsp;ripping &amp; scratching sounds, slow page turning, fear of damaging, and the association that the books are actually not (very) suitable for children.</li> <li><strong>Age-ability</strong>, related to paper’s ability to show signs of use and wear, which participants associate with ‘fragileness’, ‘old’, visual imperfections, and an (absence of) old-book smell.&nbsp;</li> <li><strong>Print-ability</strong>, related to paper’s ability to be printed (with letters and images), which is associated with (not) reading, colorful and nostalgic associations, leading to sensory pleasure, but also sometimes experienced as overwhelming and confusing.&nbsp;</li> </ul> <p>These characteristics were summarized in five diagrams. Figure 3 shows one of them: it summarises the materials experience characteristics and their interrelationships (if identified) for the <strong>fold-ability</strong> quality.&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div 
class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/image%203a%20four%20levels%20material%20experience.jpg?itok=GINXYIud" width="1300" height="1300" alt="a semi circle schematic with four levels"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Figure 3: Materials experience on four levels, relating to the fold-ability of paper, as used in pop-up and movable books.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Designing a mixed reality and two VR experiences</h3> <p>Based on these characterisations, several ideas for demonstrators were conceptualised. I created one mixed reality experience, and two master graduation students both created their own VR experience as part of their master graduation project, one of which is also featured in the paper. Here, I must also acknowledge Jeroen Boots and Yosua Andoko from the TU Delft XR zone, who supported us all greatly, with the development of all the prototypes. I will explain the mixed reality demonstrator in a bit more detail here.</p> <p>Firstly, we digitized the book ‘Tip+top boven de wolken’, from 1964, by Vojtěch Kubašta, and used this as a basis for all the prototypes. Initially, we took a whole series of photographs of the original book from the national library collection. We then tried to reconstruct the 2D planes (i.e. 
the non-deformed, flat paper cut-outs) that make up the 3D shapes in the book (we had already reasoned that 3D scanning is not suitable in this case). Reconstructing the planes turned out to be rather challenging using this approach (they need to have the correct dimensions, otherwise the book cannot fold correctly). As a second attempt, we used an identical book, bought from an antiquarian bookshop, which we scanned on a flatbed scanner. As all the elements are cut from one page, this was achievable. From these scans we traced all the parts which make up the book (in Blender). These could then be animated in the Unreal Engine, to create the mixed reality and VR prototypes.&nbsp;</p> <p>In the mixed reality prototype we aimed to create a mimetic material experience, meaning the experience aims to mimic the interactions with the physical book. The XR prototype combines a dummy book, printed with fiducial markers, and a mixed reality headset (Figure 4a). As the reader opens the book and turns the pages, the pop-ups are visualised as overlays onto the physical pages (Figure 4b). The movable parts can be interacted with via the eye-tracking functionality, though we would have preferred to rely on more mimetic hand tracking (this did not work reliably enough). On three pages, we also explored enhancing different material qualities. On one page, we aimed to enhance the fold-ability by having parts of the page change color, linked to the opening angle of the page, to create surprise and trigger the repeated action of opening and closing pages (Figure 4c-d). On another page, we aimed to enhance the age-ability of the book: the user first sees a page restored to its original appearance, but with repeated interaction the pages slowly age and degrade (Figure 4e). 
Finally, we have a page which emphasises the tear-ability of paper: as the plane on the page flies off, you see and hear it ripping loose from the page (Figure 4f).&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/Image3_XR_demonstrator.png?itok=uYpG4rYN" width="871" height="1300" alt="A pop-up book with planes, combining physical and VR aspects"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>Figure 4: XRLibris, a mixed reality prototype of the book ‘Tip+top boven de wolken’, from 1964, originally written by Vojtěch Kubašta.</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>An early version of this prototype was demonstrated at the DH Benelux conference, held in June 2023 in Brussels.&nbsp;</p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <video controls width="800" height="500" muted loop autoplay> <source src="https://github.com/KBNLresearch/KBNLresearch.github.io/raw/main/Video2_XRDemonstrator_v2.mp4" type="video/mp4"> </video> </div> </div> <div class="source"> <em> <sub> <div class="component-text"> 
<div class="text__content"> </div> </div> </sub> </em> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><h3>Implications, limitations and future work</h3> <p>With the characterisation and the development of these prototypes, we show that the Materials Experience Framework can be transferred well from the (material) design context and can also be used in the context of cultural heritage. We see this case study as a first step towards a possible material-driven design methodology for the augmentation of tangible cultural heritage.&nbsp;</p> <p>Some of the challenges we encountered relate to disentangling the different material qualities, and to separating experiences triggered by the books’ materiality from other sources of experience (such as the text or depictions eliciting associations and emotions).&nbsp;</p> <p>Finally, we must acknowledge some limitations of this case study. In the demonstrators we do not tackle the practical challenges that need to be resolved for actual implementation of such an experience in a library or museum context. This includes the important aspect of enabling social interaction, which is currently not (well) addressed but deemed important in cultural heritage experiences (i.e. people like to share cultural heritage experiences, or experience things in larger groups).&nbsp;</p> <p>Another (potential) limitation is that we used surrogate books to characterise the materials experience. This might not be possible or viable for many other heritage artifacts (i.e. we might not have surrogates available, or they might be too expensive to create).&nbsp;</p> <p>We hope to extend this work in the future, for instance by exploring other ways of characterising materials experience, and by expanding the characterisation beyond the ‘use’ of the CH artifacts (i.e. also looking at the role of materiality in the making or conservation of CH artifacts). 
We also aim to expand on the design phase of the methodology, exploring how designers might be better supported in translating the materials experience insights into meaningful, digitally-mediated experiences. Finally, the evaluation of the prototypes on their materials experience is currently ongoing, which might provide us with further insight into how well we are able to capture and translate the materials experience in the end.&nbsp;</p> <h3>Thanks to…&nbsp;</h3> <p>I would really like to thank the KB for providing me with this opportunity to work with their collections and people, which led to this fantastic result. I really enjoyed working at the KB and with people from the KB, and getting an insight into all the other research topics they work on. This work has really helped me to shape my further research directions.&nbsp;</p> <p>I would also like to thank my co-authors for their excellent contributions to the paper, which not only made the research better content-wise but also greatly helped to articulate it in writing. 
Finally, I would like to thank the heirs of Vojtěch Kubašta, who kindly let us use his work for the XR demonstrators, the KB imaging department, and the XR Zone for helping us realise the demonstrators.&nbsp;</p> </div> </div> </div> </description> <pubDate>Mon, 15 Jul 2024 10:52:27 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">378 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>From Hyperlinks to Queer Histories: Uncovering LGBTQ Web Archives </title> <link>https://lab.kb.nl/about-us/blog/hyperlinks-queer-histories-uncovering-lgbtq-web-archives</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em>In 2023, as </em><a href="https://lab.kb.nl/person/jesper-verhoef" data-entity-type="node" data-entity-uuid="3d017eef-e63a-4571-a022-028e878e7d82" data-entity-substitution="canonical" title="Jesper Verhoef"><em>KB Researcher-in-Residence</em></a><em> and </em><a href="https://www.cais-research.de/fellows/jesper-verhoef/"><em>fellow</em></a><em> at the Center for Advanced Internet Studies (CAIS), I started to study the unique collection of archived LGBT+ websites the National Library of the Netherlands (KB) holds. In my </em><a href="https://lab.kb.nl/about-us/blog/analyzing-lgbt-web-archive-data-preservation-preparation" data-entity-type="node" data-entity-uuid="1dc5b313-cd16-47df-b6c1-8e0299fd14cd" data-entity-substitution="canonical" title="Analyzing the LGBT+ Web Archive: From Data Preservation to Preparation"><em>previous blog</em></a><em> I outlined how I prepared a dataset. 
In this blog, I will briefly review steps subsequently taken and the future that lies ahead.&nbsp;</em></p> <p>With the help of <a href="https://lab.kb.nl/about-us/blog/lets-get-some-data-link-analysis-part-1" data-entity-type="node" data-entity-uuid="4718eb88-9a98-4fac-931f-e603e088639a" data-entity-substitution="canonical" title="Let's get some data - Link analysis part 1">a script developed by Hanna Koppelaar</a>, which we finetuned, we extracted all hyperlinks from the corpus of hundreds of LGBT+ websites (2008-2022). Analyses of this dataset – including network analyses, or historical hyperlink analyses, using Gephi – are ongoing, but I have already presented interesting findings at various venues and will continue to do so in the next years. Besides presenting internally at the KB and CAIS, I have shared my work at a Cultural Analytics Lab meeting in Amsterdam and at conferences in Prague, Vienna, Southampton, Paris, Hilversum and Utrecht. I made sure to present at conferences related to both the topic (e.g., the LGBTQIA Research Day 2023), web archives as a source (IIPC Web Archiving conference 2023 and 2024), and method (Sociohistorical Network Analysis conference 2024). For a full overview, see <a href="https://pure.eur.nl/en/persons/jesper-verhoef">here</a>.&nbsp;</p> <p>This is also the moment to look back and thank the team I worked with. 
The project would not have been the same without the close collaboration with <a href="https://lab.kb.nl/person/willem-jan-faber" data-entity-type="node" data-entity-uuid="fd716d2e-defc-46cc-864c-0f104e6578c7" data-entity-substitution="canonical" title="Willem Jan Faber">Willem Jan Faber</a>, <a href="https://lab.kb.nl/person/iris-geldermans" data-entity-type="node" data-entity-uuid="c9e91676-1243-4e74-98b1-67d32f81a90a" data-entity-substitution="canonical" title="Iris Geldermans">Iris Geldermans</a> and <a href="https://lab.kb.nl/person/michel-de-gruijter" data-entity-type="node" data-entity-uuid="8b72666a-fda9-4f20-81d0-ae9209fb086b" data-entity-substitution="canonical" title="Michel de Gruijter">Michel de Gruijter</a>. A highlight was <a href="https://informatieprofessional.maglr.com/hyperlinkanalyse-lhbt-websites">the article</a> that Iris and I wrote for the journal <em>Informatieprofessional </em>(in Dutch), which – also due to its beautiful design – is a wonderful example of what in current academic lingo is referred to as valorization: bringing scholarly work to a wider audience.&nbsp; Special thanks also go to all KB staff working on the web archive. You are instrumental in preserving the past, especially the histories of marginalized groups. As I argue in the first <a href="https://www.tandfonline.com/doi/full/10.1080/24701475.2024.2357897">peer-reviewed article</a> my project has resulted in, ‘Doing LGBTQ Internet Histories Justice: a Queer Web Archive Manifesto’ (<em>Internet Histories</em>, 2024), the KB has done a terrific job, and other libraries should follow suit.&nbsp;&nbsp;</p> <p>The article also argues that scholars should finally start using existing queer web collections. In the next years, I will continue to do so and, thus, hope to inspire others and prompt more web archive research. 
More articles will follow, e.g., on the networks that religious LGBTQ websites formed, as well as on the relations between websites of transgender and gay organizations. I am excited that the Netherlands Organization for Scientific Research <a href="https://www.eur.nl/en/eshcc/news/jesper-verhoef-receives-nwo-sgw-open-competition-grant">(NWO) has recently awarded my project</a> ‘A Marriage of Convenience? A Web-Historical Analysis of the Relations between Transgender and Gay Organizations (2008-2023)’ an NWO SGW Open Competition grant for excellent, curiosity-driven research. The acronym LGBT suggests a natural alliance between lesbian, gay, bisexual, and transgender people. However, it masks the fraught history between ‘LG’ and the only recently added ‘T’. Although the Web provided a key battleground for gay/transgender relations, how these relationships have taken shape online is still unclear. To address this gap, this project is the first to study a web archive. Drawing on hyperlink analyses and interviews, it will provide quantitative and qualitative insights into the online ties and tensions between ‘LG’ and ‘T’ and contributes to the burgeoning field of web-historical research.&nbsp;</p> <p>In conclusion, then, my KB Researcher-in-Residency has been a crucial steppingstone for the prolonged project I will work on, which aims to highlight the online pasts of queer and other underrepresented individuals, the historical value of web archives, and the many opportunities that Digital Humanities and computational techniques offer for analyzing this data.&nbsp;&nbsp;</p> </div> </div> </div> </description> <pubDate>Thu, 20 Jun 2024 13:41:08 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">379 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>Verhoef receives NWO SGW Open Competition grant</title> <link>https://lab.kb.nl/news/verhoef-receives-nwo-sgw-open-competition-grant</link> <description> <div 
class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><strong>Former Researcher in Residence </strong><a href="https://lab.kb.nl/person/jesper-verhoef" data-entity-type="node" data-entity-uuid="3d017eef-e63a-4571-a022-028e878e7d82" data-entity-substitution="canonical" title="Jesper Verhoef"><strong>Jesper Verhoef</strong></a><strong> has been awarded an NWO SGW Open Competition grant for his research on the history of relations between gay and transgender organisations.&nbsp;His project is the first to study a unique LGBT database, comprising 400 archived Dutch websites.&nbsp;</strong></p> <p>Read the full article:</p> <p><a href="https://www.eur.nl/en/eshcc/news/jesper-verhoef-receives-nwo-sgw-open-competition-grant">https://www.eur.nl/en/eshcc/news/jesper-verhoef-receives-nwo-sgw-open-competition-grant</a>&nbsp;</p> </div> </div> </div> </description> <pubDate>Mon, 10 Jun 2024 10:45:48 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">377 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>The remodeling of the KB Lab</title> <link>https://lab.kb.nl/news/remodeling-kb-lab</link> <description> <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em><strong>About two years ago the style of the website was renewed to match the new KB branding style. Now, we are going to tackle the layout.&nbsp;</strong></em></p> <p>As you might have noticed, a new row of content blocks has appeared on <a href="https://lab.kb.nl/">the front page</a> featuring blogs, events and news. This is one of many little changes that will happen throughout 2024 to make sure the website matches our current and future needs. 
A lot will stay the same, but hopefully the website will flow a little better and we will be able to better promote what is happening at the KB Lab at this very moment, such as <a href="https://lab.kb.nl/summerschool2024" data-entity-type="node" data-entity-uuid="d47e15a5-7e68-4e38-9c54-35b996b9774a" data-entity-substitution="canonical" title="KB Summerschool Digitale Collecties">the Summerschool</a> and the <a href="https://lab.kb.nl/news/opening-kbdatalab" data-entity-type="node" data-entity-uuid="d904bf6d-c020-4728-b0fd-9065fa603200" data-entity-substitution="canonical" title="Opening of the KB_Datalab">opening of the KB Datalab</a>.</p> <p>Please bear with us as we remodel our website!&nbsp;</p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/CK0067%20-%20Makersplaats%20-%2096x96%20KB%20LAB.png?itok=nt1L-TH6" width="1300" height="327" alt="A paintbrush and wrench crossing each other"> </div> </div> </div> </div> </div> </div> </description> <pubDate>Mon, 27 May 2024 15:32:56 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">376 at <a href="https://lab.kb.nl/"/></guid> </item> <item> <title>CHI NL Read: Cultural Heritage Artifacts and Material Experiences</title> <link>https://lab.kb.nl/news/chi-nl-read-cultural-heritage-artifacts-and-material-experiences</link> <description> <div class="component-text"> <div class="text__content"> <div 
class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p><em><strong>CHI NL, an organisation that connects, supports, and represents the Human-Computer Interaction (HCI) community in the Netherlands, has published an interview with our Researcher-in-Residence </strong></em><a href="https://lab.kb.nl/person/willemijn-elkhuizen" data-entity-type="node" data-entity-uuid="a39b14a0-07fc-4b6b-9ff4-acefa3ffae52" data-entity-substitution="canonical" title="Willemijn Elkhuizen"><em><strong>Willemijn Elkhuizen</strong></em></a><em><strong>. She has recently published a CHI 2024 paper called "On the Role of Materials Experience for Novel Interactions with Digital Representations of Historical Pop-up and Movable Books".&nbsp;</strong></em></p> </div> </div> </div> , <div class="paragraph paragraph--type--afbeelding paragraph--view-mode--rss"> <div class="component-text"> <div class="text__content"> </div> </div> <div class="field field--name-field-image field--type-entity-reference field--label-visually_hidden"> <div class="field__label visually-hidden">Afbeelding</div> <div class="field__item"><div> <div class="field field--name-field-media-image field--type-image field--label-visually_hidden"> <div class="field__label visually-hidden">Image</div> <div class="field__item"> <img loading="lazy" src="https://lab.kb.nl/sites/default/files/styles/max_1300x1300/public/images/3_Backup2_FindingASuitableBook.jpeg?itok=dwSIvsQL" width="975" height="1300" alt="Pop-up book with planes"> </div> </div> </div> </div> </div> <div class="field field--name-field-caption field--type-text-long field--label-visually_hidden"> <div class="field__label visually-hidden">Bijschrift</div> <div class="field__item"><p><em>[GIF] Selecting a suitable pop-up book from the KB collection for an XR experience, featuring ‘Tip+Top boven de wolken’ (1964) by Vojtěch Kubašta (1914-1992), from Willemijn’s blog post on the KB Lab: "</em><a 
href="https://lab.kb.nl/about-us/blog/experiencing-pop-and-movable-books-through-mixed-reality-technology" data-entity-type="node" data-entity-uuid="bb8bc0fb-06e2-4afb-aa16-f80e97cbcbdc" data-entity-substitution="canonical" title="Experiencing pop-up and movable books through mixed reality technology"><em>Experiencing pop-up and movable books through mixed reality technology</em></a><em>".&nbsp;</em></p> </div> </div> </div> , <div class="component-text"> <div class="text__content"> <div class="field field--name-field-body field--type-text-long field--label-hidden field__item"><p>You can read <a href="https://chinederland.nl/2024/04/chi-nl-read-cultural-heritage-artifacts-and-material-experiences/">the interview on the CHI NL</a> website or read her paper:&nbsp;</p> <p>Willemijn Elkhuizen, Jeff Love, Stefano Parisi, and Elvin Karana. 2024. On the Role of Materials Experience for Novel Interactions with Digital Representations of Historical Pop-up and Movable Books. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA. ACM, New York, NY, USA, 18 pages. <a href="https://doi.org/10.1145/3613904.3642142">https://doi.org/10.1145/3613904.3642142</a></p> </div> </div> </div> </description> <pubDate>Fri, 24 May 2024 12:22:57 +0200</pubDate> <dc:creator>Iris.Geldermans@KB.nl</dc:creator> <guid isPermaLink="false">375 at <a href="https://lab.kb.nl/"/></guid> </item> </channel> </rss>
