TEI 2019

What is text, really? TEI and beyond


Recogito: from Semantic Annotation to Digital Scholarly Edition

Rainer Simon

Keywords: semantic annotation, Named Entity Recognition, gazetteer, digital scholarly edition, TEI
Permalink: https://gams.uni-graz.at/o:tei2019.175

Rainer Simon

Gimena del Rio Riande (IIBICRIT, CONICET), Rainer Simon (AIT), Elton Barker (The Open University), Leif Isaksen (University of Exeter), Rebecca Kahn (Alexander von Humboldt Institut für Internet und Gesellschaft HIIG), Valeria Vitale (University of London), Antonio Rojas Castro (Cologne Center for eHumanities, Universität zu Köln), Hugh Cayless (Duke University)

Recogito (https://recogito.pelagios.org/) is a web-based environment for collaborative semantic annotation, developed by Pelagios (https://commons.pelagios.org). It is open source software that supports plain text (.txt extension) as well as TEI/XML-encoded text (.xml extension), and it allows users to export the results of their work in several formats, including RDF, TEI/XML and GeoJSON. The tool was originally designed with a focus on scholarly geographic annotation, i.e. the transcription, markup and geo-resolution of geographical documents such as itineraries, maps and travel reports. More recently, however, the feature set has been expanded to provide more general annotation functionality. Perhaps the most notable feature of Recogito is the ability to produce semantic markup without the need to work with formal languages directly. Through an easy-to-use interface, users can navigate digitized documents, create personal collections, add tags and comments, build up tagging vocabularies, and geo-resolve place references by linking them to gazetteers. Users can work either alone in a closed workspace or together in groups of collaborators. Recogito also makes it easy to apply Named Entity Recognition (NER) to TEI documents, with a choice of different recognition engines and authority files for entity resolution.
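To make the idea of a geo-resolved annotation more concrete, the following minimal Python sketch turns place annotations into a GeoJSON FeatureCollection. It is an illustration only and does not reproduce Recogito's actual data model or export schema: the `PlaceAnnotation` class, its field names, the `example.org` gazetteer URI and the approximate coordinates are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlaceAnnotation:
    """Hypothetical record for one geo-resolved place reference."""
    quote: str          # the annotated text span
    gazetteer_uri: str  # URI of the linked gazetteer entry (placeholder)
    lon: float          # longitude in decimal degrees
    lat: float          # latitude in decimal degrees
    tags: List[str] = field(default_factory=list)


def to_geojson(annotations):
    """Build a GeoJSON FeatureCollection with one Point feature per annotation."""
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [a.lon, a.lat]},
            "properties": {
                "quote": a.quote,
                "uri": a.gazetteer_uri,
                "tags": a.tags,
            },
        }
        for a in annotations
    ]
    return {"type": "FeatureCollection", "features": features}


# Invented sample data, for illustration only.
example = [
    PlaceAnnotation(
        quote="Buenos Aires",
        gazetteer_uri="https://example.org/gazetteer/1",
        lon=-58.38,
        lat=-34.60,
        tags=["city"],
    )
]
print(json.dumps(to_geojson(example), indent=2))
```

Keeping the gazetteer URI in the feature properties means the link to the authority record survives alongside the coordinates when the data is loaded into a mapping tool.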

In this demo we will show how Recogito can serve as an environment for the efficient creation of minimal digital editions. Starting from plain text source files, we will demonstrate the workflow for uploading content, creating semantic annotations, exporting to TEI, refining the markup, and publishing the results as an online digital edition. As a case study, we will present a geographically annotated corpus of early Argentinian texts. This edition was produced by semantically enriching the sources with references to an early colonial American gazetteer, funded in part through a Pelagios Resource Development Grant in 2017.
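The "refining the markup" step can start with a simple inventory of the place references in an exported file. The sketch below uses only Python's standard library and assumes that places are encoded as TEI `placeName` elements carrying a `ref` attribute with the gazetteer URI. That encoding, the abbreviated sample fragment (which omits the `teiHeader`) and the `example.org` URIs are assumptions for illustration; an actual export may be structured differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented, abbreviated sample; a real file would also contain a teiHeader.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>We left <placeName ref="https://example.org/gazetteer/1">Buenos Aires</placeName>
    for <placeName ref="https://example.org/gazetteer/2">Córdoba</placeName> and returned to
    <placeName ref="https://example.org/gazetteer/1">Buenos Ayres</placeName>.</p>
  </body></text>
</TEI>"""


def place_references(tei_xml):
    """Return (surface text, gazetteer URI) pairs for every placeName element."""
    root = ET.fromstring(tei_xml)
    refs = []
    for el in root.iterfind(".//tei:placeName", TEI_NS):
        # Join nested text nodes and normalise internal whitespace.
        text = " ".join("".join(el.itertext()).split())
        refs.append((text, el.get("ref")))  # ref is None if the attribute is absent
    return refs


refs = place_references(SAMPLE)
counts = Counter(uri for _, uri in refs)

for text, uri in refs:
    print(f"{text} -> {uri}")
for uri, n in counts.most_common():
    print(f"{n} reference(s) to {uri}")
```

Grouping by URI rather than by surface text shows when spelling variants, such as the two forms of Buenos Aires in the sample, point to the same gazetteer entry, which is a useful check before publishing an edition.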