Twilight Zones

Liminal Texts of the Long Turn of the Century (1880 - 1940)

Digital implementation

The liminal text corpus is encoded according to the Guidelines of the Text Encoding Initiative (TEI). The editorial model provides a transcription that captures two levels: the original text structure (e.g. headings, paragraphs, and lists), and a semantic level consisting of categories assigned to specific parts of the text (paragraphs, phrases, and words) that represent liminality.

A project-specific customization has been developed in the TEI ODD (One Document Does it all) format. This file contains both the formal schema description of the encoding mechanisms used in the project and user-friendly documentation of the encoding decisions.

Macro-structural level

On the macro-structural level, each individual text is encoded in a separate TEI document, represented by the <TEI> element. It contains a TEI header with the metadata describing the electronic document, and a <text> element representing the logical structure of the resource in question. The <div> element is used to represent divisions and subdivisions of the text. Since some of the texts are excerpts from compilations, the beginning of each new page is marked with the <pb> element, both to establish the connection to the original copy and to help users orient themselves. The @n attribute supplies the original page number. Paragraphs (<p>) and lists (<list>) are used to further structure the texts.
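
The macro-structure described above can be illustrated with a minimal sketch in Python, using only the standard library. The embedded TEI sample is invented for illustration and is not taken from the corpus; the page numbers, the nesting depth, and the helper names page_numbers and count_elements are assumptions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical sample document: one <TEI> root with a header and a <text>
# that uses <div>, <pb n="...">, <p> and <list>.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample text</title></titleStmt>
      <publicationStmt><p>Illustration only</p></publicationStmt>
      <sourceDesc><p>Invented sample</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <pb n="12"/>
        <p>First paragraph.</p>
        <list>
          <item>First item</item>
          <item>Second item</item>
        </list>
        <div>
          <pb n="13"/>
          <p>Paragraph in a subdivision.</p>
        </div>
      </div>
    </body>
  </text>
</TEI>"""


def page_numbers(element):
    """Return the @n values of all <pb> elements in document order."""
    return [pb.get("n") for pb in element.iterfind(".//tei:pb", NS)]


def count_elements(element, name):
    """Count descendant elements with the given TEI local name."""
    return sum(1 for _ in element.iterfind(f".//tei:{name}", NS))


root = ET.fromstring(SAMPLE)
text = root.find("tei:text", NS)

print("pages:", page_numbers(text))
print("divisions:", count_elements(text, "div"))
print("paragraphs:", count_elements(text, "p"))
```

Restricting the search to the <text> element keeps paragraphs inside the header out of the counts.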

TEI Header

The TEI header contains a <titleStmt> describing the electronic document. Besides the title and the author of the document, it records the individual responsibilities within the digital anthology project: corpus compilation, text correction, text analysis, creation of the data model, TEI encoding, and presentation. Information on the publisher, the distributor, and legal issues is provided in the <publicationStmt>. The <sourceDesc> supplies bibliographic descriptions of both the source from which the electronic text is derived and the first edition. In addition, the language and the individual revision steps are recorded. The categories actually used in the text are provided in the <standOff> element, a container for contextual information, which defines the classificatory codes applied to certain text passages.
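
As a rough sketch of how such header metadata could be read, the following Python snippet extracts the title, the author, and the responsibilities from an invented header. It assumes that responsibilities are encoded with <respStmt>, <resp>, and <name>, which is common TEI practice but is not confirmed by the description above; all names and values in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical header; the use of <respStmt> is an assumption.
SAMPLE_HEADER = """<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>Sample title</title>
      <author>Sample Author</author>
      <respStmt>
        <resp>TEI encoding</resp>
        <name>Encoder A</name>
      </respStmt>
      <respStmt>
        <resp>Text correction</resp>
        <name>Corrector B</name>
      </respStmt>
    </titleStmt>
    <publicationStmt><publisher>Sample Publisher</publisher></publicationStmt>
    <sourceDesc><bibl>Invented source</bibl></sourceDesc>
  </fileDesc>
</teiHeader>"""


def responsibilities(header):
    """Map each <resp> label to the list of names responsible for it."""
    result = {}
    for stmt in header.iterfind(".//tei:titleStmt/tei:respStmt", NS):
        label = stmt.findtext("tei:resp", default="", namespaces=NS).strip()
        names = [n.text.strip() for n in stmt.iterfind("tei:name", NS) if n.text]
        result.setdefault(label, []).extend(names)
    return result


header = ET.fromstring(SAMPLE_HEADER)
title = header.findtext(".//tei:titleStmt/tei:title", namespaces=NS)
author = header.findtext(".//tei:titleStmt/tei:author", namespaces=NS)

print(title, "by", author)
print(responsibilities(header))
```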

Micro-structural level

On the micro-structural level, footnotes are represented with the <note> element; its @n attribute supplies the character that references the footnote in the source text. Text deliberately omitted by the editors is marked with the <gap> element. The reason for the omission is indicated in the @reason attribute: either the passage was omitted for editorial reasons, because it was considered not relevant to the research question, or the original copy contains an image that is not reproduced in the digital representation. Text that is typographically distinct from its surroundings in the original copy, such as italics, bold, letter spacing, superscript, and capitalized text, has been retained using the <hi> element to mark highlighted text. The corresponding values of the @rend attribute indicate how the text was presented in the source.
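
The micro-structural elements above can be sketched in Python as follows. The sample paragraph is invented, and the attribute values "italic", "bold", and "editorial" are placeholders: the actual @rend and @reason values are defined in the project's ODD and are not listed here.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical paragraph; the @rend and @reason values are assumptions.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">A <hi rend="italic">stressed</hi> '
    'word<note n="*">A footnote.</note> is followed by an omission '
    '<gap reason="editorial"/> and by <hi rend="bold">more emphasis</hi>.</p>'
)


def plain_text(element):
    """Concatenate all text inside an element and normalize whitespace."""
    return " ".join("".join(element.itertext()).split())


def highlights(root):
    """Return (rend, text) pairs for every <hi> element."""
    return [(hi.get("rend"), plain_text(hi)) for hi in root.iterfind(".//tei:hi", NS)]


def footnotes(root):
    """Return (marker, text) pairs for every <note> element."""
    return [(note.get("n"), plain_text(note)) for note in root.iterfind(".//tei:note", NS)]


def gap_reasons(root):
    """Return the @reason value of every <gap> element."""
    return [gap.get("reason") for gap in root.iterfind(".//tei:gap", NS)]


root = ET.fromstring(SAMPLE)
print(highlights(root))
print(footnotes(root))
print(gap_reasons(root))
```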

The <seg> element is used to mark arbitrary text segments, phrases, or words that represent the liminality of the texts. The @ana attribute on the <seg> element associates the marked text with a specific analysis and thereby adds an interpretative layer.
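
A minimal sketch of how the @ana annotations could be evaluated is given below. In TEI, @ana holds one or more whitespace-separated pointers; the sketch assumes they take the form "#id" and would point to the category definitions in <standOff>. The category identifiers "threshold" and "dusk" and the sample sentence are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical paragraph; the category identifiers are assumptions.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">She paused '
    '<seg ana="#threshold">at the open door</seg>, '
    '<seg ana="#threshold #dusk">neither inside nor outside as the light failed</seg>, '
    'until <seg ana="#dusk">evening came</seg>.</p>'
)


def categories_of(seg):
    """Split @ana into category identifiers without the leading '#'."""
    return [pointer.lstrip("#") for pointer in seg.get("ana", "").split()]


def category_counts(root):
    """Count how many <seg> elements are assigned to each category."""
    counts = Counter()
    for seg in root.iterfind(".//tei:seg", NS):
        counts.update(categories_of(seg))
    return counts


def segments_for(root, category):
    """Return the whitespace-normalized text of all segments in a category."""
    return [
        " ".join("".join(seg.itertext()).split())
        for seg in root.iterfind(".//tei:seg", NS)
        if category in categories_of(seg)
    ]


root = ET.fromstring(SAMPLE)
print(category_counts(root))
print(segments_for(root, "threshold"))
```

Because a segment may carry several pointers, a single passage can be counted under more than one category.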

Underlying technologies

The “Twilight Zones” project is available in the research data repository GAMS (Humanities’ Asset Management System)[1], a digital repository for preserving, managing, and presenting humanities research data. The infrastructure, maintained by the Centre for Information Modelling – Austrian Centre for Digital Humanities, conforms to the Open Archival Information System (OAIS) reference model, which aims at the reliability and long-term availability of digital information. It is based on the open source software FEDORA. Permanent citability of the data is guaranteed by Handles[2], and a citation suggestion is provided for each text in the collection. Because they rely on data standards (TEI, DC, RDF, etc.), the research data, the data models, and the transformation scripts are reusable and compatible with other systems. The digital archive contains the TEI and RDF data as well as the corresponding SPARQL queries. GAMS integrates Cocoon services into the FEDORA repository to present this data and uses project-specific content models for the TEI data.