CoReMA

Cooking Recipes of the Middle Ages

Representation of German Cooking Recipe Collections

The digital edition of the German-language cooking recipe collections aims to provide a hyperdiplomatic transcription of the source texts. This research output is meant to give our own and other disciplines source texts on which further research can be built.

The central aim is to represent the historical source using only characters from the ASCII table.

The characters of the historical source not available in the ASCII table are described with the help of the TEI Character Declaration. In the XML of the sources they are modeled with the <g> element.
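The interplay of declaration and reference can be sketched roughly as follows. The identifier 'bar_e' is the one used in the abbreviation example of this document; the wording of the description is an assumption made for illustration only.

```xml
<!-- In the TEI header: the special character is declared once.
     The <desc> text is an assumed description, not taken from the edition. -->
<charDecl>
  <glyph xml:id="bar_e">
    <desc>abbreviation bar standing for e (assumed description)</desc>
  </glyph>
</charDecl>

<!-- In the transcription: the character is referenced via <g>,
     so the text itself stays within the ASCII range. -->
<am><g ref="#bar_e"/></am>
```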

General Rules

Macro structure

Macrostructuring elements:

  • Page break is modeled with the <pb> element and the attributes @xml:id, @n (page count).
  • Line break is modeled with the <lb> element and the attribute @n (line count).
  • Column break is modeled with the element <cb> and the attribute @n (alphabetical column count).
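A minimal sketch of these three milestone elements in sequence might look like this. All attribute values and the text are placeholders, and restarting the line count in each column is an assumption rather than a documented rule of the edition.

```xml
<!-- Placeholder ids, counts and text; for illustration only. -->
<pb xml:id="ex_p1" n="1"/>
<cb n="a"/>
<lb n="1"/>first line of column a
<lb n="2"/>second line of column a
<cb n="b"/>
<!-- Assumption: the line count starts again in the new column. -->
<lb n="1"/>first line of column b
```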

Micro structure

Microstructuring text elements are modeled:

  • Semantic text units (e.g. recipes) with the <seg> element.
  • Headings with the element <hi>, the attribute @style and the attribute value 'heading'. Non-linear headings are transcribed diplomatically but modeled for transformation into a linear reading text with the element <seg> and an @xml:id attribute.
    The position for linear reading is indicated with the element <anchor> and the attributes @type (value: 'heading') and @corresp referring to the respective @xml:id values (e.g. Bs1, fol. 23r).
  • Initials with the <hi> element, the attribute @style and the value 'initial'.
  • Rubrication with the element <hi>, the attribute @rend and the value 'textColour:RED;'.
  • Vertical rubricated strokes with the symbol | ('Pipe', U+007C) and the element <hi>, the attribute @rend and the value 'textColour:RED;'.
  • Underline with the element <hi>, the attribute @style and the value 'underlined;'.
  • Marginal entries or any other kind of insertion between text lines that are part of the running text are inserted in the text where they belong semantically. They are annotated according to their character (e.g. as addition, cf. below) and commented with the element <note>.
  • Catchwords are indicated with the help of the <note> element and described verbally.
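The rules above could be combined roughly as in the following sketch. The text, the identifier 'ex_heading_2' and the wording of the note are invented placeholders; only the element names, attribute names and attribute values are taken from the list.

```xml
<!-- A recipe as a semantic text unit (placeholder text). -->
<seg>
  <hi style="heading">heading of the recipe</hi>
  <hi style="initial">T</hi>ext of the recipe
  <hi rend="textColour:RED;">|</hi> further text
  <hi style="underlined;">underlined words</hi>
</seg>

<!-- A non-linear heading is transcribed where it stands in the source ... -->
<seg xml:id="ex_heading_2"><hi style="heading">heading in the margin</hi></seg>
<!-- ... and the anchor marks its position in the linear reading text. -->
<anchor type="heading" corresp="#ex_heading_2"/>

<!-- A catchword is described verbally in a note (assumed wording). -->
<note>Catchword at the bottom of the page.</note>
```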

Abbreviations

Abbreviations are modeled in the shortest possible way using the respective elements available: <abbr>, <ex>, <am>.

<abbr>It<ex>e</ex><am><g ref="#bar_e"/></am>m</abbr>

Text revisions

Text revisions are modeled according to TEI P5 Guidelines using the standard elements.

Elements used to model revisions: <del>, <add>, <metamark>.
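A hedged sketch of how these three elements might appear together is given below. The words are placeholders, and the attribute values (@rend, @place, @function, @target) are assumptions based on the TEI P5 Guidelines, not confirmed conventions of this edition.

```xml
<!-- Placeholder text; attribute values follow general TEI P5 usage. -->
<lb n="3"/>text before
<del rend="strikethrough">deleted word</del>
<!-- The scribe's insertion sign, pointing to the added text. -->
<metamark function="insertion" target="#ex_add_1"/>
<add xml:id="ex_add_1" place="above">added word</add>
text after
```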

Editorial reconstruction

Changes by the editors are modeled with the element <supplied>; the attribute @reason provides more detailed information:

  • 'missing': text was not written down (e.g. a missing initial).
  • 'overbinding': text is bound into the fold.
  • 'illegible': text is not readable (e.g. blurred ink, bad handwriting, etc.).
  • 'damaged': carrier material is damaged (e.g. hole/puncture, corner missing, water damage, trimmed folio, missing folio).
  • 'list': omitted text of the source (e.g. text connected through lines or brackets as in Gr1, fol. 87r) is supplied.
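Two of these cases might be encoded roughly as follows. The @reason values are taken from the list above; the supplied letters and the surrounding text are placeholders.

```xml
<!-- An initial that was never written down, supplied by the editors. -->
<supplied reason="missing">I</supplied>tem

<!-- A letter lost to damage of the carrier material (placeholder text). -->
wor<supplied reason="damaged">d</supplied>
```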

Character representation

General guidelines

The text of the source is represented in as much detail as feasible.

Illegible characters are indicated by the element <unclear>. The element can span a width of more than one character.
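As a small sketch with placeholder text, the element can enclose a single character or a longer stretch:

```xml
<!-- One doubtful character inside a word (placeholder text). -->
wo<unclear>r</unclear>d

<!-- A doubtful stretch of several characters. -->
<unclear>whole word</unclear>
```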

During the hyperdiplomatic transcription we build an inventory of the special characters used in the manuscripts. All information on the general characters is collected in the character declaration (<charDecl>). Individual characters typical of a certain scribe are detailed in the script description (<scriptDesc>). In the course of the transcription workflow the characters are primarily used to generate a diplomatic text representation for text correction and manual collation with the digital facsimile. The basis for the description of characters is the de facto community standard of the Medieval Unicode Font Initiative (MUFI).

Characters that have no representation in the MUFI are represented through visually similar characters to allow the generation of the diplomatic collation view. In some cases they may even be characters from a script (as described in the MUFI character recommendation) other than the late medieval bastarda of the cooking recipe collections.

We focus on human-readable TEI-XML. Therefore, high-frequency special characters such as tailed z, round z, and single- or two-storey a are transcribed contrary to the above rules: the less frequent characters are annotated, while the more frequent characters are represented by the respective character of the ASCII code chart. These deviations are recorded in the <scriptDesc>.

The list of special characters is subdivided into alphabetic characters, diacritics, abbreviation signs, and punctuation. The hierarchic structure of the list is aligned to the model for script description of the DigiPal project:

grapheme/letter > character > allograph > idiograph > graph.

In our model the element <char> annotates a 'character'. The element <glyph>, together with the attributes @n and @ana, annotates an 'allograph'.

Individual <glyph> descriptions include the following information:

  • @corresp: the value indicates a hierarchical dependence between allographs and characters, e.g. between the <glyph> 'long s' (LATIN SMALL LETTER LONG S) and the <char> 's' (LATIN SMALL LETTER S); the former is an allograph of the latter.
  • @xml:id: unique identifier; naming conventions are based on the MUFI character recommendation.
  • @n: is used to differentiate abbreviation symbols ('abbr') from other characters.
  • @ana: the attribute values 'abbreviation_mark', 'brevigraph', 'contraction' are used to classify the abbreviation symbols. The values 'dia' and 'pct' indicate diacritics and punctuation.
  • @resp: reference source: the MUFI character recommendation.
  • @source: page reference within the MUFI character recommendation.
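A sketch of how these attributes might be combined is shown below. The identifiers 's', 'slong' and 'bar', as well as the exact form of the @resp and @source values, are assumptions for illustration; the edition's actual declaration may differ.

```xml
<charDecl>
  <!-- The character ... -->
  <char xml:id="s"/>
  <!-- ... and its allograph, pointing back to the character via @corresp.
       @resp and @source values are assumed forms of the MUFI reference. -->
  <glyph xml:id="slong" corresp="#s"
         resp="#MUFI" source="https://bora.uib.no/handle/1956/10699"/>
  <!-- An abbreviation symbol: flagged with @n, classified with @ana. -->
  <glyph xml:id="bar" n="abbr" ana="abbreviation_mark"/>
</charDecl>
```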

The names of the characters follow the Unicode standard.

The element <mapping> holds information on the possible representations of a <char> or <glyph>:

  • With the attribute @type and the value 'normalized', it indicates the character used for the normalized HTML output.
  • With the attribute @type and the value 'transcription', it indicates the proprietary markup used in the initial transcription.
  • With the attribute @type and the value 'unicode_codepoint', together with the attribute @subtype, whose value is the respective Unicode code chart, it names the Unicode code point.
  • With the attribute @type and the value 'encoding', together with the attribute @subtype and the value 'html_entity', it provides the respective HTML entity.
  • With the attribute @type and the value 'encoding', together with the attribute @subtype and the value 'unicode_symbol', it provides the symbol of the Unicode character.
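For the long s, such a set of mappings might look roughly like the following sketch. The identifier 'slong' and the transcription markup '[slong]' are hypothetical; the code point U+017F and its code chart (Latin Extended-A) are taken from the Unicode standard.

```xml
<glyph xml:id="slong" corresp="#s">
  <!-- Character used in the normalized HTML output. -->
  <mapping type="normalized">s</mapping>
  <!-- Hypothetical proprietary markup of the initial transcription. -->
  <mapping type="transcription">[slong]</mapping>
  <!-- Code point, with the Unicode code chart given in @subtype. -->
  <mapping type="unicode_codepoint" subtype="Latin Extended-A">U+017F</mapping>
  <!-- The HTML entity as text, hence the escaped ampersand. -->
  <mapping type="encoding" subtype="html_entity">&amp;#x17F;</mapping>
  <!-- The Unicode symbol itself. -->
  <mapping type="encoding" subtype="unicode_symbol">ſ</mapping>
</glyph>
```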

There are no detailed descriptions for characters and character combinations which are part of the ASCII code chart.

Letters of the alphabet

One-storey a and two-storey a are differentiated. LATIN SMALL LETTER A INSULAR FORM is used to represent the two-storey a.

Name:LATIN SMALL LETTER A
Type:Character
Symbol: a
MUFI:https://bora.uib.no/handle/1956/10699, page 12
Name:LATIN SMALL LETTER A INSULAR FORM
Type:Allograph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 24

i/j with and without a superscript are differentiated. The forms without a superscript (dotless i/j) are annotated.

Name:LATIN SMALL LETTER I
Type:Character
Symbol: i
MUFI:https://bora.uib.no/handle/1956/10699, page 47
Name:LATIN SMALL LETTER DOTLESS I
Type:Allograph
Symbol: ı
MUFI:https://bora.uib.no/handle/1956/10699, page 48
Name:LATIN SMALL LETTER J
Type:Character
Symbol: j
MUFI:https://bora.uib.no/handle/1956/10699, page 52
Name:LATIN SMALL LETTER DOTLESS J
Type:Allograph
Symbol: ȷ
MUFI:https://bora.uib.no/handle/1956/10699, page 52

Straight r and round (rotunda) r are differentiated.

Name:LATIN SMALL LETTER R
Type:Character
Symbol: r
MUFI:https://bora.uib.no/handle/1956/10699, page 82
Name:LATIN SMALL LETTER R ROTUNDA
Type:Allograph
Symbol: ꝛ
MUFI:https://bora.uib.no/handle/1956/10699, page 84

Long s and round s are differentiated.

Name:LATIN SMALL LETTER S
Type:Character
Symbol: s
MUFI:https://bora.uib.no/handle/1956/10699, page 85
Name:LATIN SMALL LETTER LONG S
Type:Allograph
Symbol: ſ
MUFI:https://bora.uib.no/handle/1956/10699, page 89

The ß ligature is used if the individual characters long s and tailed z can no longer be differentiated.

Name:LATIN SMALL LETTER SHARP S
Type:Character
Symbol: ß
MUFI:https://bora.uib.no/handle/1956/10699, page 86

Two forms of 'w' are differentiated.

Name:LATIN SMALL LETTER W
Type:Character
Symbol: w
MUFI:https://bora.uib.no/handle/1956/10699, page 103
Name:LATIN SMALL LETTERS LIGATURE LB FOR W
Type:Allograph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 89

Diacritics

  • Superscript and umlaut signs above <a>, <e>, <o>, <u>, <v>, <w>, <y> are generally transcribed without differentiating their actual shape (e.g. dot, dash, tilde). Where the shapes are differentiated (as is the case with some recipe collections), the respective information is available in the encoding description.
  • If superscripts resemble vowels, they are transcribed according to their shape.

Name:COMBINING CURLY BAR ABOVE
Type:Dia
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 130
Name:COMBINING RING ABOVE
Type:Dia
Symbol: ̊
MUFI:https://bora.uib.no/handle/1956/10699, page 128
Name:COMBINING LATIN SMALL LETTER A
Type:Dia
Symbol: ͣ
MUFI:https://bora.uib.no/handle/1956/10699, page 122
Name:COMBINING LATIN SMALL LETTER E
Type:Dia
Symbol: ͤ
MUFI:https://bora.uib.no/handle/1956/10699, page 123
Name:COMBINING LATIN SMALL LETTER I
Type:Dia
Symbol: ͥ
MUFI:https://bora.uib.no/handle/1956/10699, page 124
Name:COMBINING LATIN SMALL LETTER O
Type:Dia
Symbol: ͦ
MUFI:https://bora.uib.no/handle/1956/10699, page 125
Name:COMBINING LATIN SMALL LETTER U
Type:Dia
Symbol: ͧ
MUFI:https://bora.uib.no/handle/1956/10699, page 126
Name:COMBINING LATIN SMALL LETTER V
Type:Dia
Symbol: ͮ
MUFI:https://bora.uib.no/handle/1956/10699, page 126
Name:LATIN SMALL LETTER A WITH CIRCUMFLEX
Type:Dia
Symbol: â
MUFI:https://bora.uib.no/handle/1956/10699, page 14
Name:LATIN SMALL LETTER E WITH CIRCUMFLEX
Type:Dia
Symbol: ê
MUFI:https://bora.uib.no/handle/1956/10699, page 35
Name:LATIN SMALL LETTER O WITH CIRCUMFLEX
Type:Dia
Symbol: ô
MUFI:https://bora.uib.no/handle/1956/10699, page 71

Abbreviation signs

Abbreviations are modeled throughout.

Brevigraphs

Name:LATIN SMALL LETTER D WITH TAIL
Type:Brevigraph
Symbol: ɖ
MUFI:https://bora.uib.no/handle/1956/10699, page 30
Name:LATIN ABBREVIATION SIGN SPACING BASE-LINE US
Type:Brevigraph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 134
Name:LATIN SMALL LETTER RUM ROTUNDA
Type:Brevigraph
Symbol: ꝝ
MUFI:https://bora.uib.no/handle/1956/10699, page 135
Name:LATIN SMALL LETTER P WITH STROKE THROUGH DESCENDER
Type:Brevigraph
Symbol: ꝑ
MUFI:https://bora.uib.no/handle/1956/10699, page 78
Name:LATIN SMALL LETTER P WITH FLOURISH
Type:Brevigraph
Symbol: ꝓ
MUFI:https://bora.uib.no/handle/1956/10699, page 78
Name:LATIN SMALL LETTER P WITH MACRON
Type:Brevigraph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 79
Name:LATIN SMALL LETTER CON
Type:Brevigraph
Symbol: ꝯ
MUFI:https://bora.uib.no/handle/1956/10699, page 134
Name:LATIN SMALL LETTER LONG S WITH FLOURISH
Type:Brevigraph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 89
Name:LATIN ABBREVIATION SIGN SMALL ET WITH STROKE
Type:Brevigraph
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 133
Name:L B BAR SYMBOL
Type:Brevigraph
Symbol: ℔
MUFI:https://bora.uib.no/handle/1956/10699, page 136
Name:LATIN SMALL LETTER V WITH DIAGONAL STROKE
Type:Brevigraph
Symbol: ꝟ
MUFI:https://bora.uib.no/handle/1956/10699, page 100
Name:PRESCRIPTION TAKE
Type:Brevigraph
Symbol: ℞
MUFI:https://bora.uib.no/handle/1956/10699, page 83
Name:LATIN ABBREVIATION SIGN EST
Type:Brevigraph
Symbol: ∻
MUFI:https://bora.uib.no/handle/1956/10699, page 134
Name:AMPERSAND
Type:Brevigraph
Symbol: &c
MUFI:https://bora.uib.no/handle/1956/10699, page 131

General abbreviation signs

Name:COMBINING OVERLINE
Type:Abbreviation mark
Symbol: ̅
MUFI:https://bora.uib.no/handle/1956/10699, page 129
Name:COMBINING COMMA ABOVE RIGHT
Type:Abbreviation mark
Symbol: ̕
MUFI:https://bora.uib.no/handle/1956/10699, page 130
Name:COMBINING CURLY BAR ABOVE
Type:Abbreviation mark
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 129
Name:LATIN SMALL LETTER IS
Type:Abbreviation mark
Symbol: ꝭ
MUFI:https://bora.uib.no/handle/1956/10699, page 135
Name:LATIN SMALL LETTER ET
Type:Abbreviation mark
Symbol: ꝫ
MUFI:https://bora.uib.no/handle/1956/10699, page 135
Name:LATIN SMALL LETTER L WITH BAR
Type:Abbreviation mark
Symbol: ƚ
MUFI:https://bora.uib.no/handle/1956/10699, page 58
Name:LATIN SMALL LETTER LL WITH BAR
Type:Abbreviation mark
Symbol: ƚƚ
MUFI:https://bora.uib.no/handle/1956/10699, page 58
Name:COMBINING ABBREVIATION MARK SUPERSCRIPT UR ROUND R FORM
Type:Abbreviation mark
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 131

Contractions

Name:GERMAN ABBREVIATION DAZ
Type:Contraction
Symbol: dʒ
MUFI:http://bit.ly/Schneider_Handschriftenkunde, page 90
Name:GERMAN ABBREVIATION WAZ
Type:Contraction
Symbol: wʒ
MUFI:http://bit.ly/Schneider_Handschriftenkunde, page 90

Punctuation

Name:DOUBLE OBLIQUE HYPHEN
Type:Pct
Symbol: ⸗
MUFI:https://bora.uib.no/handle/1956/10699, page 142
Name:THREE DOTS WITH COMMA POSITURA
Type:Pct
Symbol: 
MUFI:https://bora.uib.no/handle/1956/10699, page 148
Name:GEORGIAN PARAGRAPH SEPARATOR
Type:Pct
Symbol: ჻
MUFI:https://bora.uib.no/handle/1956/10699, page 139
Name:PILCROW SIGN
Type:Pct
Symbol: ¶
MUFI:https://bora.uib.no/handle/1956/10699, page 155
Name:PUNCTUATION MARK DOUBLE SOLIDUS
Type:Pct
Symbol: ⫽
MUFI:https://bora.uib.no/handle/1956/10699, page 140
Name:FRACTION SLASH
Type:Pct
Symbol: ⁄
MUFI:https://bora.uib.no/handle/1956/10699, page 140
Name:VERTICAL LINE
Type:Pct
Symbol: |
MUFI:https://bora.uib.no/handle/1956/10699, page 140

Digital presentation

The hyperdiplomatic transcription can be downloaded as TEI-XML for the individual collections.

In the synoptic presentation the source text is slightly normalized:

  • Capital and small letters are differentiated.
  • Dotless i/j and i/j with a dot as superscript are not differentiated.
  • Diacritics are displayed.
  • Vowels used as superscripts are displayed.

Synopsis: This is a horizontal text / image synopsis where the transcription is aligned with the digital facsimile. This view can be used to simply scroll through the text.

Full-Width: This is a vertical text / image synopsis where the transcription is aligned with the digital facsimile. In this view the buttons of the image viewer have to be used to jump from one page to the next.

The macro structure of the source is represented: page and line breaks are displayed but column breaks are resolved into linear text according to the reading order.

Structuring text elements are displayed: headings, initials, and rubrication (except vertical strokes and underlining).

Abbreviations are expanded. The expanded text is highlighted by italics and a different text color. A mouse-over function provides a detailed explanation.

Text revisions are highlighted (deletion: strike through plus blue text; addition: bold red text).

Uncertain readings are indicated by light grey text color.

Editorial reconstructions or insertions are indicated by square brackets. A mouse-over function provides a detailed explanation.
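The display rules for abbreviations, text revisions, uncertain readings and editorial reconstructions could be expressed, for instance, as XSLT templates. The following is a minimal sketch under stated assumptions, not the stylesheet of the edition: the HTML elements, the colour chosen for expansions and the use of the title attribute for the mouse-over text are all illustrative choices.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Expanded text: italics and a different colour (colour assumed). -->
  <xsl:template match="tei:ex">
    <i style="color: green;"><xsl:apply-templates/></i>
  </xsl:template>

  <!-- Abbreviation markers are dropped in the expanded view. -->
  <xsl:template match="tei:am"/>

  <!-- Deletion: strike through plus blue text. -->
  <xsl:template match="tei:del">
    <span style="text-decoration: line-through; color: blue;">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <!-- Addition: bold red text. -->
  <xsl:template match="tei:add">
    <span style="font-weight: bold; color: red;"><xsl:apply-templates/></span>
  </xsl:template>

  <!-- Uncertain reading: light grey text. -->
  <xsl:template match="tei:unclear">
    <span style="color: lightgrey;"><xsl:apply-templates/></span>
  </xsl:template>

  <!-- Editorial reconstruction: square brackets, @reason as mouse-over. -->
  <xsl:template match="tei:supplied">
    <span title="{@reason}">[<xsl:apply-templates/>]</span>
  </xsl:template>

</xsl:stylesheet>
```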

A pointing hand indicates peculiar passages. A mouse-over function provides a detailed explanation.

How to cite

H.W. Klug & A. Böhm (2020). Editorial Declaration. In H.W. Klug (Ed.), CoReMA - Cooking Recipes of the Middle Ages. Corpus - Analysis - Visualisation. With A. Böhm and C. Steiner. hdl.handle.net/11471/562.10.3 (GAMS. 562.10.3)