DEPCHA - Digital Edition Publishing Cooperative for Historical Accounts

Alpha-Version

How to Semantic Bookkeep

TEI/XML

The following snippet of a historical accounting book (image and transcription) is used as an example to illustrate one possible way of annotating a digital edition using TEI/XML and its connection to the Bookkeeping Ontology.

[Figure: GAMS Workflow]

Here you see the transcription of the given example:

Stagville August 1st 1808
69 James Haley Self Dr.
1/4 lb Powder 2/6 1 lb Shot 2/6 1 lb Sugar 2/6 | 7 | 6
1 Shoe Knife 1/6 | 1 | 6
 | 9 | -

This structure in the source describes two bk:Transactions together with their textual representations (bk:Entry). Whenever something in this tutorial begins with the prefix bk:, it is a concept defined in the Bookkeeping Ontology. The prefix can be expanded to http://gams.uni-graz.at/o:depcha.bookkeeping#, for instance via a TEI prefix definition (tei:prefixDef).

  • 1/4 lb Powder 2/6 1 lb Shot 2/6 1 lb Sugar 2/6 | 7 | 6
  • 1 Shoe Knife 1/6 | 1 | 6
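
The expansion of the bk: prefix mentioned above could be declared in the TEI header roughly as follows. This is a minimal sketch and not a confirmed DEPCHA configuration: the placement inside encodingDesc and the matchPattern are assumptions, and only the namespace URI is taken from this tutorial.

```xml
<encodingDesc>
  <listPrefixDef>
    <!-- Assumption: every value of the form bk:NAME expands to the ontology URI plus NAME -->
    <prefixDef ident="bk"
               matchPattern="([A-Za-z]+)"
               replacementPattern="http://gams.uni-graz.at/o:depcha.bookkeeping#$1">
      <p>Values using the bk: prefix refer to concepts of the Bookkeeping Ontology.</p>
    </prefixDef>
  </listPrefixDef>
</encodingDesc>
```

With such a declaration, a value like bk:entry would resolve to http://gams.uni-graz.at/o:depcha.bookkeeping#entry.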

Looking at the first bk:Entry, we can summarize its interpretation as follows:
The person James Haley buys 1/4 lb Powder for the price of 2/6, 1 lb Shot for the price of 2/6 and finally 1 lb Sugar for the price of 2/6 from Stagville. Stagville is a farm (plantation). James Haley therefore transfers the monetary value of 7 shillings and 6 pence to Stagville. Some information is not explicit inside a bk:Entry; it can be found in the header or is known to the domain expert of the source.
In this case a table seems to describe the text structure best, and the following TEI is one example of that:


<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader omitted -->
  <text>
    <body>
      <head>Stagville August 1st 1808</head>
      <table>
        <head>69 James Haley J Self Dr.</head>
        <row>
          <cell>1/4 lb Powder 2/6 1 lb Shot 2/6 1 lb Sugar 2/6</cell>
          <cell/>
          <cell>7</cell>
          <cell>6</cell>
        </row>
        <row>
          <cell>1 Shoe Knife 1/6</cell>
          <cell/>
          <cell>1</cell>
          <cell>6</cell>
        </row>
        <row>
          <cell/>
          <cell/>
          <cell>9</cell>
          <cell>-</cell>
        </row>
      </table>
    </body>
  </text>
</TEI>

You can use the full expressiveness that TEI offers to describe text. For the Bookkeeping Ontology, only some annotations using the TEI attribute ana are relevant, along with some attributes for formalizing data. We will now look at these in more detail. The starting point is to define a container for a bk:Transaction by annotating the element that holds the textual representation of a bk:Entry, such as div, row or span (or any other element that fits your data and the TEI guidelines). Using @ana with the value "bk:entry", a TEI structure is defined as a bk:Entry according to the Bookkeeping Ontology.


<row ana="bk:entry">
	<cell>1/4 lb Powder 2/6 1 lb Shot 2/6 1 lb Sugar 2/6</cell>
	<cell/>
	<cell>7</cell>
	<cell>6</cell>
</row>

Commodities, Services and Monetary Values

Inside the row element (bk:Entry) we can now define further concepts which describe a bk:Transaction. A bk:Transaction consists of at least one bk:Transfer. A bk:Transfer defines the transfer of a bk:Commodity, bk:Service or bk:MonetaryValue bk:from one bk:Between (an abstract class for groups, persons, organizations, accounts etc.) bk:to another. We first deal with the question "what is transferred?". In this example three different bk:Commodity items are mentioned:

  • 1/4 lb Powder
  • 1 lb Shot
  • 1 lb Sugar

As well as two bk:MonetaryValue:

  • 7 shilling
  • 6 pence

Three further bk:MonetaryValues can be found, which are interpreted as bk:Price. Each bk:Price belongs to the bk:Commodity named before it: 1/4 lb Powder for 2 shillings and 6 pence. Note that 1 shilling (s) = 12 pence (d), so the three prices of 2s 6d (30d each) add up to 90d, i.e. the 7s 6d recorded in the entry.

The measure element in TEI allows you to annotate countable entities using the attributes @unit, @commodity and @quantity. We can thus annotate them and normalize the data through these attributes. To link a bk:Commodity with its bk:Price, we put both inside one measure and mark the bk:Price explicitly with an additional nested measure element.


<row ana="bk:entry">
	<cell>
		<measure commodity="powder" quantity="0.25" unit="lb">
			1/4 lb Powder 
			<measure quantity="2" unit="shilling">2</measure>
			/
			<measure quantity="6" unit="pence">6</measure>
		</measure>
		<measure commodity="shot" quantity="1" unit="lb">
			1 lb Shot 
			<measure quantity="2" unit="shilling">2</measure>
			/
			<measure quantity="6" unit="pence">6</measure>
		</measure>
		<measure commodity="sugar" quantity="1" unit="lb">
			1 lb Sugar 
			<measure quantity="2" unit="shilling">2</measure>
			/
			<measure quantity="6" unit="pence">6</measure>
		</measure>
	</cell>
	<cell/>
	<cell><measure>7</measure></cell>
	<cell><measure>6</measure></cell>
</row>

Now we link the measure elements to the Bookkeeping Ontology using @ana. We want to state that powder, shot and sugar are commodities, each with a price of 2 shillings and 6 pence. Therefore we add ana="bk:commodity", ana="bk:money" and ana="bk:price" to the elements.


<row ana="bk:entry">
	<cell>
		<measure ana="bk:commodity" commodity="powder" quantity="0.25" unit="lb">
		  1/4 lb Powder 
		  <measure ana="bk:price" quantity="2" unit="shilling">2</measure>
		  /
		  <measure ana="bk:price" quantity="6" unit="pence">6</measure>
		</measure>
		<measure ana="bk:commodity" commodity="shot" quantity="1" unit="lb">
		  1 lb Shot 
		  <measure ana="bk:price" quantity="2" unit="shilling">2</measure>
		  /
		  <measure ana="bk:price" quantity="6" unit="pence">6</measure>
		</measure>
		<measure ana="bk:commodity" commodity="sugar" quantity="1" unit="lb">
		  1 lb Sugar 
		  <measure ana="bk:price" quantity="2" unit="shilling">2</measure>
		  /
		  <measure ana="bk:price" quantity="6" unit="pence">6</measure>
		</measure>
	</cell>
	<cell/>
	<cell><measure ana="bk:money" quantity="7" unit="shilling">7</measure></cell>
	<cell><measure ana="bk:money" quantity="6" unit="pence">6</measure></cell>
</row>

This bk:Transaction now consists of two bk:Transfers: one transfers powder, shot and sugar, the other transfers 7 shillings and 6 pence.

Flow of Measurable: from - to

In the next step we want to define the direction in which money and goods flow, which can be summarized as follows:

  • powder, sugar, shot from Stagville to James Haley
  • 7 shilling, 6 pence from James Haley to Stagville

To express this we use ana="bk:from" and ana="bk:to". This kind of annotation can be added to any TEI structure you use for annotating agents in your document, such as name, persName or orgName. An agent can also be an unknown group of people, as in "4 farmers pay a 5 $ tax". For such cases you can use an element like span with an additional ana="bk:group". Since the RDF representation needs URIs, we have to assign a unique ID to every agent (even to an unknown group) using @ref.
In addition, we have to make the direction of the transfer explicit by extending the @ana of the measure. Putting bk:commodity, bk:service or bk:money together with bk:from or bk:to in the same @ana attribute indicates the receiver of the respective economic good, as indicated in the close context of the annotation. A transaction recorded in an account of the Stagville company therefore identifies one party to every transaction already in the heading:


<head><name ana="bk:to" ref="#between.1">Stagville</name> August 1st 1808</head>
              

The other parties can occur in each single entry:


<p ana="bk:entry">paid to <name ana="bk:from" ref="#between.2">James Haley</name> for
   <measure ana="bk:service" commodity="labor" unit="day" quantity="3">3 days woodcutting</measure>
   <measure ana="bk:money" quantity="6" unit="dollar">6 d.</measure></p>
              

To make clear which party is receiving the money, the encoding with @ana="bk:money" has to carry the appropriate bk:from / bk:to:


<p ana="bk:entry">paid to <name ana="bk:from" ref="#between.2">James Haley</name> for 
  <measure ana="bk:service bk:to" commodity="labor" unit="day" quantity="3">3 days woodcutting</measure> 
  <measure ana="bk:money bk:from" quantity="6" unit="dollar">6 d.</measure>
</p>
            

The transfer of a monetary value is always encoded explicitly in DEPCHA.

All of this can lead, for instance, to the following encoding, in which the parties in the transaction are both encoded external to the single entry:


<body>
<head><name ana="bk:to" ref="#between.1">Stagville</name> August 1st 1808</head>
<table>
	<head>69 <name ana="bk:from" ref="#between.2">James Haley</name> J Self Dr.</head>
	<row ana="bk:entry">
		<cell>
			<measure ana="bk:commodity bk:from" commodity="powder" quantity="0.25" unit="lb">
				1/4 lb Powder
				<measure ana="bk:price" quantity="2" unit="shilling">2</measure>
				/
				<measure ana="bk:price" quantity="6" unit="pence">6</measure>
			</measure>
			<measure ana="bk:commodity bk:from" commodity="shot" quantity="1" unit="lb">
				1 lb Shot
				<measure ana="bk:price" quantity="2" unit="shilling">2</measure>
				/
				<measure ana="bk:price" quantity="6" unit="pence">6</measure>
			</measure>
			<measure ana="bk:commodity bk:from" commodity="sugar" quantity="1" unit="lb">
				1 lb Sugar
				<measure ana="bk:price" quantity="2" unit="shilling">2</measure>
				/
				<measure ana="bk:price" quantity="6" unit="pence">6</measure>
			</measure>
		</cell>
		<cell/>
		<cell><measure ana="bk:money bk:to" quantity="7" unit="shilling">7</measure></cell>
		<cell><measure ana="bk:money bk:to" quantity="6" unit="pence">6</measure></cell>
	</row>
</table>
</body>
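
The @ref values used in this encoding (#between.1 and #between.2) point to IDs that have to be defined somewhere in the document. One possible way to do this, sketched here under the assumption that the agents are listed in the TEI header inside particDesc (DEPCHA may expect a different location), is:

```xml
<profileDesc>
  <particDesc>
    <!-- Assumption: the xml:id values match the @ref values #between.1 and #between.2 used in the body -->
    <listOrg>
      <org xml:id="between.1">
        <orgName>Stagville</orgName>
      </org>
    </listOrg>
    <listPerson>
      <person xml:id="between.2">
        <persName>James Haley</persName>
      </person>
    </listPerson>
  </particDesc>
</profileDesc>
```

Keeping the agents in one place like this means every entry can refer to the same person or organization through the same ID.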

Date and Debit/Credit

Further information we may want to add to this bk:Transaction is the date and whether it is booked as debit or credit. The TEI date element with @when and ana="bk:when" adds the date of the transaction to the entry. To record a booking on credit or debit, you can use a TEI span with ana="bk:debit" or ana="bk:credit":


<body>
<head>
	<name ana="bk:to" ref="#between.1">Stagville</name> 
	<date ana="bk:when" when="1808-08-01">August 1st 1808</date>
</head>
<table>
<head>
	69 <name ana="bk:from" ref="#between.2">James Haley</name> J Self <span ana="bk:debit">Dr.</span>
</head>
		...

Transaction Status and Agent

Here is another example of a transcription of a bk:Entry in the Wheaton day book.

Monday September 15th 1828
settled Derius Drake D To ax dlvd By Puffer 9/1 By cutting two sticks 1596

Two further questions may be relevant when editing an account: 'what is the transaction status?' and 'are other people involved in the transaction?' The bk:Entry in this example describes that a person named Derius Drake worked for Wheaton (owner of the shop and main author of the day book) and paid off his debt of 1 dollar and 50 cents with this work, "by cutting two sticks". Derius got an axe to do the job. This axe comes from the bk:Agent Puffer.

In this example the status of the transaction is "settled", which means it has been paid. The second aspect is the involvement of a third person. Here the text "By Puffer" can be interpreted as somebody who is carrying out the transaction for somebody else. Adding @ana="bk:agent" annotates the person as a bk:Agent.


<table>
<head>Monday <date ana="bk:when" when="1828-09-15">September 15th 1828</date></head>
<row ana="bk:entry">
	<cell><span ana="bk:status">settled</span></cell>
	<cell>
		<name ana="bk:to" ref="#pers_WCDH276">Derius Drake</name> 
		<span ana="bk:debit">D </span>To ax dlvd By 
		<name ana="bk:agent" ref="#pers_WCDH280">Puffer</name> 9/1
	</cell>
	<cell><measure ana="bk:money bk:to" commodity="currency" quantity="1" unit="dollars">1</measure></cell>
	<cell><measure ana="bk:money bk:to" commodity="currency" quantity="50" unit="cents">50</measure></cell>
	<cell/>
</row>
</table>

Group, Taxes and Places

In the next example - a tax list - we find unknown groups that are taxable in a certain village. The bk:Entry represents that at the place Marczinkowo two craftsmen have to pay a tax in the amount of 3 floren and 14 grosz.

[784r] Parochia Iunivladislaviensis 1565
Marczinkowo de quinque mansis possessis, et duobus artificibus solvit | 3 | 14

The Bookkeeping Ontology allows you to describe these entities by using @ana="bk:where", @ana="bk:tax" and @ana="bk:group". The addition of bk:from in ana="bk:money bk:from" defines that the money goes away from the group and is booked on the account of the main author of the tax list (e.g. the king).


<row ana="bk:entry">
<cell>
	<placeName ref="#village.3" xml:id="SID.4" ana="bk:where">Marczinkowo</placeName>,
	de <name ana="bk:from bk:tax" ref="#C000004">quinque mansis possessis</name>,
	et <span ana="bk:from bk:group">duobus artificibus</span>
	solvit
</cell>
<cell>
	<measure ana="bk:money bk:from" quantity="3" unit="floren">3</measure>
</cell>
<cell>
	<measure ana="bk:money bk:from" quantity="14" unit="grosz">14</measure>
</cell>
</row>
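
The earlier prose example of an unknown group, "4 farmers pay a 5 $ tax", could be encoded along the same lines. This is only a sketch under assumptions: the ID #group.1 is hypothetical, rs is chosen here instead of span because it can carry @ref, and the word "tax" is left unannotated because this tutorial does not specify where bk:tax should go in such a sentence.

```xml
<p ana="bk:entry">
  <!-- Hypothetical ID #group.1; rs is used instead of span because it allows @ref -->
  <rs type="group" ana="bk:from bk:group" ref="#group.1">4 farmers</rs>
  pay a
  <measure ana="bk:money bk:from" quantity="5" unit="dollar">5 $</measure>
  tax
</p>
```

As in the tax list above, bk:from on the measure mirrors the bk:from on the group, so that the money is read as leaving the group.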

XML-Schema

Here is a very restrictive Bookkeeping Ontology XML Schema based on all currently ingested data. You can associate this schema with your document in the Oxygen editor, and the editor will then offer you elements and annotations according to the ontology.

Linked Open Data

DEPCHA tries to offer its data in the LOD cloud. You can therefore add Wikidata concepts directly into the attributes, or you can define your vocabulary in the TEI header in the form of a taxonomy. The following TEI example and figure illustrate this.


<body>
<head><name ana="bk:to" ref="#between.1">Stagville</name> August 1st 1808</head>
<table>
	<head>69 <name ana="bk:from" ref="#between.2">James Haley</name> J Self Dr.</head>
	<row ana="bk:entry">
		<cell>
			<measure ana="bk:commodity bk:from" commodity="wiki:Q11002" quantity="1" unit="wiki:Q100995">
				1 lb Sugar
				<measure ana="bk:price" quantity="2" unit="wiki:Q213142">2</measure>
				/
				<measure ana="bk:price" quantity="6" unit="wiki:Q234129">6</measure>
			</measure>
		</cell>
		<cell/>
		<cell><measure ana="bk:money bk:to" quantity="7" unit="wiki:Q213142">7</measure></cell>
		<cell><measure ana="bk:money bk:to" quantity="6" unit="wiki:Q234129">6</measure></cell>
	</row>
</table>
</body>

[Figure: GAMS Workflow]

In this example the editor has defined their own concepts in the TEI header and links to them.
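
A vocabulary defined in the TEI header could look roughly like the following. This is a hedged sketch, not the encoding DEPCHA prescribes: the IDs commodities and c.sugar are hypothetical, and both the use of @ref on term to carry the Wikidata concept and the pointer "#c.sugar" in @commodity are assumptions about how the link might be made.

```xml
<encodingDesc>
  <classDecl>
    <taxonomy xml:id="commodities">
      <category xml:id="c.sugar">
        <!-- Assumption: the Wikidata link is carried by @ref on term -->
        <catDesc><term ref="wiki:Q11002">Sugar</term></catDesc>
      </category>
    </taxonomy>
  </classDecl>
</encodingDesc>

<!-- In the body, pointing at the locally defined concept instead of Wikidata directly -->
<measure ana="bk:commodity bk:from" commodity="#c.sugar" quantity="1" unit="lb">1 lb Sugar</measure>
```

The advantage of the indirection is that the mapping to Wikidata is maintained once, in the header, rather than in every measure.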

CSV

Having data in CSV or Excel is a way to work with the data (and not the transcription). In DEPCHA you have to transform your CSV into simple TEI/XML for standardisation purposes. You can use this XSLT to do so.
Your CSV has to look like this example. Check your Excel export against it and use a CSV validator.

The '|' is used to separate multiple values inside a cell.

RDF

The last option is to directly create RDF that complies with the Bookkeeping Ontology. You can use Protégé to check the consistency of your RDF with the ontology.

How to Ingest Data into GAMS

To start a new ingest process, choose 'File' and 'Ingest objects'. The dialogue box allows you to configure mass ingest scenarios for cirilo:TEI objects. Choose the TEI content model type and click either 'From filesystem' (which opens the directory selection dialogue) or 'From eXist' (which allows you to connect to an eXist database). All files from the selected directory (including all sub-directories) or database will be imported. Files are validated upon ingest, i.e. if the source document is not valid, the ingest process fails and an error message is shown. If you use the ingest function, it is highly recommended to specify your PIDs in the source documents using the element <idno type="PID">. Please note that if two documents specify the same PID, the first one ingested will be updated with the content of the second one; this holds for all content models except the cirilo:Ontology model for RDF. If more than one RDF document specifies the same PID value in the @rdf:resource of the <void:Dataset>, the content of the second one will be added to the first one and will not replace it!
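
A PID specified in the source document might look like the following. This is a minimal sketch: the PID value o:depcha.example.1 and the publisher name are hypothetical placeholders, and the placement inside publicationStmt is an assumption about where the ingest looks for the element.

```xml
<publicationStmt>
  <publisher>Hypothetical Publisher</publisher>
  <!-- Hypothetical PID; use the identifier assigned to your own object -->
  <idno type="PID">o:depcha.example.1</idno>
</publicationStmt>
```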

The DC metadata specified in the ingest dialogue box will be copied to all ingested objects. A log file with details on the process is provided by the client; either click 'Show log' or open it from the ingest source directory in your filesystem, where it is saved. The ingest of objects can also be simulated. This option is used to detect possible errors in the log files before actually starting huge mass ingest operations. If you want to run a simulation, tick the corresponding box.
