Corpus structure


The Diachronic Electronic Corpus of Tyneside English (DECTE) is an XML document that conforms to the P5 version of the Text Encoding Initiative (TEI) Guidelines (see the note on TEI conformance).

This part of the DECTE website describes the structure of the corpus; familiarity with XML and TEI is assumed throughout. References to 'TEI Guidelines' in what follows are to the online TEI P5 Guidelines, and, unless otherwise indicated, quotations are from the specified sections of these guidelines.

To be TEI-conformant, an XML document has to validate against a schema that is consistent with the published TEI Guidelines (TEI Guidelines, 23.4.2), which implies that the document uses only the TEI-defined XML tag set and tag syntax. The DECTE schema is written as an XML Document Type Definition (DTD), and the corpus has been validated against it using the oXygen XML editor.
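As a rough illustration of the first step any such validation involves, the following Python sketch checks that a document is well-formed XML and that its root element is one of the two TEI P5 root elements (TEI or teiCorpus) in the TEI namespace. It is not a substitute for DTD validation, which the Python standard library does not perform; the function names and the sample document are hypothetical and are not part of DECTE.

```python
import xml.etree.ElementTree as ET

# Namespace and root element names defined by TEI P5.
TEI_NS = "http://www.tei-c.org/ns/1.0"
TEI_ROOTS = {"TEI", "teiCorpus"}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


def check_tei_root(xml_text):
    """Return (ok, message) for a basic well-formedness and root-element check.

    This is only a preliminary check (hypothetical helper); it does not
    validate the document against a DTD or any other schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, f"not well-formed: {exc}"
    ns, local = split_tag(root.tag)
    if local not in TEI_ROOTS:
        return False, f"unexpected root element: {local}"
    if ns != TEI_NS:
        return False, f"root element is not in the TEI namespace: {ns or '(none)'}"
    return True, f"well-formed, root element <{local}>"


# Hypothetical minimal document, used only to exercise the check.
SAMPLE = """<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader/>
</teiCorpus>"""

if __name__ == "__main__":
    print(check_tei_root(SAMPLE))
```

To apply the same check to a file, read its contents into a string and pass that string to check_tei_root. Full conformance checking still requires a validating parser or an editor such as oXygen.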

The file decte.xml is the main DECTE file. It specifies the structure of the corpus, and has the following three components: