THE NEWCASTLE ELECTRONIC CORPUS OF TYNESIDE ENGLISH

Documentation: global header

The global header <teiHeader> describes a TEI-encoded document ‘so that the text itself, its source, its encoding, and its revisions are all thoroughly documented’ (Guidelines 5). The header has four main parts: <fileDesc> is compulsory and the other three are optional. The NECTE document includes all four.

<teiHeader>

<fileDesc></fileDesc>

<encodingDesc></encodingDesc>

<profileDesc></profileDesc>

<revisionDesc></revisionDesc>

</teiHeader>

where:

1. <fileDesc> gives ‘a full bibliographical description of the computer file itself, from which a user of the text could derive a proper bibliographic citation, or which a librarian or archivist could use in creating a catalogue entry recording its presence within a library or archive’ (Guidelines 5). It contains the following subnodes, each of which itself contains subnodes:

a) <titleStmt>, which ‘groups information about the title of a work and those responsible for its intellectual content’ (Guidelines 5.2.1):

  • <title> the title of the corpus

  • <author> a list of those responsible for NECTE’s construction

  • <funder> the body or bodies that funded construction of the corpus

  • <principal> the name of the principal investigator

  • a series of <respStmt> nodes, which describe the responsibilities of each person significantly involved in the construction of the corpus

b) <publicationStmt>, which ‘groups information concerning the publication or distribution of an electronic or other text’ (Guidelines 5.2.4):

  • <publisher>

  • <distributor>

  • <authority> details of who controls access to NECTE and how to obtain it

  • <availability> a list of user categories that, in the view of the NECTE team, have a legitimate interest in the corpus

c) <sourceDesc>, which ‘is used to record details of the source or sources from which a computer file was derived or generated’ (Guidelines 5.2.7):

  • <recordingStmt> ‘describes a set of recordings used in transcription of a spoken text’

  • <recording> is ‘used to provide a description of how and by whom a recording was made’

  • <equipment> gives ‘descriptive information related to the kind of recording equipment used’
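As an illustration of how these subnodes fit together, a <fileDesc> might be laid out roughly as follows. This is a minimal sketch and not the actual NECTE header: every bracketed value is a placeholder, and the nesting of <equipment> inside <recording> inside <recordingStmt>, the <resp> and <name> children of <respStmt>, and the <p> wrappers are assumptions based on the general TEI content models rather than on anything stated above.

```xml
<fileDesc>
  <titleStmt>
    <title>The Newcastle Electronic Corpus of Tyneside English</title>
    <author>[names of those responsible for constructing NECTE]</author>
    <funder>[funding body]</funder>
    <principal>[principal investigator]</principal>
    <!-- one respStmt per significantly involved person (placeholder values) -->
    <respStmt>
      <resp>[responsibility]</resp>
      <name>[contributor]</name>
    </respStmt>
  </titleStmt>
  <publicationStmt>
    <publisher>[publisher]</publisher>
    <distributor>[distributor]</distributor>
    <authority>[who controls access and how to obtain the corpus]</authority>
    <availability>
      <p>[user categories with a legitimate interest in the corpus]</p>
    </availability>
  </publicationStmt>
  <sourceDesc>
    <recordingStmt>
      <!-- assumed nesting: recordingStmt > recording > equipment -->
      <recording>
        <equipment>
          <p>[kind of recording equipment used]</p>
        </equipment>
      </recording>
    </recordingStmt>
  </sourceDesc>
</fileDesc>
```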

2. <encodingDesc> ‘describes the relationship between an electronic text and its source or sources’ (Guidelines 5); ‘It specifies the methods and editorial principles which governed the transcription or encoding of the text in hand and may also include sets of coded definitions used by other components of the header’ (Guidelines 5.3), and contains the following subnodes:

  • <projectDesc> ‘may be used to describe, in prose, the purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected’

  • <samplingDecl> ‘contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection’

  • <editorialDecl> ‘provides details of editorial principles and practices applied during the encoding of a text’
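A skeletal <encodingDesc> with these three subnodes might look like the sketch below. The bracketed text is placeholder prose, not NECTE's own wording, and wrapping each description in <p> is an assumption about how the prose is encoded.

```xml
<encodingDesc>
  <projectDesc>
    <p>[purpose for which the file was encoded and how it was assembled]</p>
  </projectDesc>
  <samplingDecl>
    <p>[rationale and methods used in sampling texts for the corpus]</p>
  </samplingDecl>
  <editorialDecl>
    <p>[editorial principles and practices applied during encoding]</p>
  </editorialDecl>
</encodingDesc>
```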

3. <profileDesc> contains ‘classificatory and contextual information about the text, such as its subject matter, the situation in which it was produced, the individuals described by or participating in producing it, and so forth’ (Guidelines 5). It contains the following subnodes (Guidelines 5.4):

  • <particDesc> ‘describes the identifiable speakers, voices, or other participants in a linguistic interaction’

  • <settingDesc> ‘describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements’
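In outline, a <profileDesc> of this kind could be sketched as follows. The content is placeholder prose only. The Guidelines text quoted above allows a setting to be given either as prose or as a series of setting elements; the prose form in <p> is assumed here for both subnodes.

```xml
<profileDesc>
  <particDesc>
    <p>[description of the identifiable speakers or other participants]</p>
  </particDesc>
  <settingDesc>
    <p>[description of the setting in which the interaction takes place]</p>
  </settingDesc>
</profileDesc>
```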

4. <revisionDesc> ‘allows the encoder to provide a history of changes made during the development of the electronic text’ and ‘provides a detailed change log in which each change made to a text may be recorded’ (Guidelines 5). It contains the following subnodes (Guidelines 5.5):

  • <change> ‘summarizes a particular change or correction made to a particular version of an electronic text’

  • <date> ‘contains a date in any format’

  • <respStmt> ‘supplies a statement of responsibility for someone responsible for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors etc do not suffice or do not apply’.
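A single change-log entry might then be sketched as below. All values are placeholders. The nesting of <date> and <respStmt> inside <change>, the <name> and <resp> children, and the <item> element holding the description of the change are assumptions about the TEI content model of the period, not details taken from the NECTE header itself.

```xml
<revisionDesc>
  <!-- one change element per recorded revision (placeholder values) -->
  <change>
    <date>[date of the change]</date>
    <respStmt>
      <name>[person who made the change]</name>
      <resp>[nature of their responsibility]</resp>
    </respStmt>
    <item>[summary of the change or correction]</item>
  </change>
</revisionDesc>
```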