Documentation: document instance
Chapter 23 of the TEI Guidelines deals with language corpora and is
therefore the foundation on which the structure of the NECTE document
instance is built. There, a corpus is regarded as a ‘composite’ text, as
opposed to a ‘unitary’ text like a novel (Guidelines 7), and consists of
a header followed by a sequence of TEI-conformant XML texts. The NECTE
corpus document file necte.xml correspondingly contains both the header
and the text sequence, that is, the sequence of interviews comprising
the NECTE corpus.
<teiCorpus.2>

<teiHeader type='corpus'>
</teiHeader>

&tlsg01;   &tlsg22;   &tlsn06;
&tlsg02;   &tlsg23;   &tlsn07;
&tlsg03;   &tlsg24;   &pvc01;
&tlsg04;   &tlsg25;   &pvc02;
&tlsg05;   &tlsg26;   &pvc03;
&tlsg06;   &tlsg27;   &pvc04;
&tlsg07;   &tlsg28;   &pvc05;
&tlsg08;   &tlsg29;   &pvc06;
&tlsg09;   &tlsg30;   &pvc07;
&tlsg10;   &tlsg31;   &pvc08;
&tlsg11;   &tlsg32;   &pvc09;
&tlsg12;   &tlsg33;   &pvc10;
&tlsg13;   &tlsg34;   &pvc11;
&tlsg14;   &tlsg35;   &pvc12;
&tlsg15;   &tlsg36;   &pvc13;
&tlsg16;   &tlsg37;   &pvc14;
&tlsg17;   &tlsn01;   &pvc15;
&tlsg18;   &tlsn02;   &pvc16;
&tlsg19;   &tlsn03;   &pvc17;
&tlsg20;   &tlsn04;   &pvc18;
&tlsg21;   &tlsn05;

</teiCorpus.2>

The interview sequence is not lexically present in necte.xml.
Rather, necte.xml contains a list of references to entities declared via
<!ENTITY % interviews SYSTEM 'interviews.ent'> %interviews; in the
DOCTYPE declaration. Each such entity reference denotes an XML file in
the file system that contains a single interview; an XML processor
resolves the reference and processes the interview file as if it were
lexically present in necte.xml. The motivation for using entity
references in this way is to make the corpus modular and thus more
manageable than a single large file. The list of interview entity
references is shown in three columns to make this page more compact.
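
The entity mechanism described above can be sketched as follows. This is only an illustration, not the actual NECTE files: the DTD file name tei2.dtd, the individual interview file names, and the contents shown for interviews.ent are assumptions. Only the parameter entity declaration and the entity names &tlsg01; and &pvc18; are taken from this page.

```xml
<?xml version="1.0"?>
<!-- Sketch of necte.xml. The DTD file name 'tei2.dtd' is an assumption. -->
<!DOCTYPE teiCorpus.2 SYSTEM 'tei2.dtd' [
  <!-- Parameter entity pointing at the file of interview entity declarations -->
  <!ENTITY % interviews SYSTEM 'interviews.ent'>
  <!-- Referencing it here pulls those declarations into the DOCTYPE -->
  %interviews;
]>
<teiCorpus.2>
  <teiHeader type='corpus'>
    <!-- corpus-level header content goes here -->
  </teiHeader>
  <!-- Each reference is replaced by the content of one interview file -->
  &tlsg01;
  <!-- remaining interview entity references omitted in this sketch -->
  &pvc18;
</teiCorpus.2>
```

```xml
<!-- Hypothetical contents of interviews.ent: one external entity
     declaration per interview. The file names are assumed. -->
<!ENTITY tlsg01 SYSTEM 'tlsg01.xml'>
<!ENTITY pvc18 SYSTEM 'pvc18.xml'>
```

Under this arrangement, adding or removing an interview would mean changing one declaration in interviews.ent and one reference in necte.xml, without touching the other interview files. Because each interview file is included as an external parsed entity, it cannot carry a DOCTYPE declaration of its own.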
- <teiCorpus.2> denotes a TEI-conformant corpus.
- <teiHeader> contains information that applies to all the constituent
  interviews of the corpus, and is in this sense global.
- Each entity reference in the sequence &tlsg01; to &pvc18; denotes a
  single constituent interview of the corpus.