DECTE overview


DECTE is an amalgamation of an updated version of NECTE (the Newcastle Electronic Corpus of Tyneside English) and the subsequent NECTE2 collection.

NECTE, which was created between 2001 and 2005, digitized and combined sociolinguistic interviews from two earlier projects carried out at Newcastle University: the Tyneside Linguistic Survey (TLS) of the 1960s-1970s and the Phonological Variation and Change in Contemporary Spoken English (PVC) project of the 1990s.

NECTE2 extends the corpus into the present with further sets of interviews that have been collected annually since 2007 by students in the School of English Literature, Language, and Linguistics. DECTE thereby constitutes a rare example of a publicly available online corpus presenting dialect material spanning five decades.

The figure below illustrates how these components interrelate.

In total, DECTE currently contains 99 interviews with 160 speakers, comprising 804,266 words of text and 71 hours, 45 minutes, and 43 seconds of audio.

Tables 1 and 2 summarize the number of interviews and informants contained in each of DECTE's three subcorpora and the corpus as a whole, and indicate how the speakers in each interview set break down by speaker sex and age.

Table 1. Summary of DECTE Interviews and Informants

              DECTE Total   TLS (1960s-1970s)   PVC (1990s)   NECTE2 (2007-present)
Interviews    99            37                  18            44
Informants    160           37                  35            88
Female        87            20                  18            49
Male          73            17                  17            39
Age 16-20     55            2                   19            34
Age 21-30     33            9                   0             24
Age 31-40     14            10                  0             4
Age 41-50     18            8                   4             6
Age 51-60     17            2                   5             10
Age 61-70     14            5                   6             3
Age 71-80     4             1                   1             2
Age 81-90     5             0                   0             5

Table 2. DECTE Informants by Age and Speaker Sex

         DECTE Total         TLS (1960s-1970s)   PVC (1990s)         NECTE2 (2007-present)
Age      Female   Male       Female   Male       Female   Male       Female   Male
16-20    36       19         2        0          10       9          24       10
21-30    16       17         4        5          0        0          12       12
31-40    7        7          6        4          0        0          1        3
41-50    11       7          5        3          2        2          4        2
51-60    9        8          2        0          4        1          3        7
61-70    2        12         0        5          2        4          0        3
71-80    1        3          1        0          0        1          0        2
81-90    5        0          0        0          0        0          5        0
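
The two tables can be cross-checked against each other. The following is a minimal Python sketch, not part of the DECTE distribution, that copies the per-subcorpus counts from Table 2 and confirms that they add up to the informant totals in Table 1. All names in it (BY_AGE_AND_SEX, INFORMANTS, check_tables, age_totals) are illustrative assumptions.

```python
AGE_BANDS = ["16-20", "21-30", "31-40", "41-50", "51-60", "61-70", "71-80", "81-90"]

# (female, male) counts per age band, copied from Table 2.
BY_AGE_AND_SEX = {
    "TLS":    [(2, 0), (4, 5), (6, 4), (5, 3), (2, 0), (0, 5), (1, 0), (0, 0)],
    "PVC":    [(10, 9), (0, 0), (0, 0), (2, 2), (4, 1), (2, 4), (0, 1), (0, 0)],
    "NECTE2": [(24, 10), (12, 12), (1, 3), (4, 2), (3, 7), (0, 3), (0, 2), (5, 0)],
}

# Informant totals per subcorpus, copied from Table 1.
INFORMANTS = {"TLS": 37, "PVC": 35, "NECTE2": 88}


def check_tables():
    """Check each subcorpus against Table 1; return overall (female, male) totals."""
    female_all = male_all = 0
    for name, rows in BY_AGE_AND_SEX.items():
        female = sum(f for f, _ in rows)
        male = sum(m for _, m in rows)
        # The Table 2 cells for a subcorpus should sum to its Table 1 informant count.
        assert female + male == INFORMANTS[name], name
        female_all += female
        male_all += male
    return female_all, male_all


def age_totals():
    """Return the number of informants in each age band across all subcorpora."""
    totals = {}
    for i, band in enumerate(AGE_BANDS):
        totals[band] = sum(sum(rows[i]) for rows in BY_AGE_AND_SEX.values())
    return totals


if __name__ == "__main__":
    print(check_tables())  # (87, 73), matching the Female and Male rows of Table 1
    print(age_totals())
```

The overall female and male totals (87 and 73) and the per-band totals agree with the DECTE Total column of Table 1.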

DECTE is formatted in Text Encoding Initiative (TEI) conformant XML, using the current P5 TEI Guidelines.

TEI is one of several current document encoding standards, alongside others such as XCES and CLARIN, that aim to make digital language resources interoperable with one another and with application software.

We do, however, recognize that not all users will require the functionality offered by TEI, and that the extensive markup it entails can be an obstacle to such users. Plain-text versions of the DECTE files are therefore provided; see the corpus files page for further details.
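
To give a rough sense of what working with TEI-encoded transcription involves, here is a minimal Python sketch that reads utterances from a small TEI-style fragment and reduces them to plain text. The fragment is hypothetical and deliberately incomplete (it has no teiHeader, for example); it is not taken from a DECTE file, and the actual element structure of the corpus files may differ. The sketch assumes only the standard TEI namespace and the TEI P5 `u` (utterance) element with its `who` attribute; the speaker identifiers and the function name `utterances` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; the "tei" prefix is just a local alias for XPath queries.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment for illustration only; not taken from a DECTE file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <u who="#interviewer">where did you grow up</u>
      <u who="#informantA">just round the corner <pause/> near the river</u>
    </body>
  </text>
</TEI>"""


def utterances(xml_text):
    """Return (speaker, plain_text) pairs for every <u> element in the input."""
    root = ET.fromstring(xml_text)
    result = []
    for u in root.iterfind(".//tei:u", TEI_NS):
        # itertext() yields the text inside and after nested elements,
        # so inline markup such as <pause/> is dropped from the output.
        text = " ".join("".join(u.itertext()).split())
        result.append((u.get("who"), text))
    return result


if __name__ == "__main__":
    for speaker, text in utterances(SAMPLE):
        print(speaker, text)
```

Users who need only the words themselves can skip this step entirely by using the plain-text versions of the files.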