Appendix 4: Compliance Statement of NECTE (The Newcastle Electronic Corpus of Tyneside English) with the United Kingdom Data Protection Act 1998

NECTE members Will Allen (post-doctoral research associate), Joan Beal (co-investigator), Karen Corrigan (principal investigator), Warren Maguire (post-doctoral research associate), Hermann Moisl (co-investigator), and Charley Rowe (post-doctoral research associate) have reviewed the University of Newcastle upon Tyne's advice to staff on the Data Protection Act 1998 (DPA), and conducted their work on the NECTE project accordingly.  In what follows, we address those areas of the DPA that have particular relevance to our project’s use of data.


  • Data controllers: Joan Beal, Karen Corrigan, Hermann Moisl

  • Data processors: Will Allen, Warren Maguire, Charley Rowe


In 1969 and 1994, two sets of interviews were conducted for dialect research purposes by former academic and graduate researchers at the University of Newcastle upon Tyne. The NECTE project seeks to enhance these corpora by combining them and making the resultant amalgamated corpus accessible via the World Wide Web, thus extending its usability. These materials are further described elsewhere on this website.

Compliance with the DPA

In the 1969 study, subjects:

(1)  Were informed orally and in writing about the purpose of the research (dialect preservation and study).

(2) Were aware that they were being recorded on tape, and consented to the recording process.

The written agreement given to data subjects in 1969 was signed by Barbara Strang, Professor of English Language and General Linguistics, University of Newcastle upon Tyne (1969). It states that: “The results of the [1969] survey will in due course be published, but no resident who has helped by talking in this way will be referred to in such a way that they could be identified”.

In the 1994 study, subjects: 

(1) Were informed orally about the purpose of the research (dialect preservation and study).

(2) Were aware that they were being recorded on tape, and consented to the recording process.

After consent was given, the interviewer gave the subjects explicit information about the recording process, switched on the recording equipment, which was prominently placed before the subjects, and allowed the subjects to converse freely among themselves while their speech was recorded on tape.

Interpretation and compliance with the DPA

Part I of the DPA defines personal data as those data that “relate to a living individual who can be identified from those data”. Schedule 1, Part I, Principle 1 of the DPA stipulates that personal data may be processed only if “(a) at least one of the conditions in Schedule 2 is met, and (b) in the case of sensitive personal data, at least one of the conditions in Schedule 3 is also met.”

We have met condition 1 of Schedule 2, i.e., the data subjects had full knowledge (presented orally and in writing) that the data would be processed for linguistic research. We have also met conditions 2 and 6 of Schedule 2, namely: processing is necessary for compliance with Arts and Humanities Research Council (AHRC) guidelines as set out in their Resource Enhancement grant awarded to NECTE (grant number RE/AN6422/APN11776) (condition 2, Schedule 2); and processing in this case serves the “legitimate interests” of the data controller and related third parties, i.e., the NECTE team (condition 6, Schedule 2).

We have additionally met condition 2 of Schedule 3 in full (“the processing is necessary for the purposes of exercising or performing any right or obligation conferred on the data controller”). The AHRC grant awarded to the data controllers Karen Corrigan, Joan Beal, and Hermann Moisl provides that the data must be processed for the sake of preservation. We have, therefore, also met condition 3 of Schedule 3 in full. Our data are processed “in the course of legitimate activities” of NECTE; the processing is not conducted for profit and exists only for philosophical purposes. Moreover, it is “carried out with appropriate safeguards” (as described below), since the processing “relates only to” NECTE and its regular associates, and “does not involve disclosure of the personal data” to third parties, where the personal data positively identifies the data subjects.

In accordance with Principle 2, personal data were collected expressly for dialect research and preservation purposes, and continue to be used solely for these purposes. In accordance with Principle 3, personal data are adequate, relevant, and not excessive for dialectological research purposes. In accordance with Principle 4, personal data are accurate and, where necessary, up to date. In accordance with Principle 5, personal data are kept indefinitely only because of exemption (3) in Part IV (the publication of data is “in the public interest” as identified by the AHRC). In accordance with Principle 6, personal data are processed in accordance with the rights of data subjects under the DPA. In accordance with Principle 7, appropriate technical and organizational measures are taken against unauthorized processing and against destruction of personal data. The NECTE website holds no positively identifying personal data, and access to the anonymous data it holds is restricted by password. Original data are stored in a safe, on two password-restricted computers, and on a computer in a locked archive with access restricted to the NECTE research team and legitimate associated scholars. In accordance with Principle 8, personal data are not transferred outside these repositories except where allowed by the DPA.

Part II of the DPA addresses rights of data subjects. NECTE is prepared to supply data subjects with detailed information about its use of their personal data should they make a written request for this information.

Part III of the DPA is relevant to NECTE if the data controllers Karen Corrigan, Joan Beal, and Hermann Moisl are required to give notification to the Information Commissioner.    

Part IV of the DPA states that “personal data which are processed only for the special purposes are exempt from any provision to which this subsection relates if the data controller reasonably believes that, having regard in particular to the special importance of the public interest in freedom of expression, publication would be in the public interest.” Because NECTE is a resource enhancement project for preservation purposes, NECTE and the AHRC regard the data as being in the public interest. Moreover, we are in compliance with section 33, which provides that we may process data for research purposes if “the data are not processed in such a way that substantial damage or substantial distress is, or is likely to be, caused to any data subject.” Because subjects cannot be identified in the published record, no damage or distress can befall the data subjects as a result of the published record. Some of the data subjects of our study are still living, and they can be identified only in connection with a single “id” data file, which is maintained separately from other data files and is not made public. This “id” file serves only to match data correctly for data management purposes; it is not used in connection with research, publishing, or public distribution. The “id” file is accessed only by the three Data Controllers and the three Data Processors identified above.

Finally, because NECTE uses the data for research purposes, it may “be kept indefinitely.”