Programme

The planned programme is below. 

Participants are required to bring and use their own laptops. Some of the training works better if you can also install software, but most of it is web-based or has a web-based substitute. Where software must be installed, the trainers will advise on the day.

Each day runs from 9:00 to 17:00, with catered breaks and lunch. Some days conclude with a guest speaker.


Monday 23 May 2022 -- An Introduction to Programming Fundamentals Using Python

On the first day, we will get down to basics: from what an algorithm is to functional programming, covering data types, loops and conditions. We will do this by introducing you to the basics of Python, a well-known and widely used computer language, and demonstrating its potential for research in the humanities. We will also be available to troubleshoot any problems with getting Python to work on your machine.

  • Trainers: Jannetta Steyn, Tiago Sousa Garcia
  • Software: None. The session runs in the browser via Jupyter Notebooks, although we will help participants set up Python on their machines if desired
  • Guest Speaker: None.
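
As a rough flavour of the fundamentals covered on this day (data types, loops, conditions and functions), a short Python sketch might look like the following. The sample data and the function names are illustrative assumptions, not workshop material.

```python
# Illustrative sample data (an assumption for this sketch):
# a list of (title, year of publication) pairs.
works = [
    ("Frankenstein", 1818),
    ("Middlemarch", 1871),
    ("Mrs Dalloway", 1925),
]


def century_of(year):
    """Return the century a year falls in, e.g. 1818 -> 19."""
    return (year - 1) // 100 + 1


def works_by_century(items):
    """Group titles by century using a loop and a condition."""
    grouped = {}
    for title, year in items:
        century = century_of(year)
        # Create the list for this century the first time we see it.
        if century not in grouped:
            grouped[century] = []
        grouped[century].append(title)
    return grouped


print(works_by_century(works))
```

The example uses a list, tuples, integers, strings and a dictionary, which are among the data types a first session on Python typically introduces.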

Tuesday 24 May 2022 -- An Introduction to TEI Publisher

TEI Publisher is an open source, eXist-db based application that uses the TEI Processing Model and literate programming principles to help editors bridge the gap between encoded TEI XML sources and the published digital edition. It makes it possible to create standalone digital editions out of the box, with a focus on standardization, sustainability and data exchange.

The workshop will include an introduction to TEI Publisher, the basic concepts of the TEI Processing Model, and generating and customizing applications based on TEI Publisher. Note: This is not an introductory workshop on the TEI Guidelines.

  • Trainers: Magdalena Turska, Wolfgang Meier
  • Software: Workshop participants are encouraged to run TEI Publisher on their machines via Docker. Instructions on how to install Docker can be found in the documentation. For those who don't wish to install anything on their computers, remote server access will be arranged for the duration of the workshop.
  • Guest Speaker: James Cummings, School of English Literature, Language, and Linguistics, Newcastle University - Using TEI Publisher for a student capstone dissertation module

Wednesday 25 May 2022 -- Transkribus for Handwritten Text Recognition

This workshop will introduce participants to Transkribus, the predominant user-facing handwritten text recognition (HTR) platform and, since its release, a popular tool for making historical documents more readable and accessible. Transkribus currently has over 1,800 regular users, representing 80 institutions, and is regularly used in crowdsourcing projects on a range of collections, bringing a degree of automation to historical transcription work. The workshop will begin by providing background context to HTR, explaining how it relates to past technologies and innovations in transcription work. For the bulk of the session, participants will segment, edit and transcribe pages from the National Library of Scotland’s 19th-century recipe book collections, before being shown how to train an HTR model. The session will close by showing participants how Transkribus is being used in research and what outputs automated transcription methods now make possible, notably keyword spotting tools and the automated production of scholarly editions of texts. Direction and support will be given throughout; Transkribus’s main functions require no prior experience or coding knowledge.

  • Trainers: Joseph Nockels
  • Software: Most of the session will use the web-based tool, though participants would benefit from downloading the Transkribus 'expert' client 
  • Guest Speaker: Alexandra Healey, Special Collections, Newcastle University - Experimenting with HTR in Special Collections and Archives

Thursday 26 May 2022 -- Data Science/Wrangling and Text Analysis

An old saying holds that data science is 99% data wrangling, and that is precisely the focus of today. In the first half of the day, we will introduce you to the basic principles of data wrangling and teach you how to go from messy to clean data with ease, using Open Refine, a free, open source tool for working with messy data. In the second half, we will discuss how to go from text to data, and how that move can help bring new insight into all sorts of literary or historical texts.

  • Trainers: Jannetta Steyn, Tiago Sousa Garcia
  • Software: Open Refine -- Download from https://openrefine.org/download.html  
  • Guest Speaker: Tiago Sousa Garcia, Research Software Engineering, Newcastle University - The Creativity Engine
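
To give a sense of what going "from text to data" can mean in practice, here is a minimal Python sketch that turns a sentence into a table of word counts. It is an illustrative assumption rather than the method taught in the session; the function name and the sample sentence are chosen for this example only.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lower-case the text, split it into words, and count each word."""
    # A deliberately simple tokeniser (an assumption for this sketch):
    # runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "It was the best of times, it was the worst of times."
counts = word_frequencies(sample)

# Print the three most frequent words with their counts.
for word, count in counts.most_common(3):
    print(word, count)
```

Once a text is reduced to counts like these, it can be sorted, compared across documents, or cleaned further with the same wrangling techniques used on any other dataset.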

Friday 27 May 2022 -- An Introduction to AI and Machine Learning

In this workshop we will talk about what Artificial Intelligence is, what it can do for you, and why you should care about it. We will discuss its conceptual framework and some of the differing approaches of various AI methods (though by no means all). We will then see what working with AI and machine learning means in practice by, for example, training a model to categorise images or using AI to write text based on a prompt.

  • Trainers: Jannetta Steyn, Tiago Sousa Garcia
  • Software: Web-based, but it would be useful to have a Google account, as we will use Google Colab
  • Guest Speaker: Paul Watson, The National Innovation Centre for Data - The top 5 mistakes made when trying to exploit AI ... and how to avoid them