Introduction to Stylometry Workshop

Newcastle University, 11-12 April 2019, 9am - 5pm

This two-day workshop will introduce participants to the field of stylometry. An introductory lecture will present the main tenets, methods, achievements (and failures) of the field, together with examples of research in authorship attribution and distant reading. Much of the work will focus on the stylometric signal in translation. In the hands-on workshop that follows, participants will get to know stylo, a package for the statistical programming environment R co-written by the instructor. The package is designed to spare users R’s steep learning curve, so that humanists can easily perform advanced quantitative analyses of texts. While stylo has its own built-in visualization tools, the second part of the workshop will also introduce Gephi, a network analysis program. Finally, participants will be challenged to perform their first analyses, either on their own collections of texts or on those provided for them. No programming skills are required!


Registration and Venue

The workshop is provided free of charge thanks to the Newcastle University Humanities Research Institute (NUHRI). In order to attend the workshop you must register using this form:  

The venue will be the Percy Building at Newcastle University. No food or refreshments are provided, but we can direct you to nearby places on and off campus for breaks and lunch.



Participants are required to bring their own laptops (with the rights needed to install software). If you do not have a laptop and wish to borrow one during the workshop, please indicate this on the registration form (we have a very small number of these). If you want to get a head start before the workshop, you may download and install R and Gephi (and check that they work correctly on your computer).

  1. R:

  2. (Mac/OS X users only): XQuartz:

  3. Gephi:

  4. Download sample text collection for first analysis:!AjWxtkrEXCa7hPVFz9Aw1AKGsNrvkA

  5. Download a slideshow of more detailed instructions that will be used during the workshop:!AjWxtkrEXCa7h9wUQH8pOSaU1sYErA


If you plan to try out the new methods on your own texts, these should be in plain text (.txt) format, UTF-8 encoded. Preferably, the file names should follow the pattern author_title_date.txt (keep the underscores). It makes sense to bring texts by at least five authors, with at least two texts each (anything from a short story to a novel or a complete play). And yes, stylo does work with non-Latin alphabets (as long as they’re UTF-8-encoded)!
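For those comfortable running a script, the recommendations above can be checked automatically before the workshop. The following is only a minimal Python sketch, not part of stylo or of the workshop materials. The function name check_corpus, the folder name in the usage note and the exact reading of the author_title_date.txt pattern (three parts separated by underscores, none containing an underscore itself) are assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed reading of the suggested pattern: author_title_date.txt,
# i.e. exactly three underscore-separated parts followed by ".txt".
NAME_PATTERN = re.compile(r"^(?P<author>[^_]+)_(?P<title>[^_]+)_(?P<date>[^_]+)\.txt$")


def check_corpus(folder, min_authors=5, min_texts=2):
    """Return a list of human-readable problems found in a corpus folder.

    check_corpus is a hypothetical helper; the default thresholds mirror
    the "at least five authors, at least two texts each" suggestion.
    """
    problems = []
    texts_by_author = defaultdict(list)

    for path in sorted(Path(folder).glob("*.txt")):
        match = NAME_PATTERN.match(path.name)
        if match is None:
            problems.append(f"{path.name}: name does not follow author_title_date.txt")
            continue
        try:
            # Decoding the raw bytes fails if the file is not valid UTF-8.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            problems.append(f"{path.name}: not valid UTF-8")
            continue
        texts_by_author[match["author"]].append(path.name)

    if len(texts_by_author) < min_authors:
        problems.append(
            f"only {len(texts_by_author)} author(s) found, {min_authors} suggested"
        )
    for author, files in sorted(texts_by_author.items()):
        if len(files) < min_texts:
            problems.append(
                f"{author}: only {len(files)} text(s), {min_texts} suggested"
            )
    return problems
```

Calling check_corpus("corpus") on a folder named corpus (a placeholder name) returns an empty list when every file passes. Files that fail the name or encoding check are reported and left out of the per-author counts.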


Suggested further reading

Rybicki, J., Eder, M., Hoover, D. "Computational Stylistics and Text Analysis." In Doing Digital Humanities. Practice, Training, Research. Eds Crompton, C., Lane, R.J., Siemens, R. Oxford: Routledge, 2016, 123-144. A preprint version can be sent to participants on request.

A number of preprint versions of stylometric papers are available at  The following might be of particular interest:


About the instructor

Jan Rybicki is Assistant Professor at the Jagiellonian University in Kraków, Poland. He has written extensively on the application of quantitative methods in the study of literature, tracing the stylometric signals of authors, translators, genres and genders in literary texts in several languages. Together with Maciej Eder and Mike Kestemont, he is a co-author of the “stylo” package for R, which has become a well-known tool of stylometric analysis. He is also an active literary translator, having translated some 30 novels from English into Polish by authors such as John le Carré, Kazuo Ishiguro and William Golding.

