Course Description

Download a preliminary schedule for the summer school here (PDF, 33KB, last updated 19 July 2019). 

DAY 1: Digital Bootcamp

Local team (University of Antwerp)

The first day of the summer school will take the form of an informal bootcamp that introduces participants to the basic skills underlying the more advanced technologies covered during the rest of the week. The tutorial sessions will cover general topics, such as working with a command line interface (bash), basic data types (lists, integers, dictionaries, etc.) and file formats (txt, csv, json), which we will introduce using Jupyter Notebooks for the accessible Python programming language. A targeted session will be dedicated to XML, in particular TEI-XML and ALTO-XML, as these are central standards for image processing in relation to text analysis. Finally, we will cover the basics of modern web serving technologies, including high-level concepts such as HTTP. The teaching team will be a local delegation of Antwerp's ACDC group, which has ample tutoring experience. The day will be wrapped up with a keynote on IIIF by Dries Moreels that will put the workshops of the next two days in context.
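To give a flavour of the level of the bootcamp, the sketch below shows how the data types and file formats mentioned above fit together in Python. It is only an illustration and not taken from the actual course notebooks; the records (titles and years) are invented, and the names `records`, `as_json` and `as_csv` are assumptions made for this example.

```python
import csv
import io
import json

# Hypothetical records: a list of dictionaries mixing strings and integers.
records = [
    {"title": "Letter A", "year": 1850},
    {"title": "Letter B", "year": 1862},
]

# Serialise the list to a JSON string and parse it back again.
as_json = json.dumps(records, indent=2)
round_trip = json.loads(as_json)

# Write the same records as CSV into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "year"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

The same records survive the JSON round trip unchanged, whereas the CSV version stores every value as text, which is one of the practical differences between the two formats.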

DAY 2: Introducing IIIF

Dries Moreels (Ghent University)

[Abstract forthcoming]

DAY 3: Intermediary IIIF

Dries Moreels (Ghent University)

[Abstract forthcoming]

DAY 4: Computer Vision

Mike Kestemont (UAntwerpen) & Thomas Smits (KB The Netherlands)

After two days of learning how to distribute and host images using IIIF, this fourth day will teach participants more about how these images can be used for further research. Several scholars have observed that Digital Humanities research mainly focuses on the analysis of text (Champion, 2017; Meeks, 2013). Consequently, visual sources, such as illustrations, cartoons, and photographs, have often been overlooked. This omission can, to a large extent, be explained by the lack of suitable techniques to study visual material computationally. However, in recent years, computational methods have become available that can enable scholars to analyze large digitized visual datasets in innovative ways. 

The workshop ‘Computer Vision’, which highlights some of these methods, consists of four parts. First, we will give an introduction to Computer Vision using neural networks. Second, we will present how we applied neural networks as researchers-in-residence at the National Library of the Netherlands. These presentations are followed by two hands-on sessions, during which we will use Jupyter Notebooks to work with several Computer Vision techniques, such as convolutions, face detection and generation, and object classification. We will end the day with a discussion on possible projects that could benefit from computer vision techniques.
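As an informal illustration of one of the techniques named above, the sketch below applies a convolution to a tiny grayscale image in plain Python. It is not the workshop material; the function name `convolve2d`, the image values and the kernel are assumptions chosen for this example. Like most neural network libraries, it slides the kernel over the image without flipping it (strictly speaking a cross-correlation), with no padding and a stride of 1.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# A 1x2 kernel that responds to a jump in brightness from left to right.
kernel = [[-1, 1]]

edges = convolve2d(image, kernel)
print(edges)
```

The output is non-zero only in the column where the brightness changes, which is the intuition behind using convolutions to detect edges and other local patterns in images.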

DAY 5: Handwritten Text Recognition – Transkribus

READ consortium members from Innsbruck, Rostock, and Valencia 

Finally, on the last day of the summer school, an international team of instructors will teach participants how to work with a specific application of computer vision technologies: Transkribus, a comprehensive platform for the automated recognition, transcription and searching of historical documents. Transkribus is the flagship project of the READ consortium, which has developed a large network of researchers working on the Recognition and Enrichment of Archival Documents. Their platform offers a wide range of users (scholars, archives, volunteers, computer scientists) a free workspace for transcribing handwritten documents with the help of state-of-the-art Handwritten Text Recognition (HTR) algorithms.

After an introduction to (and hands-on training session with) the transcription platform itself, READ partners from the University of Rostock and the University of Valencia will demonstrate tools that they have developed for integration into the Transkribus platform to make the transcription process even more efficient: respectively, tools for a) mapping pre-existing transcriptions onto images to jump-start the transcription algorithm; and b) keyword spotting, a new and powerful search tool that lets users search for distinct word shapes in their document collection to facilitate transcription. The day will end with a closing keynote by Veronica Romero Gomez (University of Valencia) on human-computer interaction in Digital Humanities with a focus on image processing, which will use CATTI – a tool for interactive-predictive automated text recognition – as a case study.

Daily Schedule

09:30-11:00: Session 1
11:00-11:30: Coffee break A
11:30-13:00: Session 2
13:00-14:00: Lunch
14:00-15:30: Session 3
15:30-16:00: Coffee break B
16:00-17:00: Session 4 / Keynote slot