Understanding ideological bias through data-driven methods: testing cognitive social learning processes through intersectional analysis of past data (c.1800-c.1940). 01/01/2021 - 31/12/2024

Abstract

Ideological bias concerning age, gender, ethnicity and social class is a major ethical concern in contemporary society, influencing human behaviour at both macro and micro levels. Recent studies have demonstrated that machine learning methods (from artificial intelligence) not only capture, but also amplify the ideological biases in the data they are trained on. In this project, we aim to strategically turn this undesirable property to our advantage and exploit it to study ideological biases in the visual cultures of the nineteenth and early twentieth centuries (c.1800-c.1940). Recent cognitive studies make clear that ideological biases largely result from processes of social learning. To study the construction and dissemination of ideological bias, we put forth three case studies in crucial areas of social control: education (children's literature), mass communication (magic lantern slides and performances), and regulation (police reports). These interlinked areas of study come with a wealth of rights-free digitized material and pre-existing scholarship. Through the application of standard routines from machine learning, we aim to elicit implicit patterns and trends relating to ideological bias and confront these with received knowledge. The project is methodologically innovative in that it studies pixel data through computer vision, an approach that has so far received too little attention in the humanities. Moreover, it uses data-driven technology to present a novel intersectional viewpoint on the construction of ideological bias in the past. Finally, by being embedded in recent cognitive studies, the project will be able to make claims about how implicit bias functioned in the past, leading to a better understanding of what people thought and how such thinking structured their behavioural interactions with the surrounding world.

Researcher(s)

Research team(s)

Silent voices: A Digital Study of the Herne Charterhouse as a Textual Community (ca. 1350-1400). 01/01/2020 - 31/12/2023

Abstract

The Carthusian monastery of Herne has had a profound impact on the cultural history of the Low Countries, as a true hotspot in the production, negotiation and dissemination of vernacular literature for lay audiences, at a time when most written texts were still in Latin. In a short time span (ca. 1350-1400), the members of the community collectively copied a fantastic collection of 25+ Middle Dutch and Latin manuscripts, many of which contain unique texts. The Herne monks, who took a monastic oath of silence, were unusually productive and modest scribes, as suggested by the remarkable lack of self-attributions in their material. It is therefore somewhat anachronistic that recent literary scholarship has focused almost exclusively on an elusive search for the identification of specific individuals in the monastery (such as the famous Bible translator of 1360). In this project, we propose to study the charterhouse as a tight textual community, driven by a shared goal. To this end, we will focus on the scribal practice in the monastery, as a privileged gateway into the collaborations between the monks. Using stylochronometry, we will study the evolution of the copying practice of the individual scribes and the convergences therein. Because a significant share of these manuscripts is still inaccessible to the scholarly community, we will apply handwritten text recognition to produce diplomatic transcriptions that scholars can search, analyze and edit further.

Researcher(s)

Research team(s)

Digital Heritage for Smart Regions (Time Machine). Test-case: Herentals and the Kleine Nete. 01/10/2019 - 30/09/2021

Abstract

How can we unlock the Wisdom of the Past to answer spatial challenges today? The Digital Revolution is producing massive amounts of digital and digitized historical and archaeological data, which can be located with different degrees of precision in the landscape. Once integrated in a Geographical Information System (GIS), these data can be turned into a digital 'Time Machine'. In this project, funded by the Province of Antwerp and framed in the scientific collaboration between the Province and the University of Antwerp, we test the potential of Time Machine technologies on the Herentals-Kleine Nete region, more specifically addressing the question of the historical land use and water management of the wetlands along the river Kleine Nete. If successful, the project will result in A) an integrated methodology for the use of digital and digitized data in landscape history and archaeology; B) new insights into the history and evolution of valuable river wetlands; and C) suggestions for the valorization of this knowledge in ecosystem management, tourism, agriculture and landscape development.

Researcher(s)

Research team(s)

CLARIAH-VL: Open Humanities Service Infrastructure. 01/02/2019 - 31/01/2021

Abstract

CLARIAH-VL: Open Humanities Service Infrastructure is the Flemish contribution to the European DARIAH and CLARIN infrastructures. It brings together and extends the portfolio of services enabling digital scholarship in the Arts and Humanities offered by the DARIAH-VL Virtual Research Environment Service Infrastructure (VRE-SI; Hercules & FWO 2015-2018) with the digital tools and language data that are offered through CLARIN-DLU/Flanders. The consortium, which includes the network of Digital Humanities Research Centres at the universities of Antwerp, Brussels, Ghent and Leuven, has been extended with the Dutch Language Institute (INT) – the CLARIN-ERIC certified B-Centre for Flanders. CLARIAH-VL will implement a modular research infrastructure embedding high-quality, user-friendly tools and resources into the workflows of humanities researchers in five focus areas: linguistics; literature; socio-economic history; media studies; and ancient history and archaeology. CLARIAH-VL aims to provide sustainable services, while fostering experimental development and innovation. Offering an open infrastructure which facilitates public humanities is a guiding principle for CLARIAH-VL. It will ensure the accessibility and relevance of the humanities to the general public, specific (heritage) community groups and policy makers. It will make it technically possible to share and co-create knowledge with non-specialist users, for instance by facilitating citizen science and crowdsourcing projects. Furthermore, by implementing international best practices in FAIR (Findability, Accessibility, Interoperability and Reusability) Research Data Management (RDM), CLARIAH-VL will pave the way to Flemish participation in the European Open Science Cloud.

Researcher(s)

Research team(s)

CATCH 2020: Computer-Assisted Transcription of Complex Handwriting. 01/05/2018 - 30/04/2021

Abstract

CATCH 2020 aims to provide a working infrastructure for the computer-assisted transcription of complex handwritten documents. It will do so by building on the existing Transkribus platform for Handwritten Text Recognition (HTR), which allows us to process handwritten textual documents in a way that is similar to how OCR processes printed textual documents. Rather than producing flat transcripts of digital facsimile images, however, CATCH 2020 will produce structured texts, providing tools to add textual and linguistic dimensions to the transcription by combining the state of the art in textual scholarship with the state of the art in computational linguistics.

Researcher(s)

Research team(s)

AI Stories: interactive narratives for hospitalised children. 01/01/2018 - 28/02/2022

Abstract

AI Stories is a language technology project based around the paediatrics wards of two Flemish hospitals. The project intends to push forward the state of the art in natural language generation, in order to implement a system capable of autonomously telling stories and of revealing invaluable psychological data to healthcare staff. In turn, the same technology could be used by companies and sectors outside of healthcare. Such advances will facilitate educational and caring Artificial Intelligence software, and allow companies in those sectors to build and analyse usable systems. The paediatrics ward is an acute example of a socially challenging linguistic context for children, one that also exists more generally. By conducting research there and developing an intelligent system capable of prolonged dialogue that incorporates the feedback of healthcare staff, a robust solution across care industries is achievable.

Researcher(s)

Research team(s)

Strengthening digital research at the UP system: digitization of rare periodicals and training in digital humanities. 01/01/2018 - 31/12/2021

Abstract

This TEAM project funded by VLIR-UOS is a collaboration between the University of Antwerp and the University of the Philippines that combines an exchange of DH expertise and training with a specific digitization project of rare Philippine newspapers and magazines. The University of Antwerp's three project promotors are Mike Kestemont, Dirk Van Hulle, and Rocío Ortuño. The project aims to improve the competitiveness of Philippine Humanities research in a globalized world, including the possibilities of student and professional mobility offered by the ASEAN confluence, by training faculty members and students in the field of Digital Humanities. The first and crucial step towards this objective (1) is the digitization of materials and the creation of a freely accessible environment with user-friendly search facilities. Several periodicals published before World War II are in a precarious state of preservation and, being located in Metro Manila, are not accessible to all universities in the Philippines. By digitizing these periodicals and hosting them in a freely accessible online repository, they could be made available to all peripheral universities and used in DH-related research. Subsequently, (2) training in DH will be provided at different campuses of the University of the Philippines System. This training fits in with the Philippine government's priority of promoting digital literacy among both scholars and the larger public. It also allows the University of the Philippines to participate in the global emergence and collaborative hallmark of DH.

Researcher(s)

Research team(s)

Artificial Hearing: Neural Networks and the Acoustic Identifiability of Children with Cochlear Implants. 01/10/2017 - 30/09/2021

Abstract

Approximately 1 out of 1,000 neonates is diagnosed with a bilateral severe-to-profound hearing loss. Hearing aids, such as cochlear implants (CI), have opened up unprecedented perspectives for these children. Although CIs generally lead to remarkable gains in the spoken language proficiency of hearing-impaired children, their speech remains deviant from normal hearing children's speech, even after several years of device use. Adult speakers are able to discriminate between the speech of CI children and that of normally hearing children. In other words, CI children's speech remains identifiable as the speech of a hearing-impaired individual. Surprisingly, the exact characteristics on which adults base such decisions have so far remained elusive, which makes it difficult for clinicians to fine-tune speech rehabilitation programs. In this project, we aim to exploit recent advances in "Deep" Representation Learning to close in on these characteristics. Recent connectionist models (neural networks) have shown promising performance in modelling raw audio signals, such as recorded speech. Through the careful inspection, visualization and interpretation of such models, we aim to uncover which specific features in the speech of cochlear-implanted children are responsible for the identifiability of their speech production.

Researcher(s)

Research team(s)

Intelligent Neural Systems as InteGrated Heritage Tools (INSIGHT). 15/12/2016 - 31/07/2022

Abstract

The INSIGHT project aims to advance the application of automated algorithms from the field of Artificial Intelligence to support cultural heritage institutions in their effort to keep up with their ongoing annotation initiatives for their expanding digital collections. We will focus on recent advances in Machine Learning, where the application of neural networks (Deep Learning) has recently led to significant breakthroughs, for instance, in the fields of Natural Language Processing and Computer Vision. We will determine how state-of-the-art algorithms can be used to (semi-)automatically catalogue and describe digital objects, especially those for which no, little or incomplete metadata is available. The project focuses on making the digital collections of two federal museum clusters in Brussels, i.e. the Royal Museums of Fine Arts of Belgium and the Royal Museums of Art and History, ready to be exported to Europeana.

Researcher(s)

Research team(s)

Big Data of the Past for the Future of Europe (Time Machine). 01/03/2019 - 29/02/2020

Abstract

Europe urgently needs to restore and intensify its engagement with its past. Time Machine will give Europe the technology to strengthen its identity against globalisation, populism and increased social exclusion, by turning its history and cultural heritage into a living resource for co-creating its future. The Large Scale Research Initiative (LSRI) will develop a large-scale digitisation and computing infrastructure mapping millennia of European historical and geographical evolution, transforming kilometres of archives, large collections from museums and libraries, and geohistorical datasets into a distributed digital information system. To succeed, a series of fundamental breakthroughs are targeted in Artificial Intelligence and ICT, making Europe the leader in the extraction and analysis of Big Data of the Past. Time Machine will drive Social Sciences and Humanities toward larger problems, allowing new interpretative models to be built on a superior scale. It will bring a new era of open access to sources, where past and on-going research are open science. This constant flux of knowledge will have a profound effect on education, encouraging reflection on long trends and sharpening critical thinking, and will act as an economic motor for new professions, services and products, impacting key sectors of European economy, including ICT, creative industries and tourism, the development of Smart Cities and land use. The CSA will develop a full LSRI proposal around the Time Machine vision. Detailed roadmaps will be prepared, organised around science and technology, operational principles and infrastructure, exploitation avenues and framework conditions. 
A dissemination programme aims to further strengthen the rapidly growing ecosystem, currently counting 95 research institutions, most prestigious European cultural heritage associations, large enterprises and innovative SMEs, influential business and civil society associations, and international and national institutional bodies.

Researcher(s)

Research team(s)

Timemachine. 01/10/2017 - 30/09/2020

Abstract

What if you could travel through time as easily as we travel through space? With the Time Machine consortium, we work towards a FET Flagship project to build a large-scale simulator capable of mapping more than 2000 years of European history. This big data of the past, a common resource for the future, will trigger pioneering and momentous cultural, economic and social shifts. Understanding the past is undoubtedly a prerequisite for understanding present-day societal challenges and contributes to more inclusive, innovative and reflective societies. Researchers from all over the world are joining forces within the Time Machine FET Flagship project to reinvigorate the past through one of the most ambitious projects ever on European culture and identity. The fundamental idea of this project is based on Europe's truly unique asset: its long history, its multilingualism and interculturalism.

Researcher(s)

Research team(s)

The measure of Middle Dutch: rhythm and prosody reconstruction for Middle Dutch literature, a data-driven approach. 01/10/2017 - 30/09/2019

Abstract

What does it mean when the rhythm of a literary text is called 'snappy' or 'fluid'? And what are the characteristics of literature that is 'easily engraved on one's mind'? The rhythmical qualities of literature are often described on an intuitive basis, while using vague terms. This is especially true for Middle Dutch literature. The many rhymed texts of our literary history's earliest stages frequently receive labels like these. However, it is often unclear what is actually meant by them. With this research, it is my ambition to provide the highly necessary scientific backing to these intuitive – and therefore potentially biased – statements. Contrary to previous research, I will make use of computational techniques to investigate the rhythmical qualities of Middle Dutch literature. Because these techniques are unprejudiced, subjectivity can be ruled out. As a result, we will achieve a precise and understandable notion of the rhythmicities of literary texts. For the first time ever, we will be able to pinpoint precisely the reasons for certain intuitive observations. Also, by not restricting ourselves to the analysis of individual texts, we will compare the rhythms present in different genres of literature. Without losing ourselves in a jungle of vague impressions, we will therefore be able to put our finger on, for example, the rhythmical differences between the famous texts 'Van den vos Reynaerde' and 'Karel ende Elegast'.

Researcher(s)

Research team(s)

AI Poems: Digital Poetry for Hospitalised Children. 01/01/2017 - 31/12/2017

Abstract

The aim of AI Poems is to develop software and a surrounding service to give hospitalised children access to creative language, and to utilise this knowledge to offer a concise and multilingual program to pediatric wards in Flanders and abroad. Using the latest Natural Language Processing technology, the software will produce poetic and visually appealing text, controlled by a child's input. The research project will test multiple forms of physical input, so that children with disabilities can use the program, but also to make the project creatively and socially appealing to a child. The project is partnered with University Hospital Antwerp (UZA) and University Hospital Leuven (UZLeuven), where research will be conducted in pediatric wards. The project will be advised and supported by partners in Belgium: the Ghent Health Psychology Lab (GHP), Experimental Psychology at the Free University of Brussels (EXTO), and the Computational Linguistics & Psycholinguistics Research Center at the University of Antwerp (CLiPS); and by international partners: Natural Interaction at the University of Madrid, the Creative Language System Group at the University of Dublin, and Apple Computer.

Researcher(s)

Research team(s)

InterStylar: A Stylometric Approach to Intertextuality in 12th century Latin Literature. 01/10/2016 - 30/09/2020

Abstract

In authorship studies, scholars use quantitative techniques from stylometry to attribute anonymous texts to known authors on the basis of writing style. Intertextuality – the phenomenon where authors integrate and/or allude to other texts in their own work – poses an interesting issue here: should all 'intertext', such as citations or allusions, be removed from a text before we can reliably analyze its style? This project challenges the traditional view in stylometry that such 'Fremdkörper' are pure noise and seeks to verify the hypothesis that intertext constitutes a crucial aspect of an author's individual writing style. To this end, we analyze a representative corpus of 12th-century Latin literature, centred on the impressive oeuvre of Bernard of Clairvaux, in which (biblical and other) intertextuality plays a dominant role. This project will employ recent advances in 'deep' representation learning, which allow texts to be modelled from the character level upwards.

Researcher(s)

Research team(s)

GIStorical Antwerp II. The historical city as empirical lab for urban studies using high-resolution social maps. 01/05/2016 - 30/04/2020

Abstract

In a time of rapid urbanization, solid long-term perspectives on the many environmental, social, economic or political challenges of urbanity are urgently needed. Uniting urban history, sociology, environmental studies and digital humanities, GIStorical Antwerp II turns the historical city into a digital lab which provides an answer to this need. For 8 snapshots between 1584 and 1984 it offers dynamic social maps including every household in the entire city of Antwerp. Their construction combines innovative ways of crowd-sourcing with time-efficient spatial and text-mining methodologies (Linear Referencing, Named Entity Recognition). The result is a GIS environment which not only allows a micro-level view of 500 years of urban development, but more importantly allows an immediate spatial and social contextualization of a virtually unlimited number of other datasets, both those realized through 30 years of research on Antwerp and the mass of structured and unstructured digital 'big data'. For both the applicants and the international research community, a completely new type of longitudinal research on urban inequalities – from income over housing quality to pollution – becomes feasible.

Researcher(s)

Research team(s)

Digital text analysis. 01/12/2015 - 30/11/2020

Abstract

In the Humanities, scholars study the products of the human mind, such as language, paintings, music, etc. Texts too are an important research object across many fields in the Humanities. Until recently, most textual analyses in the Humanities were carried out manually by individual experts, via "close reading" or the careful, sustained reading of small sets of works. With the advent of personal computing in recent decades, an increasing number of texts are becoming available in digitized, electronic forms. This allows scholars to analyze much larger text collections via computer programs. "Distant Reading" is nowadays often used to denote such "macro"-forms of digital text analysis in the Humanities. Mining insights from large text collections via Distant Reading proves to be challenging. It therefore does not come as a surprise that, as the size of the datasets increases, research methods often become less sophisticated. In many "Big Data" studies, for instance, scholars do little more than compute word frequencies across texts. If we want computers to become smarter and be able to read texts like humans can (cf. Artificial Intelligence), we therefore need to develop more complex forms of "Artificial Reading". Interestingly, humans always have highly personal interpretations of texts, because they are influenced by their specific background. Many computer programs read texts in a more generic way and ignore the individual background that human readers have. Universities, as well as many large tech companies in the world, such as Facebook or Google, are therefore currently developing smarter computer algorithms, which can automatically learn from stimuli in the outside world, just as humans would. These techniques are known as "Deep Learning" in computer science and have proven to be extremely successful in many real-world applications (e.g. face detection in pictures on social media).
Surprisingly, these techniques have been rarely applied in Humanities research. The broad aim of this project is to transfer these promising "Deep Learning" methods to digital text analysis.

Researcher(s)

Research team(s)

The Measure of Middle Dutch: Rhythm and Prosody Reconstruction for Middle Dutch Literature, A Data-Driven Approach. 01/10/2015 - 30/09/2017

Abstract

What does it mean when the rhythm of a literary text is called 'snappy' or 'fluid'? And what are the characteristics of literature that is 'easily engraved on one's mind'? The rhythmical qualities of literature are often described on an intuitive basis, while using vague terms. This is especially true for Middle Dutch literature. The many rhymed texts of our literary history's earliest stages frequently receive labels like these. However, it is often unclear what is actually meant by them. With this research, it is my ambition to provide the highly necessary scientific backing to these intuitive – and therefore potentially biased – statements. Contrary to previous research, I will make use of computational techniques to investigate the rhythmical qualities of Middle Dutch literature. Because these techniques are unprejudiced, subjectivity can be ruled out. As a result, we will achieve a precise and understandable notion of the rhythmicities of literary texts. For the first time ever, we will be able to pinpoint precisely the reasons for certain intuitive observations. Also, by not restricting ourselves to the analysis of individual texts, we will compare the rhythms present in different genres of literature. Without losing ourselves in a jungle of vague impressions, we will therefore be able to put our finger on, for example, the rhythmical differences between the famous texts 'Van den vos Reynaerde' and 'Karel ende Elegast'.

Researcher(s)

Research team(s)

Periodization in Literary History: A Computational Model of the History of Dutch Literature. 01/10/2015 - 30/11/2015

Abstract

In literary history, scholars commonly divide the temporal series of events which they are discussing into periods (e.g. Romanticism). This process is called periodization and it is considered an important task of historical literary scholarship. In spite of its present-day relevance, periodization remains a surprisingly controversial process: some of the most influential models in literary history are considered a 19th-century inheritance, the validity of which is often questioned nowadays. The objective of this project is to build a computational model of the history of Dutch-language literature in the Low Countries (13th-20th century). This diachronic model will use techniques from computational text analysis ("Distant Reading") to track changes in the stylistic and thematic characteristics of texts. Importantly, this will be a bottom-up model: it will be created in a data-driven manner, instead of setting out from existing (potentially preconceived) hypotheses. This model will be carefully interpreted and compared to the state of the art in traditional literary scholarship. This will allow us to verify and better understand the validity of established periodization models of Dutch literary history. This project will greatly contribute to the ongoing international debate about the integration of traditional, "close reading" methods in literary studies and new, computational methods for "distant reading".

Researcher(s)

Research team(s)

Digital Humanities. 13/11/2013 - 31/12/2014

Abstract

This project represents a research contract awarded by the University of Antwerp. The supervisor provides the Antwerp University research mentioned in the title of the project under the conditions stipulated by the university.

Researcher(s)

Research team(s)

Expanding the Online Froissart, a resource for the study of late-medieval book production. 01/02/2013 - 31/12/2013

Abstract

A lot of scholarly attention has recently been paid to medieval book production. It was complex to produce the voluminous manuscripts that survive from e.g. early fifteenth-century Paris. A major scholarly problem involves the scribes of manuscripts: sometimes up to 20 copyists seem to have contributed to a copy, but their handwritings can be extremely difficult to distinguish. Developing objective methodologies to discriminate between these fellow scribes is therefore an important challenge in medieval studies. In my PhD I have argued for the potential of "stylometric" approaches in this respect. Typically, scribes adopted a highly individual spelling profile so consistently that algorithms are often able to automatically detect a scribe's hand in a previously unseen copy. Such text-based identifications can be achieved using a combination of quantitative techniques from Computational Linguistics and Artificial Intelligence. The Online Froissart is a valuable digital resource in this respect, presenting a machine-readable edition of many early fifteenth-century Parisian manuscripts of the Chroniques by Jean Froissart. This Small Project targets the focused expansion of the Online Froissart, because it is ideal for research into the text-based recognition of late-medieval scribes. In a variety of ways, the present proposal complements my recently started postdoc project, in which linguistic scribal attributions are a major interest.

Researcher(s)

Research team(s)

A medieval Stylome? Exploring the Universal Stylome Hypothesis in medieval prose. 01/10/2012 - 30/09/2015

Abstract

In this project I will further explore the applicability of the Stylome Hypothesis in medieval literature: 1. I will apply computational stylometry to medieval prose. Because so many (anonymous) medieval prose texts survive, stylometric techniques for authorship attribution in prose are highly relevant. The proposed case study targets religious prose (13th/14th century) from Brabant. 2. Throughout medieval Europe, a lot of Latin literature was produced. I propose to extend my research to Latin, via the original case study of the Flemish monks (11th century) who were attracted by English nobility to write Latin biographies.

Researcher(s)

Research team(s)

The end rhyme in Middle Dutch epic literature (ca. 1200-1500): development and relationship to authorship and genres. 01/10/2010 - 30/09/2012

Abstract

Nearly all of Middle Dutch narrative literature (ca. 1200-1500) was written in rhyming couplets, which is why rhyme words are extremely suitable for the comparative study of Middle Dutch epic texts and authors. My research specifically focuses on three aspects: (a) the evolution of rhyme in the vernacular epic poetry of the medieval Low Countries; (b) the usefulness of rhyme words for authorship verification and attribution; (c) the correlation between rhyme word vocabulary and epic subgenres. My methodology is mainly borrowed from literary stylistics, computational stylometry and computational language technology. As such, this project envisages a quantitative study into the stylistic creativity of Middle Dutch epic poets.

Researcher(s)

Research team(s)

The end rhyme in Middle Dutch epic literature (ca. 1200-1500): development and relationship to authorship and genres. 01/10/2008 - 30/09/2010

Abstract

Nearly all of Middle Dutch narrative literature (ca. 1200-1500) was written in rhyming couplets, which is why rhyme words are extremely suitable for the comparative study of Middle Dutch epic texts and authors. My research specifically focuses on three aspects: (a) the evolution of rhyme in the vernacular epic poetry of the medieval Low Countries; (b) the usefulness of rhyme words for authorship verification and attribution; (c) the correlation between rhyme word vocabulary and epic subgenres. My methodology is mainly borrowed from literary stylistics, computational stylometry and computational language technology. As such, this project envisages a quantitative study into the stylistic creativity of Middle Dutch epic poets.

Researcher(s)

Research team(s)