Computational linguistics

Can language understanding, production, learning, and translation be captured in computational models? Computational linguistics research at CLiPS studies computational methods for the representation, acquisition, and use of language knowledge.

We focus on applying statistical and machine learning methods, trained on corpus data, both to explain human language acquisition and processing data and to develop automatic text analysis systems that are accurate, efficient, and robust enough for practical applications. We develop machine learning algorithms suited to the properties of language data (few regularities, many irregularities and exceptions), and new methodologies for simulating these language data.

Our application-oriented research is in the domain of Language Technology: the development of reusable language processing tools that solve concrete problems and make innovative applications possible. The research focus here has been on text analytics: extracting knowledge from unstructured text data, for example in automatic biomedical text analysis. We develop new approaches that combine machine learning and automatic text analysis to solve generic problems in text mining (automatic summarization, question answering, information extraction, smart search, ontology learning, etc.), and we build these generic solutions into prototypes for specific applications.

In Digital Humanities, we contribute to the University of Antwerp Digital Humanities platform with research on computational stylometry (authorship attribution and author profiling from text) and on language technology for the study of old variants of Dutch.