This course information describes how teaching will proceed under pandemic level code yellow and code green.
If the level changes to code orange or code red during the academic year, changes are possible, including to the teaching methods and forms of assessment used.

Text as Data

Course code: 2001FLWDTA
Study domain: Linguistics and Proficiency
Academic year: 2020-2021
Semester: 1st semester
Contact hours: 12
Study load (hours): 84
Contract restrictions: No contract restriction
Language of instruction: English
Exam period: exam in the 1st semester
Lecturer(s): Dirk Van Hulle, Wout Dillen

3. Course contents

In this course, students will familiarize themselves with structured, unstructured, and semistructured data formats. They will learn about the concept of markup and why XML is currently the standard metalanguage for text annotation. They will learn how to write their own TEI-compliant XML, and how existing XML datasets can be queried, transformed, and manipulated using XPath and XQuery. This gives them a first practical introduction to the skills and technologies used in the development of digital scholarly editions.
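To give a rough idea of what querying TEI-encoded XML with XPath can look like, here is a minimal sketch in Python. It is not part of the course material: the sample document, the variable names, and the helper functions `find_person_names` and `names_in_paragraph` are invented for illustration, and Python's standard library supports only a limited subset of XPath. The only fixed element is the TEI namespace, `http://www.tei-c.org/ns/1.0`.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; every TEI element lives in it, so queries must use it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny document invented for this example (not taken from the course).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished teaching example.</p></publicationStmt>
      <sourceDesc><p>Invented for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p n="1">I wrote to <persName>Anna</persName> from <placeName>Antwerp</placeName>.</p>
      <p n="2"><persName>Jan</persName> replied a week later.</p>
    </body>
  </text>
</TEI>"""


def find_person_names(xml_text):
    """Return the text of every <persName> inside the <body> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # ".//" searches all descendants; the "tei:" prefix is resolved via TEI_NS.
    return [el.text for el in root.findall(".//tei:body//tei:persName", TEI_NS)]


def names_in_paragraph(xml_text, number):
    """Return the <persName> values in the paragraph whose @n equals `number`."""
    root = ET.fromstring(xml_text)
    # An attribute predicate narrows the match to a single paragraph.
    path = f".//tei:p[@n='{number}']/tei:persName"
    return [el.text for el in root.findall(path, TEI_NS)]


if __name__ == "__main__":
    print(find_person_names(SAMPLE))
    print(names_in_paragraph(SAMPLE, 2))
```

The same path expressions carry over to dedicated XPath and XQuery processors, which support the full languages rather than the subset available here.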

Course Overview:

  • Class 1: Data Formats and Markup
  • Class 2: Unstructured, Structured, and Semistructured Data
  • Class 3: XML Technologies and the Text Encoding Initiative
  • Class 4: Querying XML Data with XPath
  • Class 5: Manipulating XML Data with XQuery