Manfred Stede is Professor of Applied Computational Linguistics at the University of Potsdam.
The next lecture in the TLA Lecture Series Digital Humanities in Action (formerly known as Nijmegen e-Humanities) will be given on 17 October by Manfred Stede. His talk is titled "Multilevel annotation for discourse research: The ANNIS2 linguistic database".
Who: Manfred Stede, Universität Potsdam, Germany
What: Multilevel annotation for discourse research: The ANNIS2 linguistic database
Where: MPI for Psycholinguistics, room 1.63 (main lecture hall)
When: Wednesday 17th of October, 15:30
While sentence-oriented linguistic investigation has long benefited from the availability of annotated corpora (treebanks), the situation for research on the discourse level is less satisfactory: annotated text corpora are expensive to produce, and there is not yet much agreement on what the appropriate structural descriptions are. A promising framework is the annotation of text on multiple “simple” levels, each of which can be independently motivated and verified with agreement studies. Multilevel annotations then allow for systematically exploring correlations between the different levels in two ways: (i) data mining that looks for patterns in a “bottom-up” way, and (ii) query-directed (“top-down”) explorations performed by a user to answer a specific research question. We focus on the latter scenario and consider the inspection of several linguistic issues on the discourse level, in particular correlations between syntax, coreference, and “rhetorical structure” of texts. As the technical tool set for this research, we use a linguistic database (ANNIS2) and an associated software infrastructure, which can be used to query corpora across distinct layers of annotation. After a general introduction to ANNIS2, we show several use cases to demonstrate how the above-mentioned discourse issues can be explored.
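
To give a rough idea of what a query across annotation layers means, here is a minimal Python sketch. It is not ANNIS2 and does not use its query language; the `Span` class, the layer names, the labels, and the example sentence are all hypothetical and chosen only for illustration. Each layer annotates spans over the same token sequence, and the query asks which spans of one layer lie inside spans of another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One annotation on one layer, covering tokens[start:end] (hypothetical model)."""
    layer: str   # e.g. "syntax", "coreference", "rhetorical"
    label: str   # e.g. "subject", "cause-satellite"
    start: int   # index of the first covered token
    end: int     # index one past the last covered token


def contains(outer: Span, inner: Span) -> bool:
    """True if inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def query(annotations, inner_layer, inner_label, outer_layer, outer_label):
    """Return (inner, outer) pairs where a span of one layer sits inside a span of another."""
    inners = [s for s in annotations
              if s.layer == inner_layer and s.label == inner_label]
    outers = [s for s in annotations
              if s.layer == outer_layer and s.label == outer_label]
    return [(i, o) for i in inners for o in outers if contains(o, i)]


# Invented example sentence with three independent annotation layers.
tokens = ["The", "mayor", "resigned", "because", "she", "was", "ill", "."]
annotations = [
    Span("syntax", "subject", 0, 2),              # "The mayor"
    Span("syntax", "subject", 4, 5),              # "she"
    Span("coreference", "chain1", 0, 2),          # "The mayor"
    Span("coreference", "chain1", 4, 5),          # "she"
    Span("rhetorical", "nucleus", 0, 3),          # main clause
    Span("rhetorical", "cause-satellite", 3, 7),  # "because she was ill"
]

# Top-down query: which syntactic subjects occur inside a causal satellite?
hits = query(annotations, "syntax", "subject", "rhetorical", "cause-satellite")
for inner, outer in hits:
    print(" ".join(tokens[inner.start:inner.end]), "inside", outer.label)
```

Because every layer refers to the same token positions, layers can be annotated and checked independently and still be combined at query time, which is the property the multilevel approach relies on.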