The next lecture in the TLA Lecture Series Digital Humanities in Action (formerly known as Nijmegen e-Humanities) will be given on the 14th of November by Chris Biemann. He will present on Structure Discovery in Natural Language – unsupervised language-independent methods.


Who: Chris Biemann
What: Structure Discovery in Natural Language – unsupervised language-independent methods
Where: MPI for Psycholinguistics, room 1.63 (main lecture hall)
When: Wednesday the 14th of November, 14:30



In this seminar, I will talk about the Structure Discovery Paradigm, a
framework for finding regularities in text material of an arbitrary
language and making them explicit in the data for use in further
processing. After recapitulating work on language separation, unsupervised
part-of-speech tagging and word sense induction, I will introduce the
concept of two-dimensional text, which can be used for semantic
matching for text similarity and word sense disambiguation.
These unsupervised, knowledge-free methods are especially valuable in
situations where one does not have NLP components for the target language
or domain, but does have enough unlabeled data to induce the regularities.