Tue 06 Dec 2022, 2:00 pm, Blandijn Faculteitszaal
Margherita Fantoli (KU Leuven): “Measuring formulaicity in scientific and technical texts”
Thu 27 Oct 2016, 11:00 am, Hiko b. 001
Parsed Historical Corpora Fest (2/2)
Lecturer: Dr. Joel Wallenberg (Newcastle University)
In the past, historical linguistics and synchronic syntax both relied largely on qualitative data, e.g. the analysis of isolated examples and qualitative judgment data. In the last few years, however, successes in variationist sociolinguistics, quantitative biology, and computer science have begun a revolution in the way both syntax and language change are studied: both fields have begun to use more quantitative data, especially to find theoretically important statistical patterns in naturalistic production data. These fields have also combined with each other and with quantitative methods to give rise to a new field of quantitative diachronic comparative syntax. However, studying syntactic change in this mathematical way, particularly in a cross-linguistic, comparative approach, presents a number of interesting technical challenges. It requires measuring the frequencies of very abstract objects over very long periods of time, and to do this we need a research infrastructure of diachronic parsed corpora (i.e. treebanks) drawn from a number of language histories. Building and analyzing these treebanks requires considerable technical skill and a fair amount of collaboration between linguists with various computational, theoretical, and philological skills. Our workshops this week will help students with some background in syntax begin to search parsed corpora of this kind and interpret the results and, if they would like, contribute to the process of building more diachronic corpora of more languages.
Tuesday, 25 October 2016, 14:00 - 16:30, PC room D (PlaRoz): working with parsed corpora, e.g. PPCHE and IcePaHC
Thursday, 27 October 2016, 11:00 - 13:00, Hiko b. 001: building your own corpus and parsing your own data