Events – ΔiaLing

Upcoming events

Event Information:

Thu 13 Jun 2024

Giuseppe Magistro (UGent) - "Creating a corpus of web-data with Pyrlato. A demonstration"
2:00 pm, Lokaal 3.30 - Camelot, Blandijn, Campus Boekentoren
The use of corpora in acoustic analyses has become standard practice in phonetic and phonological research, offering high ecological validity (see e.g. Beckman, 1997; Warner, 2012; Tucker & Mukai, 2023 for a discussion on validity). However, compiling corpora and searching them for specific phenomena can be time- and resource-intensive. In response to this challenge, we developed Pyrlato, a novel tool for creating corpora of real-world spoken data from the web, which we aim to demonstrate. The tool retrieves audio from YouTube videos and cuts out the desired segments, such as specific phonemes, syllables, or words. This enables the creation of corpora with tens of thousands of tokens within a few hours of computation. Pyrlato works across Dutch, English, French, German, Indonesian, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Turkish, Ukrainian, and Vietnamese, i.e. the languages for which YouTube provides automatic subtitles. The software searches the subtitles for the desired string and, upon finding a match, extracts the audio segment containing the string in .mp3 format (other formats are also possible).
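The subtitle-search step described in this abstract can be illustrated with a minimal sketch. It is not Pyrlato's implementation: it assumes the subtitles are already available as WebVTT-style text, and the names `Cue`, `parse_cues`, `find_segments`, and the `padding` parameter are hypothetical. It only returns the time spans of the subtitle cues that contain the target string; cutting the audio itself, and narrowing a span down to a single word or phoneme, would need further steps not shown here.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # cue start time in seconds
    end: float    # cue end time in seconds
    text: str     # subtitle text shown during the cue


# Matches a timing line such as "00:00:01.000 --> 00:00:03.500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d+):(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(vtt_text):
    """Parse blank-line-separated subtitle blocks into Cue objects."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                text = " ".join(lines[i + 1:]).strip()
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break  # one timing line per block
    return cues


def find_segments(cues, target, padding=0.25):
    """Return (start, end) spans of cues containing `target` as a whole word.

    `padding` (hypothetical parameter) widens each span by a fixed number
    of seconds on both sides so the target is not clipped.
    """
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    return [
        (max(0.0, cue.start - padding), cue.end + padding)
        for cue in cues
        if pattern.search(cue.text)
    ]


# Invented sample subtitles, for illustration only.
SAMPLE = """WEBVTT

00:00:01.000 --> 00:00:03.500
welcome to the phonetics lab

00:00:03.500 --> 00:00:06.000
today we talk about corpora
"""
```

Calling `find_segments(parse_cues(SAMPLE), "corpora")` yields the padded span of the second cue, which an audio-cutting tool could then use as start and end times.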

The demonstration will showcase Pyrlato's online version and its application to some case studies.

• Beckman, M.E. (1997). A typology of spontaneous speech. In Y. Sagisaka, N. Campbell, & N. Higuchi (Eds.), Computing Prosody: Computational Models for Processing Spontaneous Speech (pp. 7–26). Springer. http://dx.doi.org/10.1007/978-1-4612-2258-3_2.
• Tucker, B.V., & Mukai, Y. (2023). Spontaneous speech. Cambridge University Press. http://doi.org/10.1017/9781108943024.
• Warner, N. (2012). Methods for studying spontaneous speech. In A. Cohn, C. Fougeron, & M. Huffman (Eds.), The Oxford Handbook of Laboratory Phonology (pp. 621–633). Oxford University Press.

Past events

Event Information:

Tue 06 Feb 2024

Bernat Bardagil-Mas (UGent) – “Language documentation and revitalization of Mỹky, in Brazilian Amazonia”
2:00 pm, Blandijn Room 3.30 (Camelot)
This presentation discusses recent language documentation and revitalization work among the Mỹky and Manoki nations in Brazilian southern Amazonia, as well as the revitalization of the traditional jakuli genre of dance and music. Mỹky is an isolated indigenous Amazonian language spoken in north-western Mato Grosso, Brazil, by two communities: the Manoki (Iranxe) and the Mỹky (Monserrat 2010; Bardagil 2023). The two communities went through different periods of contact with colonial Brazilian society, which resulted in different sociolinguistic situations. The Manoki dialect is severely threatened, with only three members of the community (~400) able to speak it today, whereas most of the Mỹky group (~100) speak the language as a first language.
Since 2019 I have been working alongside the Manoki community to preserve and reclaim their indigenous language. This work led to the creation of the Watjuho Ja'a Collective and the organization of several intensive schools for the Manoki language, as well as several initiatives to reinforce musical and spiritual community practices that are closely connected with the orally transmitted heritage of this threatened indigenous nation.
