Events

Upcoming events

Event Information:

  • Thu 13 Jun 2024

    Giuseppe Magistro (UGent) - "Creating a corpus of web-data with Pyrlato. A demonstration"

    2:00 pm, Lokaal 3.30 - Camelot, Blandijn, Campus Boekentoren

    The use of corpora in acoustic analyses has become standard practice in phonetic and phonological research, offering high ecological validity (see e.g. Beckman, 1997; Warner, 2012; Tucker & Mukai, 2023 for a discussion of validity). However, compiling corpora and searching them for specific phenomena can be time- and resource-consuming. In response to this challenge, we developed a program named Pyrlato, which we aim to demonstrate. Pyrlato is a novel tool designed for creating corpora of real-world spoken data from the web. The tool downloads audio from YouTube videos and cuts out the desired segments, such as specific phonemes, syllables, or words. This enables the creation of corpora with tens of thousands of tokens within a few hours of computation. Pyrlato works across Dutch, English, French, German, Indonesian, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Turkish, Ukrainian, and Vietnamese, i.e. those languages for which YouTube provides automatic subtitles. The software searches for the desired string in the subtitles and, upon finding a match, extracts the audio segment containing the string in .mp3 format (other formats are also possible).

    The demonstration will showcase Pyrlato's online version and its application in a few case studies.

    • Beckman, M.E. (1997). A typology of spontaneous speech. In Y. Sagisaka, N. Campbell, & N. Higuchi (Eds.), Computing Prosody: Computational Models for Processing Spontaneous Speech (pp. 7–26). Springer. http://dx.doi.org/10.1007/978-1-4612-2258-3_2.
    • Tucker, B.V., & Mukai, Y. (2023). Spontaneous speech. Cambridge University Press. http://doi.org/10.1017/9781108943024.
    • Warner, N. (2012). Methods for studying spontaneous speech. In A. Cohn, C. Fougeron, & M. Huffman (Eds.), The Oxford Handbook of Laboratory Phonology (pp. 621–633). Oxford University Press.

     


Past events

  • Thu 01 Dec 2016 to Fri 02 Dec 2016

    Varieties of Post-classical and Byzantine Greek

    Koninklijke Academie voor Nederlandse Taal- & Letterkunde (KANTL)

    For a large part of the twentieth century, linguistic variation received little attention. With the work of William Labov and others, however, heterogeneity in language again became a topic of interest: within the newly founded discipline of sociolinguistics, scholars have investigated the correlation between linguistic variants and contextual variables such as age, gender, social class, social distance, etc. In actual language use, however, variants (and to some extent, variables) do not occur in isolation; rather, there is patterned heterogeneity. In this spirit, scholars have described the existence of various lects such as chronolects, dialects, idiolects, ethnolects, genderlects, regiolects, sociolects, technolects, etc. in a great number of languages.

    The aim of this conference is to investigate varieties of Post-Classical and Byzantine Greek, a topic of considerable interest among various members of the Greek section at Ghent University. Although some research has been done in this area, especially when it comes to Post-Classical Greek (e.g. Janse 2007 on New Testament Greek, Horrocks 2007 on levels of writing, Torallas-Tovar 2010 on Greek in Egypt, Nachtergaele 2015 on idiolect, Bentein 2015 on register), a more systematic discussion of these varieties has yet to take place, despite the great potential of our Post-Classical and Byzantine sources.

    The organisers invite all Greek linguists to submit a one-page English abstract to varieties@ugent.be (please use a Unicode-based font for Greek text) by September 1, 2016 at the latest. Notification of acceptance will be given by the end of September. In addition to the discussion of specific varieties, we consider the following issues of particular interest:

    • What linguistic models can be used for the description and analysis of varieties?
    • What is the relationship between different dimensions of variation, for example between the diachronic and the diastratic dimension?
    • What role do idiolects play for the description of language variation?
    • To what extent do non-congruent features (i.e. features belonging to different, or even opposed varieties) occur in texts?
    • What is the relevance of and relationship between documentary and literary texts as sources of variety?
    • At which linguistic levels (phonological, morphological, syntactic, lexical) can varieties be described?

    Organizing Committee

    • Klaas Bentein (UGent)
    • Willy Clarysse (KU Leuven)
    • Mark Janse (UGent)
    • Bruno Rochette (ULg)

    Keynote Speakers

    • Geoff Horrocks (University of Cambridge)
    • Martti Leiwo (University of Helsinki)

     

    More info on the conference website.
