Events

Upcoming events

Event Information:

  • Thu 13 Jun 2024

    Giuseppe Magistro (UGent) - "Creating a corpus of web-data with Pyrlato. A demonstration"

    2:00 pm, Lokaal 3.30 - Camelot, Blandijn, Campus Boekentoren

    The use of corpora in acoustic analyses has become standard practice in phonetic and phonological research, offering high ecological validity (see e.g. Beckman, 1997; Warner, 2012; Tucker & Mukai, 2023 for a discussion of validity). However, compiling corpora and searching them for specific phenomena can be time- and resource-intensive. In response to this challenge, we developed a program named Pyrlato, which we aim to demonstrate. Pyrlato is a novel tool designed for creating corpora of real-world spoken data from the web. The tool extracts audio from YouTube videos and cuts out the desired segments, such as specific phonemes, syllables, or words. This enables the creation of corpora with tens of thousands of tokens within a few hours of computation. Pyrlato works across Dutch, English, French, German, Indonesian, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Turkish, Ukrainian, and Vietnamese, i.e. those languages for which YouTube provides automatic subtitles. The software searches the subtitles for the desired string and, upon finding a match, extracts the audio segment containing the string in .mp3 format (other formats are also possible).
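
    The subtitle search described above can be illustrated with a minimal sketch. This is not Pyrlato's actual code: the Cue class, the find_segments function, and the padding parameter are hypothetical names introduced here, and the sketch assumes the subtitles have already been parsed into timed cues.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One timed subtitle line (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def find_segments(cues, target, padding=0.25):
    """Return (start, end) windows for every cue whose text contains target.

    The match is case-insensitive. Each window is widened by `padding`
    seconds on both sides and never starts before 0.
    """
    needle = target.casefold()
    windows = []
    for cue in cues:
        if needle in cue.text.casefold():
            windows.append((max(0.0, cue.start - padding), cue.end + padding))
    return windows


# Illustrative cues, not real data.
cues = [
    Cue(0.0, 2.5, "welcome back to the channel"),
    Cue(2.5, 5.0, "today we are talking about water"),
    Cue(5.0, 8.0, "Water is everywhere"),
]
segments = find_segments(cues, "water")
```

    Each returned window could then be handed to an audio-cutting tool to produce one clip per match; how Pyrlato itself performs that step is not specified in the abstract.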

    The demonstration will showcase Pyrlato's online version and its application in several case studies.

    • Beckman, M.E. (1997). A typology of spontaneous speech. In Y. Sagisaka, N. Campbell, & N. Higuchi (Eds.), Computing Prosody: Computational Models for Processing Spontaneous Speech (pp. 7–26). Springer. http://dx.doi.org/10.1007/978-1-4612-2258-3_2.
    • Tucker, B.V., & Mukai, Y. (2023). Spontaneous speech. Cambridge University Press. http://doi.org/10.1017/9781108943024.
    • Warner, N. (2012). Methods for studying spontaneous speech. In A. Cohn, C. Fougeron, & M. Huffman (Eds.), The Oxford Handbook of Laboratory Phonology (pp. 621–633). Oxford University Press.

     


 

Past events


  • Fri 18 Sep 2015

    The Language(s) of the Papyrus Archives

    10:00 am, Koninklijke Academie voor Nederlandse Taal- & Letterkunde (KANTL)

    The conference “The Language(s) of the Papyrus Archives” will be held at the conference center KANTL (Koningstraat, Ghent) on 18 September 2015.
    The conference focuses on the linguistic approach to (documentary) papyri, and to papyri preserved in papyrological archives in particular.
    It has a multilingual scope with contributions about Greek, Demotic and Latin.
    It not only presents scientific results of scholars working within the field of the linguistic study of papyri, but it also aims to discuss the potential of this relatively new subdiscipline.

    Programme
    10.00-10.30       Coffee
    10.30-10.45       Welcoming speech (Delphine Nachtergaele, Ghent University)
    10.45-11.30       Bilingual documents in Greco-Roman Egypt (Willy Clarysse, KU Leuven)
    11.30-12.15       Some Aspects of the Language of Individuals and Social Groups in Zenon Papyri (Trevor Evans, Macquarie University, Sydney)
    12.15-13.30       Lunch
    13.30-14.15       Further observations on senders, scribes and language (Hilla Halla-aho, University of Helsinki)
    14.15-15.00       DIA in the Archive of Basil the Pagarch (VIII AD) (Klaas Bentein, Ghent University)
    15.00-15.45       Digital Humanities and the future of linguistic papyrology (Mark Depauw, KU Leuven)

    Registration
    Attendance at the conference is open to the public. A fee of €25 is charged to cover coffee, tea and lunch, so advance registration is required. If you plan to attend, please inform the organizers (Delphine.Nachtergaele@UGent.be).
