S. Beniamine: xxx
Instructor: Heather Burnett (CNRS/LLF & Labex EFL)
(to appear)
H. Burnett: xxx
Instructor: Heather Burnett (CNRS/LLF & Labex EFL)
(to appear)
N. Levshina: A statistical toolkit for linguistic typology
Instructor: Dr. Natalia Levshina (Radboud University, The Netherlands)
This hands-on course introduces key statistical methods for data exploration and hypothesis testing in linguistic typology. The focus is on practical application, with methods illustrated through case studies based on cross-linguistic databases and corpora. Participants will work with provided R scripts and gain experience applying the techniques to real typological data.
- Session 1. Introduction to regression models for testing cross-linguistic correlations and implicational relationships, while accounting for genealogical and spatial dependencies between languages.
- Session 2. Multidimensional scaling for the construction of proximity-based semantic maps, including cluster identification and interpretation of dimensions.
- Session 3. Conditional inference trees and random forests for investigating and comparing constraints on formal variation across languages.
J. Marsault: Documentation and fieldwork issues - the Umóⁿhoⁿ language
Instructor: Julie Marsault (University of Paris 3 & Lacito)
This class will present issues in gathering and annotating data for an under-described and critically endangered language, Umóⁿhoⁿ (also Omaha; Siouan, United States).
- In the first session, I will present the Siouan languages in general and Umóⁿhoⁿ in particular.
- The second session will be dedicated to sharing my fieldwork experience and some of the social and technical difficulties of preserving and documenting the Umóⁿhoⁿ language.
- Finally, we will have a hands-on session annotating Umóⁿhoⁿ sentences.
A. Miletic-Haddad: xxx
Instructor: Aleksandra Miletic (CNRS/MoDyCo)
(to appear)
P. Muller: Pretrained Language Models for linguistic research
Instructor: Philippe Muller (University of Toulouse, IRIT & GDR TAL)
This class will introduce the fundamentals of recent pretrained language models (PLMs) and explore their relationship with traditional linguistic levels of analysis.
We will examine the linguistic capabilities of PLMs and discuss how they can support linguistic research, e.g. by generating data, automating annotation, or serving as experimental models of language performance. In addition, we will address issues and challenges in using or studying these models, notably multilinguality and representativeness.
The class will feature practical lab exercises, some of which can be informed by participants' use cases.
C. Parisse: Create, annotate, and analyze a multimodal corpus of language interaction
Instructor: Christophe Parisse (CNRS/MoDyCo & HumaNum CORLI consortium)
This course will present the methodology and tools that can be used to create, annotate, and work with a multimodal corpus whose goal is to study natural and spontaneous conversational situations collected in ecological settings.
The course will use examples focusing on the study of language acquisition and situations involving children and parents, but it also applies to any type of language situation.
- Session 1: Review of recording conditions and data collection procedures. Presentation of data available in repositories such as ORTOLANG that can be used for research purposes. Presentation of tools for transcribing data, either manually or semi-automatically.
- Session 2: The theme of this session will be data annotation and the use of appropriate tools. The aim here is no longer simply to produce a transcription, but to become familiar with the tools and principles that enable data enrichment in order to study, for example, phonology, syntax, pragmatics, gesture, and interaction. This session will present in more detail tools such as ELAN, which are particularly well suited to multimodal research, in connection with other tools depending on the research needs.
- Session 3: This session will focus on data analysis. In particular, it will present methods for extracting and analyzing data from ELAN, using as an example the work carried out within the DINLANG project, which examines dinner conversations between children and parents, with analyses focusing on language and gesture.
C. Pozniak: Introduction to psycholinguistics
Instructor: Céline Pozniak (University of Paris 8 & SFL)
This course will introduce you to the principles and methods of psycholinguistics. We will explore how empirical approaches allow us to test hypotheses about language processing, using both offline and online methods.
- Session 1: Introduction. We will explore how empirical evidence shapes psycholinguistic research, from hypothesis testing to experimental design.
- Session 2: Offline methods. We will examine offline methods (e.g., acceptability judgments) and their role in uncovering language structure and processing.
- Session 3: Online methods. We will examine online methods (e.g., eye-tracking) to study real-time language processing.