
Fall 2018 – Indo-European Studies 280A Phylogenetics
This course applied new computational approaches to an old problem: how the different Indo-European languages are related to one another. While it is generally known which languages are Indo-European and which of them are closely related, the language “family tree” is not fully understood. Older methods took a qualitative approach to this problem; the recent trend has been to examine it with computers and statistics.
After a term of practice applying computational methods in R to a small dataset, the course culminated in a project in which each student produced data describing linguistic features of 24 Indo-European languages and used computational methods to examine the potential implications of that data. My task was to examine negation in these languages.
The RStudio presentation handout (containing R code and a LaTeX writeup) and the project writeup can be downloaded below. Please be aware that R visualizations may behave differently on Windows and Mac.
- RStudio handout and data (GitHub)
- Project presentation handout (.pdf)
- Project presentation bibliography (.pdf)
- Project writeup (updated Sept. 21, 2020, .pdf)
Skills: Programming (R), Computational Linguistics, Data Visualization (R), Typesetting (LaTeX)