Establishing patterns of linguistic variation traditionally requires extensive fieldwork, whether this is for a sociolinguistic study of a single community or a dialect atlas covering a whole region. Social media potentially provide large amounts of data from many more individuals than we could hope to interview using traditional means. In this talk, I will introduce the Tweetolectology project, which aims to establish the geographic and social distribution of linguistic variants using the vast amount of data available in Twitter messages. We will look at some examples of linguistic variation from Welsh, British and Irish English, and Haitian Creole, seeing how tweets demonstrate both well-established patterns of variation and some newly emerging ones, potentially demonstrating linguistic change in progress and diffusion of new variants.
David Willis is the Jesus Professor of Celtic at the University of Oxford and a professorial fellow at Jesus College. His main area of research is syntactic change, particularly in Celtic and Slavonic languages, including word order change, the development of markers of negation and changes in patterns of agreement marking within the noun phrase. His work on syntactic variation and change uses techniques from dialect syntax, geospatial linguistics and Geographic Information Systems (GIS). He recently led the ESRC-funded Tweetolectology project, using social media (Twitter) as a data source for mapping variation and change across a range of European languages (British English, Welsh, Scandinavian, Turkish, South Slavic etc.), comparing this with more traditional approaches from corpora (e.g. the British National Corpus 2014). Previous research projects include the Historical Corpus of the Welsh Language, the Syntactic Atlas of Welsh Dialects, and the History of Negation in the Languages of Europe and the Mediterranean. Recently, he started a project with colleagues at HU Berlin on the history of subject pronouns in the languages of northern Europe.