Recent Contributions of Data Mining to Language Learning Research

Mark Warschauer, Soobin Yim*, Hansol Lee, Binbin Zheng

*Corresponding author for this work

Research output: Contribution to journal, Review article, peer-review

13 Scopus citations


This paper will review the role of data mining in research on second language learning. Following a general introduction to the topic, three areas of data mining research will be summarized - clustering techniques, text-mining, and social network analysis - with examples from both the broader field and studies conducted by the authors. The application of data mining in second language learning research is relatively new, and more theoretical and empirical support is needed in the appropriate collection, use, and interpretation of data for specific research and pedagogical objectives. The three examples that we introduce illustrate how new data sources accessible in online environments can be analyzed to better understand the optimal instructional context for corpus-based vocabulary learning (clustering technique), characteristics and patterns of collaborative written interaction using Google Docs (text mining and visualizations), and issues of access and community in computer-mediated discussion (social network analysis). Implications of these new techniques for L2 research will be discussed.

Original language: English
Pages (from-to): 93-112
Number of pages: 20
Journal: Annual Review of Applied Linguistics
State: Published - 2019
Externally published: Yes

