Tools of Machine Learning in Psychometrics
Rudolf Debelak & David Goretzko
Full day short course (9:00am – 5:00pm)
Short Course #2
Participants in this workshop will explore basic concepts and methods of machine learning and deep learning and will work through introductory applications and examples in the open-source language R. Examples of such applications include the prediction of numerical variables as well as natural language processing.
This workshop will cover four main topics: a) a theoretical overview of core techniques and concepts of machine learning, b) basic concepts of deep learning and recent architectures such as transformers, c) the implementation of these methods in R packages, and d) natural language processing, text generation and other applications of these methods.
Intended Audience
Participants should already have some basic knowledge about working with R, and should be familiar with basic statistical models, such as linear and logistic regression and decision trees. The target audience of this workshop encompasses undergraduate and graduate students as well as researchers and practitioners in academic and industry roles who are interested in machine learning and related topics such as deep learning and natural language processing, and how they might apply these tools to their own work.
Summary
Machine learning, deep learning and related topics such as pre-trained large language models have found numerous applications in academic research and industry. A wealth of research has discussed possible applications of these methods in psychometrics and psychological testing, including the automated scoring of essays and text generation.
Modern tools of machine learning build on more basic, widely known concepts and models from statistics, such as the linear and logistic regression models. Machine learning models aim to generalize these well-known models in several respects, for example by modeling non-linear relationships and by making predictions from non-numerical data such as texts and pictures.
The resulting flexibility of machine learning methods raises important challenges: estimating model parameters, selecting a suitable model and model architecture for the problem at hand, assessing the accuracy of model predictions, and explaining those predictions. Moreover, a wealth of R packages implement these methods. This workshop guides participants through the available algorithms, points to practical software implementations in R, and demonstrates how these techniques can be applied to problems in psychology and education. We will also cover central concepts of machine learning that are crucial for successful applications, such as the distinction between training, validation and test data and the use of pre-trained models.
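As a rough illustration of the distinction between training, validation and test data, the following minimal sketch uses base R only. The simulated data, the 60/20/20 split, the two candidate models and the helper `rmse()` are all assumptions made for this example; they are not taken from the workshop materials.

```r
# Hypothetical data: 100 simulated observations (an assumption for illustration)
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# Randomly assign each row to one of three sets (60/20/20 is an arbitrary choice)
idx <- sample(rep(c("train", "validation", "test"), times = c(60, 20, 20)))
train <- dat[idx == "train", ]
validation <- dat[idx == "validation", ]
test <- dat[idx == "test", ]

# Fit the candidate models on the training data only
fits <- list(
  linear = lm(y ~ x, data = train),
  quadratic = lm(y ~ poly(x, 2), data = train)
)

# Hypothetical helper: root mean squared error of predictions on new data
rmse <- function(fit, newdata) {
  sqrt(mean((newdata$y - predict(fit, newdata = newdata))^2))
}

# Use the validation data to choose between the candidate models
val_error <- sapply(fits, rmse, newdata = validation)
best <- names(which.min(val_error))

# Use the test data once, at the end, to estimate the error of the chosen model
test_error <- rmse(fits[[best]], test)
```

The key design choice is that the test data play no role in fitting or model selection, so `test_error` is not biased by those steps.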
The purpose of this workshop is threefold:
- We illustrate modern algorithms of machine learning, such as random forests or artificial neural networks, and shed light on how they build on well-known, simpler models such as decision trees or linear regression.
- We demonstrate how simple and advanced machine learning models can be applied using R software, including advanced methods from the field of natural language processing.
- We outline the basic ideas of modern applications of machine learning methods, such as natural language processing, and discuss their application in R.
 
The workshop will include theoretical introductions as well as practical examples and exercises in R software. We encourage workshop participants to bring their own laptops with R pre-installed so that they can easily follow these examples. R is available for Windows, Linux and macOS operating systems. Hand-outs and R scripts will be made available before the workshop.
References
- Kjell, O., Giorgi, S., & Schwartz, H. A. (2023). The text package: An R package for analyzing and visualizing human language using natural language processing and transformers. Psychological Methods, 28(6), 1478–1498. https://doi.org/10.1037/met0000542
- Pargent, F., Schoedel, R., & Stachl, C. (2023). Best practices in supervised machine learning: A tutorial for psychologists. Advances in Methods and Practices in Psychological Science, 6(3). https://doi.org/10.1177/25152459231162559
- Urban, C. J., & Gates, K. M. (2021). Deep learning: A primer for psychologists. Psychological Methods, 26(6), 743–773. https://doi.org/10.1037/met0000374
About the instructors
Rudolf Debelak
  
Rudolf Debelak is a Senior Researcher at the Chair of Psychological Methods, Evaluation and Statistics at the University of Zurich, Switzerland. His research interests include psychometrics, with a focus on item response theory, machine learning, and the mathematical and statistical foundations of psychological research methods. He has degrees in psychology and mathematics and received a PhD in Quantitative Psychology from the University of Vienna as well as a Habilitation in Psychological Methods from the University of Zurich. His teaching includes basic and advanced courses on statistics, data science in R and Python, and machine learning. Before working in academia, he was employed in the psychological test industry for several years.
David Goretzko
  
David Goretzko is an assistant professor of Methodology and Statistics at Utrecht University (UU). He holds degrees in physics, psychology and statistics and received both a PhD and a Habilitation in Psychological Methods from Ludwig-Maximilians-University (LMU) in Munich. His research combines psychometrics and latent variable modeling with machine learning and meta-heuristics. Beyond this, he explores the potential of machine learning and its cost-sensitive extensions to address substantive research questions in psychology and psychological assessment. His broad teaching background includes statistics, data science, machine learning and R programming courses in Bachelor, Master and PhD programs, summer schools and workshops, as well as statistical consulting for PhD students and faculty members at both UU and LMU.