David Kaplan, University of Wisconsin, Madison
David Kaplan is the Patricia Busk Professor of Quantitative Methods in the Department of Educational Psychology at the University of Wisconsin – Madison. Dr. Kaplan holds affiliate appointments in the University of Wisconsin’s Department of Population Health Sciences and the Center for Demography and Ecology, and is also an Honorary Research Fellow in the Department of Education at the University of Oxford. Dr. Kaplan is an elected member of the National Academy of Education, a recipient of the Humboldt Research Award, a fellow of the American Psychological Association (Division 5), a fellow of the German Institute for International Educational Research, and was a Jeanne Griffith Fellow at the National Center for Education Statistics. Dr. Kaplan’s program of research focuses on the development of Bayesian statistical methods for education research. His work on these topics is directed toward applications to large-scale cross-sectional and longitudinal survey designs. Dr. Kaplan received his Ph.D. in education from UCLA in 1987.
An Overview of Recent Developments and Applications of Bayesian Model Averaging
A key characteristic of Bayesian statistical inference that separates it from its frequentist counterpart is its focus on characterizing uncertainty in model parameters, encoding that uncertainty through the specification of prior probability distributions on all model parameters. From a Bayesian point of view, however, parameters are not the only quantities subject to uncertainty.
Specifically, the selection of a particular model from a universe of possible models can also be characterized as a problem of uncertainty (Raftery et al., 1997). The method of Bayesian model averaging quantifies model uncertainty by recognizing that not all models are equally good from a predictive point of view. Rather than choosing one model and assuming that the chosen model is the one that generated the data, Bayesian model averaging obtains a weighted combination of the parameters of a (smaller) subset of possible models, weighted by each model’s posterior model probability. Using the weighted parameters rather than the parameters of any particular sub-model is known to provide superior predictive performance according to a particular type of scoring rule. This talk provides an overview of Bayesian model averaging with a focus on recent developments and applications to propensity score analysis, missing data, and longitudinal growth modeling.
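For concreteness, the averaging described above can be sketched as follows (the notation here is generic, not taken from the talk itself): letting \(\Delta\) denote a quantity of interest and \(M_1, \ldots, M_K\) the candidate models,

```latex
% Posterior distribution of a quantity of interest \Delta, averaged over
% the candidate models M_1, ..., M_K:
p(\Delta \mid y) = \sum_{k=1}^{K} p(\Delta \mid M_k, y)\, p(M_k \mid y),
% where the posterior model probabilities serve as the weights:
p(M_k \mid y) = \frac{p(y \mid M_k)\, p(M_k)}
                     {\sum_{l=1}^{K} p(y \mid M_l)\, p(M_l)}.
```

Each model's weight thus reflects both its prior plausibility and how well it predicts the observed data.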
Matthias von Davier, National Board of Medical Examiners
Matthias von Davier is Distinguished Research Scientist at the National Board of Medical Examiners (NBME) in Philadelphia, Pennsylvania. Until 2016, he was a senior research director in the Research & Development Division at Educational Testing Service (ETS) and co-director of the Center for Global Assessment at ETS, leading the center’s psychometric research and operations. He earned his Ph.D. at the University of Kiel, Germany, in 1996, specializing in psychometrics. In the Center for Advanced Assessment at NBME, he works on psychometric methodologies for analyzing data from technology-based high-stakes assessments. He was one of the founding editors of the Springer journal Large Scale Assessments in Education, which is jointly published by the International Association for the Evaluation of Educational Achievement (IEA) and ETS. He is also editor-in-chief of the British Journal of Mathematical and Statistical Psychology (BJMSP) and co-editor of the Springer book series Methodology of Educational Measurement and Assessment. Dr. von Davier received the 2006 ETS Research Scientist Award, the 2012 NCME Brad Hanson Award for contributions to educational measurement, and the 2017 AERA Division D award for significant methodological contributions to educational research. His areas of expertise include item response theory, latent class analysis, diagnostic classification models and, more broadly, classification and mixture distribution models; computational statistics; person fit, item fit, and model checking; hierarchical extensions of models for categorical data analysis; and the analytical methodologies used in large-scale educational surveys.
Diagnosing Diagnostic Models: From von Neumann's Elephant to Model Equivalencies and Network Psychometrics
This talk critically reviews how diagnostic models have been conceptualized and how they compare to other approaches used in educational measurement. In particular, it reviews certain assumptions that have been taken for granted and used as defining characteristics of diagnostic models, and asks whether these assumptions are the reason why the models have not had the success in operational analyses and large-scale applications that scholars working on them may have hoped for. The talk discusses how diagnostic models can be formally studied, how equivalencies can be identified, and how limitations of diagnostic models may be overcome by incorporating features of related modeling approaches.
Gerhard Tutz, Ludwig-Maximilians-Universität, München
Gerhard Tutz held positions at the Technical University Berlin (Chair of Statistics and Mathematics) and the Ludwig-Maximilians-University Munich (Chair of Applied Stochastics). His research interests comprise the modelling of categorical data, latent trait models and psychometrics, survival and discriminant analysis. He wrote various articles and several books, including Regression for Categorical Data (2012, Cambridge University Press) and Multivariate Statistical Modelling Based on Generalized Linear Models (2001, Springer, with Ludwig Fahrmeir).
Response Styles and Dispersion in Regression and Item Response Models
The presence of response styles can have a strong impact on the estimation of parameters in regression and latent trait models. When ignored, one may obtain biased estimates that yield invalid inference tools and lead to wrong conclusions. Commonly used approaches to account for response styles are finite mixture models and the more recently proposed item response tree models.
An alternative approach, advocated in this talk, explicitly incorporates response styles as parameters and as effects of explanatory variables in the predictor structure of latent trait models. We consider in particular the noncontingent response style (NCR), found when persons tend to respond to items carelessly, randomly, or nonpurposefully, and the extreme response style (ERS), found when persons tend to prefer middle or extreme categories. The noncontingent response style can be generated by unobserved dispersion heterogeneity, which is most often ignored in regression and latent trait modelling. If present, dispersion heterogeneity can yield a specific form of differential item functioning, with items seeming to be harder in groups that show stronger dispersion. Models for the inclusion of response styles in Rasch family models are presented together with estimation procedures based on mixed models. The models aim at capturing the overall heterogeneity of response styles in the population and at uncovering the subject-specific variables that determine the response styles.
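One schematic way such dispersion heterogeneity can enter a Rasch family model (the notation here is illustrative and simplified, not necessarily the parameterization used in the talk) is through a person-specific scale factor on the linear predictor:

```latex
% Standard Rasch model for person p and item i:
P(X_{pi} = 1) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}.
% A dispersion-heterogeneity extension rescales the predictor by a
% person-specific factor \exp(\gamma_p); larger \gamma_p flattens the
% response curve, so responses look more random (noncontingent
% responding), and items with success probability above one half
% appear harder for high-dispersion persons:
P(X_{pi} = 1) =
  \frac{\exp\!\big((\theta_p - b_i)/\exp(\gamma_p)\big)}
       {1 + \exp\!\big((\theta_p - b_i)/\exp(\gamma_p)\big)}.
```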
State of the Art Speakers
Mirjam Moerbeek, Utrecht University
Mirjam Moerbeek holds an MSc in biometrics (cum laude) and a PhD in statistics. She is currently employed as an associate professor in statistics at Utrecht University, the Netherlands. Her research interests are statistical power analysis and optimal experimental design, in particular for trials with multilevel data, such as cluster randomized trials and multisite trials. She also studies the optimal design of longitudinal research, with a focus on survival outcomes. She has published 60 peer-reviewed journal papers, a book and software on power analysis of trials with multilevel data, and another book on multilevel analysis. Dr. Moerbeek has obtained prestigious research grants from the Netherlands Organisation for Scientific Research, as well as grants to hire PhD students. She is currently involved in projects on Bayesian power analysis and sample size determination, and on adaptive research design in survey methodology. Dr. Moerbeek is the general secretary of the European Association of Methodology and was involved in the organization of the biennial conference of this association (as chair in 2014). She also organized the International Conference on Multilevel Modelling and conferences of the International Society for Clinical Biostatistics.
Optimal designs and statistical power analysis of studies with multilevel data
In many studies in the social and behavioral sciences the data have a nested structure, with subjects nested within clusters, such as pupils in school classes and patients within general practices. Such a data structure has implications for an a priori power analysis. As several combinations of the number of clusters and the cluster size may result in the same power level, a criterion should be chosen to select the optimal design; for instance, one may select the design with the lowest costs. This presentation starts with the basics of statistical power analysis and optimal design. The main focus is on experimental studies, but the same techniques can also be applied in other types of research. Several common designs with nested data structures, including the cluster randomized trial and the multisite trial, are then discussed and compared to each other.
The remainder of the presentation covers current optimal design issues: lacking prior information, a limited number of clusters, and outcomes at both the individual and the cluster level. Optimal designs of trials with nested data are locally optimal: they depend on the size of the intraclass correlation coefficient. The value of this model parameter is often not known in the design stage of a trial, while an incorrect prior estimate may result in a loss of design efficiency. This problem may be overcome by using a maximin design, which requires an a priori range, rather than a point estimate, of the intraclass correlation coefficient to be specified. In practice, the number of clusters is often limited, which may result in an underpowered study. Several means to increase power will be discussed: the use of covariates, taking repeated observations, and novel designs such as the stepped wedge design. In some trials, outcome measures are collected not only at the individual level but also at the cluster level. A design that is optimal for individual-level outcomes is not necessarily the best for a cluster-level outcome. Multiple objective optimal design methodology is used to find the design that does reasonably well for both outcome measures.
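The role the intraclass correlation plays in power can be sketched numerically. The following is a minimal illustration (my own, not the presenter's code or software): an approximate z-test power calculation for a cluster randomized trial comparing two means, using the standard design effect 1 + (m − 1)ρ.

```python
# Minimal sketch: approximate power of a cluster randomized trial
# comparing two arm means, inflating the variance by the design effect.
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def crt_power(k, m, delta, sigma, rho, z_crit=1.959964):
    """k clusters per arm, cluster size m, mean difference delta,
    outcome SD sigma, intraclass correlation rho; z_crit is the
    two-sided 5% normal critical value."""
    design_effect = 1.0 + (m - 1) * rho
    # Variance of the difference between the two arm means:
    var_diff = 2.0 * sigma**2 * design_effect / (k * m)
    return normal_cdf(abs(delta) / math.sqrt(var_diff) - z_crit)

print(round(crt_power(k=20, m=30, delta=0.3, sigma=1.0, rho=0.05), 3))
```

With ρ = 0 this reduces to an ordinary two-sample z-test, and the calculation makes visible why adding clusters typically buys more power than enlarging existing ones.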
Peter C. M. Molenaar, Penn State University
Peter C. M. Molenaar graduated in 1976 from Utrecht University in the Netherlands in mathematical psychology and psychophysiology. He started as an assistant professor in the Department of Developmental Psychology of the University of Amsterdam, working his way up to head of the Department of Psychological Methods. Since 2005 he has been a professor (since 2011 distinguished) at the Pennsylvania State University. The overarching theme of his work is the application of mathematical theories to solve substantive psychological issues. Particular examples include: a) application of mathematical singularity theory (in particular catastrophe theory) to settle the longstanding debate about the reality of developmental stage transitions; b) application of nonlinear multivariate statistical signal analysis techniques to map theoretical models of cognitive information processing onto dynamically interacting EEG/MEG and fMRI neural sources; c) application of mathematical-statistical ergodic theory to study the relationships between intra-individual (idiographic) and inter-individual (nomothetic) analyses of psychological processes; d) application of advanced multivariate analysis techniques in quantitative genetics and developmental psychology; e) application of adaptive resonance theory (ART neural networks) to study the effects of nonlinear epigenetic processes; and f) application of engineering control techniques to optimally guide psychological and disease processes of individual subjects in real time. His CV contains more than 200 papers and several edited books.
Equivalent Dynamic Models
Four distinct forms of equivalent dynamic models are addressed. First, a special kind of transformation in general dynamic factor models with lagged factor loadings is highlighted. This transformation already applies to dynamic 1-factor models and can be used for several purposes, including “rotation” to state space form. Some empirical examples of the latter “rotation” are given in which the general dynamic factor model with lagged loadings is shown to outperform the state space model. Second, a few comments are made about the Houdini transformation, emphasizing its relation to a general equivalence transformation from state space models to transfer models. It is argued that the Houdini transformation is the source of a set of new equivalent models for structural equation models with latent variables and raises possibly interesting questions about nested models and the true dimension of latent spaces. Third, a few preliminary remarks are made about equivalence transformations of nonlinear state space models. These transformations are diffeomorphisms, of which the standard linear variant (factor rotation) is a special case. Some possibly interesting applications are suggested. Fourth, “rotation” of linear vector autoregressive (VAR) models is discussed. An initial distinction is made between standard VARs and equivalent structural VARs. Then “rotation” of structural VARs is considered, yielding an uncountable infinity of equivalent structural VARs. Next, a new type of equivalent VAR, called the hybrid VAR, is introduced. The implications of the existence of equivalent standard VARs, structural VARs, and hybrid VARs for Granger causality testing are discussed.
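The equivalence of structural VARs mentioned above can be illustrated numerically. The sketch below (my own toy example, not material from the talk) shows that two structural VAR(1) models, B0·y_t = B1·y_{t−1} + u_t, that differ by premultiplication with an invertible matrix Q imply the same reduced-form coefficient matrix A = B0⁻¹B1, and are therefore observationally equivalent in that sense.

```python
# Two structural VAR(1) forms related by an invertible transformation Q
# share the same reduced form y_t = A y_{t-1} + e_t, with A = B0^{-1} B1.
import numpy as np

B0 = np.array([[1.0, 0.5], [0.0, 1.0]])   # structural contemporaneous matrix
B1 = np.array([[0.4, 0.1], [0.2, 0.3]])   # structural lag matrix

# Premultiply by any invertible Q to obtain an "equivalent" structural VAR.
Q = np.array([[2.0, 1.0], [1.0, 1.0]])
B0_alt, B1_alt = Q @ B0, Q @ B1

A = np.linalg.solve(B0, B1)               # reduced-form coefficients
A_alt = np.linalg.solve(B0_alt, B1_alt)   # same reduced form, different structure

print(np.allclose(A, A_alt))
```

Since Q can be any invertible matrix, this already yields an uncountable family of structural forms behind one reduced-form VAR.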
James Ramsay, McGill University
Jim Ramsay is a Professor Emeritus of Psychology and an Associate Member in the Department of Mathematics and Statistics at McGill University. He received a Ph.D. from Princeton University in 1966 in quantitative psychology. He served as chair of the Department from 1986-1989. Jim has contributed research on various topics in psychometrics, including multidimensional scaling and test theory. His current research focus is on functional data analysis, and involves developing methods for analyzing samples of curves and images. The identification of systems of differential equations from noisy data plays an important role in this work. He is also developing nonparametric methods for the optimal scoring of examinations and psychological scales. He has been President of the Psychometric Society and the Statistical Society of Canada. He received the Gold Medal of the Statistical Society of Canada in 1998 and the Award for Technical or Scientific Contributions to the Field of Educational Measurement of the U. S. National Council on Measurement in Education in 2003. He is an Honorary Member of the Statistical Society of Canada and a Fellow of the American Statistical Association.
Watching children grow taught me all I know
We see more and more data where a single observation is a set of measurements that can be considered as defining a smooth function underlying the data. Sets of measurements of the heights of children, whether over all of childhood and adolescence or over the first days of life, can be viewed as defining samples of growth curves. What makes functional observations like these unique among statistical data is the possibility of estimating derivatives, such as height velocity and acceleration. The relationships among these derivative curves have revealed some astonishing structure and provided clues to growth dynamics. Moreover, these data highlight the fact that time itself is an elastic medium, with each child’s physiological or growth time having interesting nonlinear relationships to the clock times at which the data are recorded. The growth of children will be used in this talk to introduce a wide range of applications of functional data analysis.
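The basic idea of estimating a velocity curve from noisy height measurements can be sketched in a few lines. This is a deliberately simple illustration on synthetic data (polynomial smoothing rather than the spline-based functional data analysis methods the speaker develops):

```python
# Minimal sketch: smooth noisy "height" measurements and differentiate
# the fitted curve to estimate growth velocity (cm/year).
import numpy as np

rng = np.random.default_rng(0)
age = np.linspace(1.0, 18.0, 60)                      # years
# Toy growth curve: rapid early growth that levels off, plus noise (cm).
height = 75 + 90 * (1 - np.exp(-0.18 * age)) + rng.normal(0, 0.5, age.size)

coefs = np.polyfit(age, height, deg=5)                # smooth the sample
velocity = np.polyval(np.polyder(coefs), age)         # first derivative

print(round(velocity[0], 1), round(velocity[-1], 1))  # early vs late velocity
```

Even this crude smoother recovers the qualitative dynamics (high velocity early, deceleration later); estimating acceleration, and registering each child's curve to its own "growth time", is where functional data analysis proper takes over.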