Adrian Quintero, ICFES – Colombian Institute for Educational Evaluation
Selecting the number of factors in Bayesian factor analysis
When implementing factor analysis, selecting the number of factors is challenging in both frequentist and Bayesian approaches. In the frequentist setting, the validity of the likelihood ratio test (LRT) depends strongly on the assumption that the factor loadings matrix is of full rank. This assumption fails when the fitted model has more latent components than the true (unknown) number of underlying factors. The regularity conditions required for the LRT are then violated, and the method retains too many factors in practice. Information criteria such as AIC and BIC may also be affected by the failure of these regularity conditions. Conventional Bayesian methods, on the other hand, present two serious drawbacks. First, the procedures are computationally demanding to implement; second, the ordering of the outcomes influences the results, since a lower triangular structure is generally assumed for the factor loadings matrix. We therefore propose a Bayesian method that does not impose the lower triangular structure and so avoids ordering dependence. Our approach considers a relatively large number of factors and includes auxiliary multiplicative parameters that can render null the unnecessary columns of the factor loadings matrix. The underlying dimensionality is then inferred from the number of non-null columns. We show that the approach is simple to implement via an efficient Gibbs algorithm. The advantages of the method in selecting the correct dimensionality are illustrated via simulations and with standardized tests from ICFES, the Colombian Institute for Educational Evaluation.
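The final inference step described above, counting the non-null columns of the loadings matrix, can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the function names, the threshold `tol`, the use of the posterior mode, and the toy "draws" (random numbers standing in for Gibbs output) are all assumptions made for illustration.

```python
import numpy as np

def count_active_columns(loadings, tol=1e-2):
    """Count columns whose largest absolute loading exceeds tol.

    tol is a hypothetical cutoff for calling a column "null";
    the abstract does not specify how nullity is decided.
    """
    return int(np.sum(np.max(np.abs(loadings), axis=0) > tol))

def estimate_num_factors(draws, tol=1e-2):
    """Most frequent active-column count over draws of shape (S, p, K)."""
    counts = np.array([count_active_columns(d, tol) for d in draws])
    return int(np.bincount(counts).argmax())

# Toy stand-in for posterior draws (assumption, not real sampler output):
# K = 6 candidate columns, of which only the first 3 carry signal.
rng = np.random.default_rng(0)
S, p, K, k_true = 200, 10, 6, 3
base = rng.normal(size=(S, p, K))

# Multiplicative parameters: 1 keeps a column, a tiny value nulls it.
multipliers = np.concatenate([np.ones(k_true), np.full(K - k_true, 1e-6)])
draws = base * multipliers  # broadcasts over the column axis

k_hat = estimate_num_factors(draws)
print(k_hat)
```

Because columns are counted rather than indexed, the estimate does not depend on which columns are nulled or on how the outcomes (rows) are ordered.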
About the speaker
Adrian Quintero works as a researcher in the Department of Statistics at ICFES, the Colombian Institute for Educational Evaluation. He obtained his PhD in Biomedical Sciences at KU Leuven, where he developed extensions of Bayesian hierarchical models with applications in medical research. His research interests include model selection techniques, factor analysis, multilevel models and Bayesian methods in general. Currently, he focusses on computer adaptive testing (CAT), assessing dimensionality in factor analysis, and verifying assumptions in standardized tests using three-parameter logistic (3PL) models.