Bootstrap Validation of the Estimated Parameters in Mixture Models Used for Clustering
When a mixture model is used for clustering, uncertainty arises both from the choice of an optimal model (including the number of clusters) and
from the estimation of the parameters. We discuss here the computation of confidence intervals using different bootstrap approaches, which either mix or separate the two kinds
of uncertainty. In particular, we suggest two new approaches that rely to some degree on the model specification considered optimal by the researcher, and that specifically address
the uncertainty related to parameter estimation. These methods are especially useful for poorly separated data or complex models, where the selected solution is
difficult to recreate in each bootstrap sample, and they have the advantage of reducing the well-known label-switching issue. Two simulation experiments based on the Hidden
Mixture Transition Distribution model for the clustering of longitudinal data illustrate our proposed bootstrap approaches.