Taxicab Correspondence Analysis of Ratings and Rankings

  • Vartan Choulakian


Let Y be an I×Q ratings data set, where Q is the number of items and I is the number of rated objects, or the number of individuals expressing their opinions on the Q items. This paper considers two codings of the data prior to applying correspondence analysis (CA) or taxicab correspondence analysis (TCA), an L1 variant of CA: the doubled data set YD of size I×2Q, and the data set Ynega of size I×(Q+1), in which an added column named nega represents the cumulative complementary columns. The interpretation of maps in CA of YD is based on the lever principle. We use the law of contradiction to interpret maps of CA and TCA of Ynega. We provide necessary and sufficient conditions under which the first factor score in TCA of Ynega or YD is an affine function of the sum score of the ratings. When these conditions hold for a data set, we suggest, following Cox, using the sum score of the ratings either to reduce the Q ratings to a single index or to summarize the underlying latent variable. This ordinal inference can be of two types: weak or strong. For a rankings data set, the proposed approach corresponds to the Borda count rule or the modified Borda count rule. Examples are provided.
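
To make the two codings concrete, the following is a minimal Python sketch of how YD and Ynega might be built from a small ratings table. It rests on assumptions not stated in the abstract: every item is rated on a common scale 0, ..., m, the complement of a rating y is m - y, and the nega column is the row sum of those complements. The function names and the toy data are hypothetical, and the paper's exact definitions may differ.

```python
def double_ratings(Y, m):
    """Doubled coding (assumed form): each row keeps its Q ratings and
    gains Q complementary columns m - y, giving an I x 2Q table."""
    return [list(row) + [m - y for y in row] for row in Y]


def nega_ratings(Y, m):
    """Nega coding (assumed form): each row keeps its Q ratings and gains
    one column, the sum of the complements m - y, giving I x (Q+1)."""
    return [list(row) + [sum(m - y for y in row)] for row in Y]


def sum_scores(Y):
    """Sum score of the ratings for each row."""
    return [sum(row) for row in Y]


# Hypothetical toy data: I = 3 rows, Q = 3 items, ratings on 0..4.
Y = [[4, 3, 0],
     [1, 2, 2],
     [0, 4, 4]]
m = 4

YD = double_ratings(Y, m)
Ynega = nega_ratings(Y, m)
scores = sum_scores(Y)
```

Under these assumptions every row of YD and of Ynega sums to the same constant Q·m, and the nega column equals Q·m minus the sum score, so it carries the same ordering information as the sum score, reversed.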