Informative prior distributions for a binomial model to predict professional tennis results

  • Pierre Colin
  • Aurélien Bechler


Tennis, like many other sports, appears simple in its type of result (victory of one of the two players) but is complex in the factors that lead to this binary outcome. The continual improvement of data collection makes increasingly accurate information about professional tennis matches available. We study the predictive properties of a binomial model representing the victory of one player over the other. The Bayesian framework allows an informative prior distribution on the probability of winning (a Beta distribution) to be updated with the collected information. After calibrating the model on the years 2011-2012, we test three methodologies for choosing the prior on the 2013 results of the ATP tour. The first two are based on latent variable models (Elo and Bradley-Terry). The third is a point-by-point game simulation method based on the MatchFact statistics of the ATP. Each method consists of two steps: specifying the mean of the prior distribution from the gathered data, and then its variance according to predictive characteristics. The second part of this article discusses possible uses of these methods: predicting match results, simulating whole tournaments, or proposing a new ranking system for professional tennis players.
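
To make the two-step prior specification concrete, here is a minimal sketch of how a Beta prior could be built from a chosen mean and variance and then updated with observed match results. It is an illustration under stated assumptions, not the authors' calibration procedure: the function names are hypothetical, the Elo formula is the standard logistic form with a 400-point scale (the article may parametrize it differently), and the numeric values are made up for the example.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo win probability of A over B (assumed form, not taken from the article)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def beta_from_mean_var(mean, var):
    """Return (alpha, beta) of the Beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_var = mean * (1.0 - mean)
    if not 0.0 < var < max_var:
        raise ValueError("variance must lie strictly between 0 and mean * (1 - mean)")
    total = max_var / var - 1.0  # alpha + beta, the prior's pseudo-sample size
    return mean * total, (1.0 - mean) * total


def update(alpha, beta, wins, losses):
    """Conjugate Beta-binomial update: add observed wins and losses to the prior counts."""
    return alpha + wins, beta + losses


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Step 1: prior mean from a latent-variable model (hypothetical ratings).
    prior_mean = elo_expected_score(1900.0, 1800.0)
    # Step 2: prior variance, here an arbitrary illustrative value.
    a, b = beta_from_mean_var(prior_mean, 0.01)
    # Update with hypothetical head-to-head results: 3 wins, 1 loss.
    a_post, b_post = update(a, b, wins=3, losses=1)
    print(f"prior mean={prior_mean:.3f}, posterior mean={posterior_mean(a_post, b_post):.3f}")
```

A smaller prior variance corresponds to a larger value of alpha + beta, so the prior behaves like more previously observed matches and the new results shift the posterior less.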