Thursday, July 6 • 11:36am - 11:54am
**BradleyTerryScalable**: Ranking items scalably with the Bradley-Terry model

Keywords: Citation data, Directed network, Paired comparisons, Quasi-symmetry, Sparse matrices
Webpage: https://github.com/EllaKaye/BradleyTerryScalable
Motivated by the analysis of large-scale citation networks, we implement the familiar Bradley-Terry model (Zermelo 1929; Bradley and Terry 1952) in such a way that it can be applied, with relatively modest memory and execution-time requirements, to paired-comparison data from networks with large numbers of nodes. This provides a statistically principled method of ranking a large number of objects based only on paired comparisons.
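
To make the model concrete: the unstructured Bradley-Terry model assigns each item i a positive strength π_i and sets P(i beats j) = π_i / (π_i + π_j). The following is a minimal Python sketch of maximum likelihood fitting by the classical Zermelo-type fixed-point iteration. It is an illustration only, not the package's implementation; the function name `fit_bradley_terry` and the `wins` dictionary format are hypothetical. It assumes that every item appears in at least one comparison and that the comparison network is strongly connected, so that a finite maximum likelihood estimate exists.

```python
def fit_bradley_terry(wins, n_items, max_iter=1000, tol=1e-10):
    """Maximum likelihood strengths for the Bradley-Terry model.

    wins: dict mapping (i, j) -> number of times item i beat item j
          (hypothetical input format, chosen for this sketch).
    Returns a list of strengths normalised to sum to 1.
    """
    total_wins = [0.0] * n_items
    pair_counts = {}
    for (i, j), w in wins.items():
        total_wins[i] += w
        key = (min(i, j), max(i, j))
        pair_counts[key] = pair_counts.get(key, 0.0) + w

    # Sparse adjacency list: only pairs that were actually compared are
    # stored, so one sweep costs O(number of compared pairs).
    neighbours = [[] for _ in range(n_items)]
    for (i, j), n in pair_counts.items():
        neighbours[i].append((j, n))
        neighbours[j].append((i, n))

    pi = [1.0 / n_items] * n_items
    for _ in range(max_iter):
        # Fixed-point update: pi_i <- W_i / sum_j n_ij / (pi_i + pi_j)
        new_pi = [
            total_wins[i] / sum(n / (pi[i] + pi[j]) for j, n in neighbours[i])
            for i in range(n_items)
        ]
        scale = sum(new_pi)
        new_pi = [p / scale for p in new_pi]
        delta = max(abs(p - q) for p, q in zip(new_pi, pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


if __name__ == "__main__":
    # Toy data: item 0 tends to beat item 1, which tends to beat item 2.
    toy = {(0, 1): 3, (1, 0): 1, (1, 2): 3, (2, 1): 1, (0, 2): 3, (2, 0): 1}
    strengths = fit_bradley_terry(toy, n_items=3)
    ranking = sorted(range(3), key=lambda i: -strengths[i])
    print(ranking, strengths)
```

Sorting the items by fitted strength gives the ranking. Storing only the observed pairs is what keeps memory use proportional to the number of comparisons rather than to the square of the number of items.
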
The BradleyTerryScalable package complements the existing CRAN package BradleyTerry2 (Firth and Turner 2012) by permitting a much larger number of objects to be compared. In contrast to BradleyTerry2, the new BradleyTerryScalable package implements only the simplest, ‘unstructured’ version of the Bradley-Terry model. The new package leverages functionality in the additional R packages igraph (Csardi and Nepusz 2006), Matrix (Bates and Maechler 2017) and Rcpp (Eddelbuettel 2013) to provide flexibility in model specification (whole-network versus disconnected cliques) as well as memory efficiency and speed. The Bayesian approach of Caron and Doucet (2012) is provided as an optional alternative to maximum likelihood, in order to allow whole-network ranking even when the network of paired comparisons is not fully connected.
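
The role of the Bayesian alternative can be sketched as follows. When the comparison network is not fully connected, or an item never wins, the maximum likelihood estimate is undefined or sits on the boundary. With independent Gamma(a, b) priors on the strengths, the EM update for the posterior mode, as we understand it from Caron and Doucet (2012), is λ_i ← (a − 1 + W_i) / (b + Σ_j n_ij / (λ_i + λ_j)), which stays finite and positive for a > 1. The Python sketch below is illustrative only: the function name `fit_bradley_terry_map`, the input format and the default values of `a` and `b` are assumptions for this example, not the package's interface.

```python
def fit_bradley_terry_map(wins, n_items, a=2.0, b=1.0, max_iter=1000, tol=1e-10):
    """Posterior-mode (MAP) strengths under independent Gamma(a, b) priors.

    wins: dict mapping (i, j) -> number of times item i beat item j
          (hypothetical input format, chosen for this sketch).
    Requires a > 1 so that every estimate is strictly positive.
    """
    total_wins = [0.0] * n_items
    pair_counts = {}
    for (i, j), w in wins.items():
        total_wins[i] += w
        key = (min(i, j), max(i, j))
        pair_counts[key] = pair_counts.get(key, 0.0) + w

    neighbours = [[] for _ in range(n_items)]
    for (i, j), n in pair_counts.items():
        neighbours[i].append((j, n))
        neighbours[j].append((i, n))

    lam = [1.0] * n_items
    for _ in range(max_iter):
        # EM update; the prior terms (a - 1) and b keep the ratio finite
        # even for items with no wins or no comparisons at all.
        new_lam = [
            (a - 1.0 + total_wins[i])
            / (b + sum(n / (lam[i] + lam[j]) for j, n in neighbours[i]))
            for i in range(n_items)
        ]
        delta = max(abs(p - q) for p, q in zip(new_lam, lam))
        lam = new_lam
        if delta < tol:
            break
    return lam


if __name__ == "__main__":
    # Two disconnected components {0, 1} and {2, 3}, plus an isolated item 4.
    # Item 1 never wins, so no finite maximum likelihood estimate exists.
    toy = {(0, 1): 2, (2, 3): 1, (3, 2): 1}
    print(fit_bradley_terry_map(toy, n_items=5))
```

Because the prior fixes the overall scale, no normalisation step is needed, and all items can be placed on one common scale even when they belong to different components. Setting a = 1 and b = 0 recovers the maximum likelihood iteration, up to normalisation.
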
The BradleyTerryScalable package can readily handle data from directed networks with many thousands of nodes. The use of the Bradley-Terry model to produce a ranking from citation data was originally advocated in Stigler (1994), and was studied in detail more recently in Varin, Cattelan, and Firth (2016); here we will illustrate its use with a large-scale network of inter-company patent citations.
Thursday July 6, 2017 11:36am - 11:54am
4.01 Wild Gallery

