Method of forward variable selection based on deviance for Bradley-Terry
models using pairwise ranking data. The selection procedure consists of two steps:
first, permuting the variables from the original predictors over n.iterations,
then performing a forward selection to retain the predictors with the highest
contribution to the model; see Details.
btpermute(
contests = NULL,
predictors = NULL,
n.iterations = 15,
seed = NULL,
...
)
contests: a data frame of pairwise binary contests with the variables
'id', 'player1', 'player2', 'win1', 'win2', in that order. Each id should
correspond to the index of a row in predictors
predictors: a data frame with player-specific variables whose row indices
match the ids in contests. An id column is not required, only the
predictor variables; the row index serves as the id
n.iterations: integer, the number of iterations to compute
seed: integer, the seed for random number generation. If NULL (the default), gosset sets the seed randomly
...: additional arguments passed to BradleyTerry2 methods
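The layout of the two inputs can be illustrated with a small, made-up data set. This is only a sketch: the predictor names (age, farm_size), the player labels and all values are hypothetical, and the players are stored as factors with shared levels on the assumption that the data are eventually passed to BradleyTerry2's BTm().

```r
# Hypothetical predictors: one row per id, no id column needed.
predictors <- data.frame(
  age       = c(34, 51, 27),
  farm_size = c(1.2, 0.8, 2.5)
)

# Hypothetical contests: columns in the documented order.
players <- c("A", "B", "C")
contests <- data.frame(
  id      = c(1, 1, 2, 3),
  player1 = factor(c("A", "A", "B", "A"), levels = players),
  player2 = factor(c("B", "C", "C", "C"), levels = players),
  win1    = c(1, 0, 1, 1),
  win2    = c(0, 1, 0, 0)
)

# Every id in contests should point to an existing row of predictors.
stopifnot(all(contests$id %in% seq_len(nrow(predictors))))
```

A check like the last line is a cheap way to catch mismatched ids before fitting.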
an object of class gosset_btpermute with the final BTm() model, the
selected variables, the seeds (random numbers) used for the permutations, and the deviances
The selection procedure consists of two steps. In the first step, btpermute
adds to the set of original (candidate) predictors an additional set of
'fake', permuted variables. This set of permuted predictors is created
by assigning to each ranking the variables from another, randomly selected
ranking. The permuted variables are not expected to have any predictive
power for pairwise rankings. In the second step, btpermute adds
predictors to the Bradley-Terry model in a forward selection procedure.
Each predictor (real and permuted) is added to the null model
individually, and btpermute records which variable reduces model
deviance most strongly.
The two-step process is replicated n times, as set by the argument
n.iterations. At each iteration, a new random permutation is generated and
all variables are tested. Replicability can be controlled using the
argument seed. Across the n iterations, the function identifies the
predictor that appeared most often as the most deviance-reducing one.
When this is a real variable, it is permanently added to the model and
the forward selection procedure moves on, again creating new permutations,
adding real and fake variables individually, and examining model deviance.
Variable selection stops when a permuted variable is most frequently the
most deviance-reducing predictor across the n.iterations. In other words,
variable selection continues as long as any real variable has stronger
explanatory power for pairwise rankings than the random variables.
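One round of the procedure described above can be sketched in base R. This is not gosset's implementation: toy_deviance() and select_round() are hypothetical helpers, the data are simulated, and a plain binomial glm() with one covariate stands in for the BTm() fit so that the example runs without extra packages.

```r
# Stand-in scoring function (an assumption, not gosset's code): residual
# deviance of a binomial GLM with one covariate looked up by contest id.
toy_deviance <- function(contests, x) {
  d <- data.frame(win1 = contests$win1, win2 = contests$win2,
                  x = x[contests$id])
  deviance(glm(cbind(win1, win2) ~ x, family = binomial, data = d))
}

# One selection round: at each iteration, permute the rows of the
# predictors, score every real and permuted variable on its own, and
# record the variable with the lowest deviance.
select_round <- function(contests, predictors, n.iterations = 15, seed = 1) {
  set.seed(seed)
  winners <- character(n.iterations)
  for (i in seq_len(n.iterations)) {
    # each id receives the predictor values of another, random id
    perm <- predictors[sample(nrow(predictors)), , drop = FALSE]
    rownames(perm) <- NULL
    names(perm) <- paste0("perm_", names(predictors))
    candidates <- cbind(predictors, perm)
    dev <- vapply(candidates, function(x) toy_deviance(contests, x),
                  numeric(1))
    winners[i] <- names(dev)[which.min(dev)]
  }
  # most frequent winner across iterations (ties resolved arbitrarily)
  best <- names(which.max(table(winners)))
  list(best = best, keep = !startsWith(best, "perm_"), winners = winners)
}

# Simulated data: x1 drives the outcome, x2 is noise.
set.seed(42)
n_id <- 30
predictors <- data.frame(x1 = rnorm(n_id), x2 = rnorm(n_id))
contests <- data.frame(id = rep(seq_len(n_id), each = 3))
contests$win1 <- rbinom(nrow(contests), 1,
                        plogis(2 * predictors$x1[contests$id]))
contests$win2 <- 1 - contests$win1

res <- select_round(contests, predictors, n.iterations = 10, seed = 1)
res$best   # name of the most frequently winning variable
res$keep   # TRUE if that variable is a real (non-permuted) predictor
```

In the full procedure, a round that returns a real variable would add it to the model and start another round with the remaining candidates; a round won by a permuted variable would end the selection.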
Lysen, S. (2009) Permuted inclusion criterion: A variable selection technique. University of Pennsylvania
Other model selection functions:
crossvalidation()
if (FALSE) { # interactive()
require("BradleyTerry2")
data("kenyachoice", package = "gosset")
mod <- btpermute(contests = kenyachoice$contests,
                 predictors = kenyachoice$predictors,
                 n.iterations = 10,
                 seed = 1)
mod
}