Kendall rank correlation coefficient

Compute the Kendall rank correlation coefficient (Kendall's tau) between two objects. Kendall's tau is a statistic that measures the ordinal association between two measured quantities. A tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient. The 'kendallTau' function applies the "kendall" method from 'stats::cor' after some preprocessing of the data: floating-point numbers are converted into ranks (the highest value ranked first and negative values last), and zeros can optionally be removed from incomplete rankings.
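As a rough illustration of the preprocessing described above, the following sketch uses base R only. The helper name `kendall_sketch` is hypothetical, and its handling of zeros, missing values and ties is an assumption made for illustration; the code inside gosset may differ.

```r
# Hypothetical helper, not part of gosset: a minimal sketch of
# "drop zeros, convert to ranks, then call stats::cor".
kendall_sketch <- function(x, y, null.rm = TRUE) {
  # keep only entries observed in both vectors
  keep <- !is.na(x) & !is.na(y)
  if (null.rm) {
    # assumption: a zero means "not ranked", so the entry is dropped
    keep <- keep & x != 0 & y != 0
  }
  x <- x[keep]
  y <- y[keep]
  # convert values to ranks, the highest value receiving rank 1
  rx <- rank(-x, ties.method = "min")
  ry <- rank(-y, ties.method = "min")
  stats::cor(rx, ry, method = "kendall")
}

scores_a <- c(0.9, 0.1, -0.4, 0.5, 0)
scores_b <- c(0.8, 0.3, -0.2, 0.1, 0.7)
tau <- kendall_sketch(scores_a, scores_b)
tau
```

Because Kendall's tau depends only on the ordering of the values, ranking both vectors in the same direction does not change the coefficient; the ranking step matters mainly for making the inputs comparable with ranking objects.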

The 'kendallTau_permute' function performs a pairwise permutation test to assess statistical differences in Kendall's tau correlation between two or more groups.
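One plausible reading of such a pairwise permutation test can be sketched in base R as follows. The function `tau_diff_permutation` and its arguments `groups` and `n_perm` are hypothetical names chosen for this illustration, and the choice of test statistic (absolute difference in tau, with group labels shuffled to build the null distribution) is an assumption rather than a description of the gosset internals.

```r
# Hypothetical helper, not the gosset implementation.
tau_diff_permutation <- function(x, y, groups, n_perm = 500) {
  # Kendall's tau computed separately within each group
  group_tau <- function(g) {
    idx <- base::split(seq_along(x), g)
    vapply(idx, function(i) stats::cor(x[i], y[i], method = "kendall"),
           numeric(1))
  }
  # all unordered pairs of group labels
  pairs <- utils::combn(sort(unique(as.character(groups))), 2)
  abs_diff <- function(g) {
    tau <- group_tau(g)
    abs(tau[pairs[1, ]] - tau[pairs[2, ]])
  }
  observed <- unname(abs_diff(groups))
  # shuffle the group labels to build a null distribution per pair
  null <- replicate(n_perm, abs_diff(sample(groups)))
  null <- matrix(null, nrow = length(observed))
  data.frame(
    group1 = pairs[1, ],
    group2 = pairs[2, ],
    observed_diff = observed,
    # share of permutations with a difference at least as large as observed
    p_values = rowMeans(null >= observed)
  )
}

set.seed(1)
x <- rnorm(60)
y <- x + rnorm(60)
groups <- rep(c("A", "B", "C"), length.out = 60)
res <- tau_diff_permutation(x, y, groups, n_perm = 200)
res
```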

kendallTau(x, y, null.rm = TRUE, average = TRUE, na.omit = FALSE, ...)

# Default S3 method
kendallTau(x, y, null.rm = TRUE, ...)

# S3 method for class 'matrix'
kendallTau(x, y, null.rm = TRUE, average = TRUE, na.omit = FALSE, ...)

# S3 method for class 'rankings'
kendallTau(x, y, ...)

# S3 method for class 'grouped_rankings'
kendallTau(x, y, ...)

# S3 method for class 'paircomp'
kendallTau(x, y, ...)

kendallTau_bootstrap(x, y, nboot = 100, seed = NULL, ...)

kendallTau_permute(x, y, split, n.permutations = 500)

Arguments

x: a numeric vector, matrix or data frame
y: a vector, matrix or data frame with compatible dimensions to x
null.rm: logical, if TRUE removes zeros from x and y
average: logical, if FALSE returns Kendall's tau and N-effective for each entry
na.omit: logical, if TRUE ignores entries where Kendall's tau is NA when computing the average
...: further arguments affecting the Kendall tau produced. See details
nboot: integer, the size of the bootstrap sample
seed: integer, the seed for random number generation. If NULL (the default), gosset will set the seed randomly
split: a vector indicating the splitting rule for the test
n.permutations: an integer, the number of permutations to perform
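To give a feel for how 'nboot' and 'seed' could interact, here is a minimal bootstrap sketch in base R. The name `boot_tau_sketch`, the interpretation of 'nboot' as the number of bootstrap replicates, and the percentile interval it returns are all assumptions for illustration; they are not taken from the 'kendallTau_bootstrap' source.

```r
# Hypothetical helper, not the gosset implementation.
boot_tau_sketch <- function(x, y, nboot = 100, seed = NULL) {
  # only fix the random number generator when a seed is supplied
  if (!is.null(seed)) set.seed(seed)
  n <- length(x)
  taus <- replicate(nboot, {
    # resample paired observations with replacement
    i <- sample.int(n, n, replace = TRUE)
    stats::cor(x[i], y[i], method = "kendall")
  })
  c(estimate = mean(taus, na.rm = TRUE),
    lower = unname(stats::quantile(taus, 0.025, na.rm = TRUE)),
    upper = unname(stats::quantile(taus, 0.975, na.rm = TRUE)))
}

set.seed(10)
a <- rnorm(30)
b <- a + rnorm(30)
boot1 <- boot_tau_sketch(a, b, nboot = 100, seed = 123)
boot2 <- boot_tau_sketch(a, b, nboot = 100, seed = 123)
boot1
```

Supplying the same seed twice reproduces the same resamples, which is the usual reason for exposing a 'seed' argument in a bootstrap routine.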

Value

The Kendall correlation coefficient and the effective N (N-effective), which is the equivalent N needed if all items were compared to all items. The effective N is used for significance testing.

For 'kendallTau_permute', a data.frame containing:

observed_diff: observed absolute differences in Kendall's tau for all group pairs.
p_values: p-values from the permutation test for all group pairs.

References

Kendall M. G. (1938). A new measure of rank correlation. Biometrika, 30(1–2), 81–93. doi:10.1093/biomet/30.1-2.81

Author

Kauê de Sousa and Jacob van Etten

Kauê de Sousa

Examples


# Vector-based example, same as stats::cor(x, y, method = "kendall")
# but also showing N-effective
x = c(1, 2, 3, 4, 5)

y = c(1, 1, 3, 2, NA)

w = c(1, 1, 3, 2, 5)

kendallTau(x, y)

kendallTau(x, w)

# Matrix and PlackettLuce ranking example

library("PlackettLuce")
 
R = matrix(c(1, 2, 4, 3,
             1, 4, 2, 3,
             1, 2, NA, 3,
             1, 2, 4, 3,
             1, 3, 4, 2,
             1, 4, 3, 2), nrow = 6, byrow = TRUE)
colnames(R) = LETTERS[1:4]

G = group(as.rankings(R), 1:6)

mod = pltree(G ~ 1, data = G)

preds = predict(mod)

kendallTau(R, preds)

# Also returns raw values (no average) 

kendallTau(R, preds, average = FALSE)

# Choose to ignore entries with NA
R2 = matrix(c(1, 2, 4, 3,
              1, 4, 2, 3,
              NA, NA, NA, NA,
              1, 2, 4, 3,
              1, 3, 4, 2,
              1, 4, 3, 2), nrow = 6, byrow = TRUE)

kendallTau(R, R2, average = FALSE)

kendallTau(R, R2, average = TRUE)

kendallTau(R, R2, average = TRUE, na.omit = TRUE)

if (FALSE) { # interactive()
set.seed(42)
x = rnorm(100)
y = rnorm(100)
split = rep(c("Group1", "Group2", "Group3"), length.out = 100)
kendallTau_permute(x, y, split)

data("breadwheat", package = "gosset")

x = rank_tricot(breadwheat, 
                items = paste0("variety_", letters[1:3]),
                input = c("yield_best", "yield_worst"),
                validate.rankings = TRUE)

y = rank_tricot(breadwheat, 
                items = paste0("variety_", letters[1:3]),
                input = c("overall_best", "overall_worst"),
                validate.rankings = TRUE)
                
kendallTau_permute(x, y,
                   split = rep(c("Group1", "Group2", "Group3"), length.out = nrow(breadwheat)),
                   n.permutations = 100)
                
}
