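Base R has a built-in function for the Wilcoxon test: wilcox.test. You can give it either two numeric vectors or a formula relating a numeric variable to a factor variable (with two levels).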
# vector input
setosa_SL <- iris$Sepal.Length[which(iris$Species == "setosa")]
versicolor_SL <- iris$Sepal.Length[which(iris$Species == "versicolor")]
wilcox.test(setosa_SL, versicolor_SL)
Wilcoxon rank sum test with continuity correction
data: setosa_SL and versicolor_SL
W = 168.5, p-value = 8.346e-14
alternative hypothesis: true location shift is not equal to 0
# formula input
wilcox.test(Sepal.Length ~ Species, data = iris[which(iris$Species != "virginica"),])
Wilcoxon rank sum test with continuity correction
data: Sepal.Length by Species
W = 168.5, p-value = 8.346e-14
alternative hypothesis: true location shift is not equal to 0
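If it helps to see where W comes from, here is a minimal sketch that rebuilds the statistic from pooled ranks. It assumes the usual rank-sum definition (sum of the first sample's ranks minus n(n + 1)/2); the names x, y, r, n_x and W_manual are made up for illustration.

```r
# Sketch: recompute the W statistic from the pooled ranks of both samples.
x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]

r <- rank(c(x, y))   # ties receive average ranks by default
n_x <- length(x)

# Sum of the ranks of the first sample, minus its smallest possible value
W_manual <- sum(r[seq_len(n_x)]) - n_x * (n_x + 1) / 2

W_manual
wilcox.test(x, y)$statistic   # should agree with W_manual
```

Both values should match the W printed in the output above.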
However, iris$Species has three levels. What if we want to run all three comparisons at once? The base stats package also has pairwise.wilcox.test.
pairwise.wilcox.test(iris$Sepal.Length, iris$Species)
Pairwise comparisons using Wilcoxon rank sum test with continuity correction
data: iris$Sepal.Length and iris$Species
           setosa  versicolor
versicolor 1.7e-13 -
virginica  < 2e-16 5.9e-07
P value adjustment method: holm
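To make the "holm" line less of a black box, here is a rough sketch of what the pairwise call appears to do: run wilcox.test on each pair of levels, then pass the raw p-values through p.adjust. The names pairs, raw_p and adj_p are just for illustration.

```r
# Sketch: pairwise Wilcoxon tests by hand, then Holm adjustment.
pairs <- combn(levels(iris$Species), 2, simplify = FALSE)

raw_p <- sapply(pairs, function(pr) {
  x <- iris$Sepal.Length[iris$Species == pr[1]]
  y <- iris$Sepal.Length[iris$Species == pr[2]]
  wilcox.test(x, y)$p.value
})
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Holm is the default adjustment in pairwise.wilcox.test
adj_p <- p.adjust(raw_p, method = "holm")
adj_p
```

The adjusted values should line up with the table printed by pairwise.wilcox.test. A different correction can be requested there through the p.adjust.method argument.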
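Now, I suspect you want to plot this. You will need pairwise_wilcox_test and add_xy_position from the rstatix package and stat_pvalue_manual from the ggpubr package. The pairwise_wilcox_test function is an improvement over base R's pairwise.wilcox.test because it returns a tibble rather than a list of class pairwise.htest.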
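For comparison, here is a sketch of what it takes to get the base R result into one row per comparison, using only base functions. The names pw, pmat, idx and long are hypothetical, and the column names simply mirror the tibble layout.

```r
# Sketch: flatten the p-value matrix from pairwise.wilcox.test into long format.
pw <- pairwise.wilcox.test(iris$Sepal.Length, iris$Species)
class(pw)          # a list-like object, not a data frame

pmat <- pw$p.value # lower-triangular matrix of adjusted p-values, NA elsewhere

# Row/column positions of the cells that actually hold a p-value
idx <- which(!is.na(pmat), arr.ind = TRUE)

long <- data.frame(
  group1 = colnames(pmat)[idx[, "col"]],
  group2 = rownames(pmat)[idx[, "row"]],
  p.adj  = pmat[idx]
)
long
```

This is the kind of reshaping that a tibble-returning function saves you from doing by hand.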
library(rstatix)
library(ggpubr)
iris %>% pairwise_wilcox_test(Sepal.Length ~ Species)
# A tibble: 3 x 9
  .y.          group1     group2        n1    n2 statistic        p    p.adj p.adj.signif
* <chr>        <chr>      <chr>      <int> <int>     <dbl>    <dbl>    <dbl> <chr>
1 Sepal.Length setosa     versicolor    50    50      168. 8.35e-14 1.67e-13 ****
2 Sepal.Length setosa     virginica     50    50      38.5 6.40e-17 1.92e-16 ****
3 Sepal.Length versicolor virginica     50    50      526  5.87e- 7 5.87e- 7 ****
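The function add_xy_position adds x and y coordinate information to make this data more suitable for plotting, and stat_pvalue_manual adds a layer containing the p-value information.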
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  stat_pvalue_manual(iris %>%
                       pairwise_wilcox_test(Sepal.Length ~ Species) %>%
                       add_xy_position())