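Although I know this would probably be easier with ggplot, I am always keen to see whether I can achieve a similar result with base R plotting tools. I will use the iris data in this example.
We first need to determine which columns of our data.frame are numeric.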
# logical vector that is TRUE if column p is numeric
# (is.numeric() also accepts integer columns, unlike inherits(x, 'numeric'))
numeric_cols <- logical(ncol(iris))
for (p in seq_len(ncol(iris))) {
  numeric_cols[p] <- is.numeric(iris[, p])
}
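A loop-free way to build the same kind of vector, shown only as a sketch. It assumes is.numeric() is the test you want (it is TRUE for both double and integer columns); the name numeric_cols_alt is made up for this illustration.

```r
# one TRUE/FALSE per column of iris; the result is named by column
numeric_cols_alt <- vapply(iris, is.numeric, logical(1L))
numeric_cols_alt
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
#         TRUE         TRUE         TRUE         TRUE        FALSE
```

A named logical vector can be indexed by position just like the unnamed one built by the loop.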
Then we can choose some arbitrary colors for the densities. Here I chose three colors, one for each level of iris$Species.
my_cols <- c('blue4', 'darkorange', '#00b0a4')
## semi-transparent versions of the same colors, used to fill the densities
adj_col <- \(x) adjustcolor(x, alpha.f = 0.2)
my_transp_cols <- adj_col(my_cols)
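To see what the transparency step does, here is a small check, assuming the same three colors. adjustcolor() is vectorised over its first argument, and alpha.f = 0.2 scales the fully opaque alpha channel (255) down to 51. The name transp is only for illustration.

```r
cols <- c('blue4', 'darkorange', '#00b0a4')
# convert all three colors in one call
transp <- adjustcolor(cols, alpha.f = 0.2)
# each result is a '#RRGGBBAA' string; inspect the alpha channel (0 to 255)
col2rgb(transp, alpha = TRUE)['alpha', ]
```

A smaller alpha.f gives a fainter fill, which helps when many densities overlap in one panel.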
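Now we need to plot the densities. The function given below (i.e., plot_densities) can show either marginal densities or densities conditional on a factor variable. If you want the densities conditional on a factor variable, just set include_factor to TRUE and pass the name of the factor variable of interest to the factor argument.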
plot_densities <- \(DF, columns, include_factor = FALSE, factor) {
  ## drop rows with missing values
  DF <- DF[complete.cases(DF), ]
  ## positions of the numeric columns (the TRUEs in `columns`)
  num_idx <- which(columns)
  ## setting up plotting device: one panel per numeric column
  layout(matrix(seq_along(num_idx), nrow = 1L))
  par(mar = c(5, 4, 4, 8) + 0.1, xpd = TRUE)
  ## format a statistic with one decimal place
  r <- \(x) format(round(x, 1L), nsmall = 1L)
  if (include_factor) {
    ## if densities are to be shown per factor level
    lvls <- levels(DF[[factor]])
    no_of_levels <- length(lvls)
    for (i in num_idx) {
      ## preallocation
      max_y <- max_x <- min_x <- rep(NA_real_, no_of_levels)
      means <- SDs <- rep(NA_character_, no_of_levels)
      ## axis limits, mean and SD of column i per factor level k
      for (k in seq_len(no_of_levels)) {
        sub_k <- DF[DF[[factor]] %in% lvls[k], i]
        d <- density(sub_k)
        min_x[k] <- min(d$x)
        max_x[k] <- max(d$x)
        max_y[k] <- max(d$y)
        means[k] <- r(mean(sub_k))
        SDs[k] <- r(sd(sub_k))
      }
      x_lim <- c(min(min_x), max(max_x))
      y_lim <- c(0, max(max_y))
      ## draw the densities of column i for all levels in the same panel
      for (j in seq_len(no_of_levels)) {
        sub <- DF[DF[[factor]] %in% lvls[j], i]
        if (j == 1L) {
          plot(density(sub), main = '',
               las = 1, col = my_cols[j], xlab = '',
               xlim = x_lim, ylim = y_lim, bty = 'n')
          title(main = names(DF)[i], xpd = TRUE, adj = 1)
        } else {
          lines(density(sub), col = my_cols[j])
        }
        polygon(density(sub), density = -1L, col = my_transp_cols[j])
      }
      ## add legend to the plot
      legend('topright', paste0(lvls, ': ', means, ' (', SDs, ')'),
             fill = my_transp_cols[seq_len(no_of_levels)], bty = 'n',
             inset = c(-0.5, 0.1))
    }
  } else {
    ## if densities are NOT to be shown per factor level
    for (i in num_idx) {
      plot(density(DF[, i]),
           las = 1, main = names(DF)[i], col = my_cols[1L], xlab = '',
           bty = 'n')
      polygon(density(DF[, i]), density = -1L, col = my_transp_cols[1L])
      ## add legend with mean (SD) of column i
      legend('topright',
             paste0(names(DF)[i], ': ', r(mean(DF[, i])), ' (', r(sd(DF[, i])), ')'),
             fill = my_transp_cols[1L], bty = 'n',
             inset = c(-0.5, 0.1))
    }
  }
}
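The key step in the conditional case is computing axis limits across all factor levels before drawing, so that no curve is cut off. Here is a compact sketch of that idea for a single column. It is a simplified illustration rather than a replacement for plot_densities, and the names dens, x_lim, y_lim and cols are chosen just for this example.

```r
# one density estimate per level of Species for a single column
dens <- lapply(split(iris$Sepal.Length, iris$Species), density)
# shared limits that cover every curve
x_lim <- range(unlist(lapply(dens, function(d) d$x)))
y_lim <- c(0, max(unlist(lapply(dens, function(d) d$y))))
cols <- c('blue4', 'darkorange', '#00b0a4')
# the first level opens the panel, the remaining levels are added on top
plot(dens[[1L]], main = 'Sepal.Length', xlab = '', las = 1, bty = 'n',
     xlim = x_lim, ylim = y_lim, col = cols[1L])
for (j in seq_along(dens)[-1L]) {
  lines(dens[[j]], col = cols[j])
}
```

Without the shared xlim and ylim, the panel would be sized for the first level only and the other curves could run outside it.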
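We can save the output as a .pdf file. If you change the layout of the plotting device, you will also have to play around a bit with width and height to make the result fit your particular case.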
# marginal densities
pdf(file = 'my_directory/my_plot.pdf', # change my_directory
    width = 13, height = 4)
plot_densities(DF = iris, columns = numeric_cols)
dev.off()
# conditional densities
pdf(file = 'my_directory/my_plot2.pdf', # change my_directory
    width = 13, height = 4)
plot_densities(DF = iris, columns = numeric_cols,
               include_factor = TRUE, factor = 'Species')
dev.off()
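If your data has a different number of numeric columns, one way to pick a starting width is to scale it with the number of panels. This sketch assumes the 13 / 4 = 3.25 inches per panel used in the calls above is a reasonable default. The temporary output path and the names n_panels and out_file are hypothetical.

```r
# number of panels = number of numeric columns (4 for iris)
num_idx <- which(vapply(iris, is.numeric, logical(1L)))
n_panels <- length(num_idx)
# assumption: 3.25 inches of width per panel, as in width = 13 for 4 panels
out_file <- tempfile(fileext = '.pdf')  # hypothetical path, change as needed
pdf(file = out_file, width = 3.25 * n_panels, height = 4)
layout(matrix(seq_len(n_panels), nrow = 1L))
for (i in num_idx) {
  plot(density(iris[, i]), main = names(iris)[i], xlab = '',
       las = 1, bty = 'n')
}
dev.off()
```

You will probably still need to adjust the margins and the legend inset by hand once the panel count changes.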
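I usually create .pdf files of my plots and then convert them to .png files with this online conversion tool: https://pdf2jpg.net/.
I show the mean (SD) in the legend, but you can display any statistics you like. Just replace the calls to mean() and sd() inside plot_densities with the statistical functions you are interested in.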
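For instance, legend labels showing the median (IQR) per species could be built as in the sketch below. The choice of median and IQR and the variable names meds, iqrs and labels are only for illustration.

```r
# same one-decimal formatting helper as in plot_densities
r <- function(x) format(round(x, 1L), nsmall = 1L)
# median and interquartile range of one column per factor level
meds <- tapply(iris$Sepal.Length, iris$Species, median)
iqrs <- tapply(iris$Sepal.Length, iris$Species, IQR)
# labels in the same 'level: statistic (spread)' form used in the legend
labels <- paste0(names(meds), ': ', r(meds), ' (', r(iqrs), ')')
labels
```

The resulting character vector can be passed to legend() in place of the mean (SD) labels.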
Output
Marginal densities
Conditional densities
Note: use function(x) instead of \(x) if your R version does not support the \(x) shorthand, which was introduced in R 4.1.0.
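As a small illustration, the two helpers used in this answer can be written with the long form like this. The behaviour is the same; only the syntax differs.

```r
# long-form equivalents of the \(x) helpers
adj_col <- function(x) adjustcolor(x, alpha.f = 0.2)
r <- function(x) format(round(x, 1L), nsmall = 1L)
r(5)
# "5.0"
```

The same substitution applies to the definition of plot_densities itself.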