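I have a plot (made in R with ggplot2) that is the result of a singular value decomposition of a bunch of text data. Basically, I have a data set of ~100 words used in some reviews and ~10 categories of reviews, with 2D coordinates for each of them. I'm having trouble making the plot legible because of the sheer amount of text and how close together many of the important points are.
The way my data is structured now, I'm plotting 2 different geom_texts with different formatting, passing each one a separate data frame of coordinates. This has been easier because it's fine if the ~10 categories overlap the ~100 terms (which are of secondary importance), and I want quite different formatting for the two. That said, there's no particular reason they couldn't be put together in one data frame and one geom, if that helps someone find a solution.
What I'd like to do is use the ggrepel functionality so that the ~10 categories repel each other, and use the shadowtext functionality to make them stand out from the background of colorful words. Since those are different geoms, I don't know how to make that happen.
A minimal example with some fake data: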
library(ggplot2)
library(tibble)  # provides tibble(), used below to build the fake data
library(ggrepel)
library(shadowtext)
dictionary <- c("spicy", "Thanksgiving", "carborator", "mixed", "cocktail", "stubborn",
                "apple", "rancid", "table", "antiseptic", "sewing", "coffee", "tragic",
                "nonsense", "stufing", "words", "bottle", "distillery", "green")
words <- tibble(Dim1 = rnorm(100),
                Dim2 = rnorm(100),
                Term = sample(dictionary, 100, replace = TRUE),
                Color = as.factor(sample.int(10, 100, replace = TRUE)))
locations <- tibble(Dim1 = c(-1, -1, 0, -0.5, 0.25, 0.25, 0.3),
                    Dim2 = c(-1, -0.9, 0, 0, 0.25, 0.4, 0.1),
                    Term = c("Scotland", "Ireland", "America", "Taiwan", "Japan", "China", "New Zealand"))
# Base graph
p <- ggplot() +
  xlab("Factor 1") +
  ylab("Factor 2") +
  theme(legend.position = "none") +
  geom_text_repel(aes(x = Dim1, y = Dim2, label = Term, color = Color),
                  words,
                  fontface = "italic", size = 8)
# Cluttered and impossible to read:
p + geom_text(aes(x = Dim1, y = Dim2, label = Term),
              locations,
              fontface = "bold", size = 16, color = "#747474")
# I can make it repel:
p + geom_text_repel(aes(x = Dim1, y = Dim2, label = Term),
                    locations,
                    fontface = "bold", size = 16, color = "#747474")
# Or I can make the shadowtext:
p + geom_shadowtext(aes(x = Dim1, y = Dim2, label = Term),
                    locations,
                    fontface = "bold", size = 16, color = "#747474", bg.color = "white")
The results of the second plot, nicely repelling:
The results of the last plot, with these clean-looking white buffers around the category labels:
Is there a way to get the best of both worlds? I tried using geom_label_repel without borders, but I don't think it looks as clean as the shadowtext solution.
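One direction that might be worth trying, sketched here under assumptions rather than as a confirmed answer: recent ggrepel releases (0.9.0 and later, as far as I know) accept bg.color and bg.r in geom_text_repel, which draw a halo around repelled labels much like shadowtext does. The snippet below rebuilds only the category layer with a stand-in copy of the locations data; the halo radius of 0.15 is an arbitrary choice, and p_halo is just an illustrative name.

```r
library(ggplot2)
library(ggrepel)

# Stand-in copy of the category data from the example above
locations <- data.frame(
  Dim1 = c(-1, -1, 0, -0.5, 0.25, 0.25, 0.3),
  Dim2 = c(-1, -0.9, 0, 0, 0.25, 0.4, 0.1),
  Term = c("Scotland", "Ireland", "America", "Taiwan", "Japan", "China", "New Zealand")
)

# Assumption: ggrepel >= 0.9.0, where geom_text_repel understands
# bg.color (halo colour) and bg.r (halo radius)
p_halo <- ggplot() +
  xlab("Factor 1") +
  ylab("Factor 2") +
  geom_text_repel(aes(x = Dim1, y = Dim2, label = Term),
                  data = locations,
                  fontface = "bold", size = 16, color = "#747474",
                  bg.color = "white",  # plays the role of shadowtext's bg.color
                  bg.r = 0.15)         # arbitrary halo radius; tune to taste

print(p_halo)
```

If that works with the installed version, the same two arguments could presumably be added to the category layer that sits on top of the base graph p, leaving the colored word layer untouched.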