How to use forcats to reorder a factor by a subset (facet) of another variable?

2023-12-14

The forcats vignette states that

The goal of the forcats package is to provide a suite of useful tools that solve common problems with factors.

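Indeed, one of those tools reorders a factor by another variable, which is a very common use case when plotting data. I was trying to use forcats to do this, but for a faceted plot. That is, I want to reorder a factor by another variable, but using only a subset of the data. Here is a reprex: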

library(tidyverse)

ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE)) %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

This code produces a plot that is close to what I want.

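But I would like the clarity axis to be sorted by value, so that I can quickly see which clarity has the highest value. Each facet, however, implies a different order, so I would like to choose one specific facet and sort the plot by the values within it.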

Using forcats directly does not work in this case, of course, because it reorders the factor based on all values, not just those of a specific facet. Let's try it:

# Inserting this line right before the ggplot call
mutate(clarity = forcats::fct_reorder(clarity, value)) %>%

It then produces a reordered plot.
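A small sketch of why this happens, using made-up values rather than the diamonds data: fct_reorder() summarises its second argument per level (with median() by default) over all the rows it is given, so the pooled order can differ from the order inside any single facet. The vectors f, facet and x below are hypothetical.

```r
library(forcats)

# Hypothetical toy data: two levels observed in two "facets"
f     <- factor(c("a", "b", "a", "b"))
facet <- c("one", "one", "two", "two")
x     <- c(1, 10, 100, 2)

# Pooled over all rows: median(a) = 50.5, median(b) = 6
pooled <- levels(fct_reorder(f, x))

# Using only the rows of facet "one": a = 1, b = 10
within_one <- levels(fct_reorder(f[facet == "one"], x[facet == "one"]))

pooled      # "b" "a"
within_one  # "a" "b"
```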

As expected, it reordered the factor based on the whole data set. But what if I want to order the plot by the values of the "Ideal" cut, using forcats?

My current solution is as follows:

ggdf <- ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE))

# The trick would be to create an auxiliary factor using only
# the subset of the data I want, and then use the levels
# to reorder the factor in the entire dataset.
#
# Note that I use good-old reorder, and not the forcats version
# which I could have, but better this way to emphasize that
# so far I haven't found the advantage of using forcats 
reordered_factor <- reorder(ggdf$clarity[ggdf$cut == "Ideal"], 
                            ggdf$value[ggdf$cut == "Ideal"])

ggdf$clarity <- factor(ggdf$clarity, levels = levels(reordered_factor))

ggdf %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

This produces the plot I want.

But I wonder whether there is a more elegant or clever way to do this with forcats.


If you want to reorder clarity by the values of a specific facet, you have to tell forcats::fct_reorder() to do so, e.g.

mutate(clarity = forcats::fct_reorder(
    clarity, filter(., cut == "Ideal") %>% pull(value)))

This uses only the values of the "Ideal" facet for the reordering.

Thus,

ggplot2::diamonds %>% 
  group_by(cut, clarity) %>% 
  summarise(value = mean(table, na.rm = TRUE)) %>%
  mutate(clarity = forcats::fct_reorder(
    clarity, filter(., cut == "Ideal") %>% pull(value))) %>%
  ggplot(aes(x = clarity, y = value, color = clarity)) + 
  geom_segment(aes(xend = clarity, y = min(value), yend = value), 
               size = 1.5, alpha = 0.5) + 
  geom_point(size = 3) + 
  facet_grid(rows = "cut", scales = "free") +
  coord_flip() +
  theme(legend.position = "none")

creates the plot as requested.
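One thing to keep in mind: the fct_reorder() call above appears to rely on the summarised data still being grouped by cut. Each group then has exactly one row per clarity and lines up with the values pulled from the "Ideal" rows. On ungrouped data the two vectors would differ in length. Below is a minimal sketch of a variant that does not depend on grouping. It assumes a toy table with made-up values (toy, ideal_order and reordered are illustrative names, not from the original post). It takes the level order from the "Ideal" rows and applies it with forcats::fct_relevel().

```r
library(dplyr)
library(forcats)

# Hypothetical summary table: one row per cut/clarity pair
toy <- tibble(
  cut     = rep(c("Good", "Ideal"), each = 3),
  clarity = factor(rep(c("SI1", "VS1", "IF"), times = 2),
                   levels = c("SI1", "VS1", "IF")),
  value   = c(1, 2, 3,     # Good
              30, 10, 20)  # Ideal
)

# Level order taken from the "Ideal" rows only, smallest value first
ideal_order <- toy %>%
  filter(cut == "Ideal") %>%
  arrange(value) %>%
  pull(clarity) %>%
  as.character()

# Apply that order to the factor in the whole table
reordered <- toy %>%
  mutate(clarity = fct_relevel(clarity, ideal_order))

levels(reordered$clarity)  # "VS1" "IF" "SI1"
```

With the real data, the same two steps would presumably go between summarise() and ggplot(), after an ungroup().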
