如何使用 Dplyr 的 Summarize 和 which() 查找最小/最大值

2024-04-30

我有以下数据:

Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,12,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D") 
data <- data.frame(Name, Age, Group)

我想使用 dplyr

(1)按“组”对数据进行分组 (2) 显示每组内的最小和最大年龄 (3) 显示人员姓名以及最小和最大年龄

以下代码执行此操作:

data %>% group_by(Group) %>%
     summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))], 
               maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])

效果很好:

  Group minAge minAgeName maxAge maxAgeName
1     A     22        Sam     22        Sam
2     B     12      Sarah     58      James
3     C     17     Andrew     82      Sally
4     D     12     Mairin     67        Ray

但是,如果有多个最小值或最大值,我会遇到问题:

Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,31,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D") 
data <- data.frame(Name, Age, Group)

> data %>% group_by(Group) %>%
+   summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))], 
+             maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
Error: expecting a single value

我正在寻找两种解决方案:

(1) 显示最小或最大名称并不重要,只显示那个名称(即找到的第一个值) (2) 如果存在“联系”,则显示所有最小值和最大值

如果不清楚,请告诉我,并提前致谢!


您可以使用which.min and which.max得到第一个值。

data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which.min(Age)], 
            maxAge = max(Age), maxAgeName = Name[which.max(Age)])

要获取所有值,请使用例如用适当的粘贴collapse争论。

data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = paste(Name[which(Age == min(Age))], collapse = ", "), 
            maxAge = max(Age), maxAgeName = paste(Name[which(Age == max(Age))], collapse = ", "))
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

如何使用 Dplyr 的 Summarize 和 which() 查找最小/最大值 的相关文章

随机推荐