I have the following data frame in R:
Service Container_Pick_Day
ABC 0
ABC 1
ABC 1
ABC 2
ABC NA
ABC 0
ABC 1
DEF NA
DEF 0
DEF 1
DEF 1
DEF 1
DEF 2
DEF 1
The column Container_Pick_Day is numeric and contains NA values.
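For anyone who wants to reproduce this, the data above can be rebuilt roughly as follows. This is only a sketch: the column types are an assumption (Service as character, the days as integer), since the post does not show how df was created.

```r
# Rebuild the example data shown above (column types are assumed)
df <- data.frame(
  Service = rep(c("ABC", "DEF"), each = 7),
  Container_Pick_Day = c(0L, 1L, 1L, 2L, NA, 0L, 1L,   # ABC
                         NA, 0L, 1L, 1L, 1L, 2L, 1L),  # DEF
  stringsAsFactors = FALSE
)

# Count the non-NA rows per Service; these are the denominators
table(df$Service[!is.na(df$Container_Pick_Day)])
```

Each Service has 6 non-NA rows, which is where the /6 in the desired percentages comes from.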
What I want is to calculate, for each Service, the percentage of containers picked on day 0, after 1 day, after 2 days, and so on, ignoring NA values.
The desired data frame is:
Service Container_Pick_Day Percentage
ABC 0 (2/6)*100 = 33.33
ABC 1 (3/6)*100 = 50
ABC 2 (1/6)*100 = 16.67
DEF 0 (1/6)*100 = 16.67
DEF 1 (4/6)*100 = 66.67
DEF 2 (1/6)*100 = 16.67
I tried the following in R, but it produces NA values in the output:
df%>%
group_by(Service) %>%
summarise(pick_day_perc = n()/sum(Container_Pick_Day),na.rm=T) %>%
as.data.frame()
Do I have to group by both Service and Container_Pick_Day?
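As a side note on where the NA comes from: in the attempt above, na.rm=T sits outside the parentheses of sum(), so it is passed to summarise() (where it just becomes an extra column named na.rm) and sum() still sees the NA values. A small base R sketch of the difference, using the ABC values from the question:

```r
# ABC's Container_Pick_Day values, including the NA
x <- c(0, 1, 1, 2, NA, 0, 1)

total_with_na    <- sum(x)                # NA, because one element is NA
total_without_na <- sum(x, na.rm = TRUE)  # 5, NA is dropped before summing

total_with_na
total_without_na
```

Even with na.rm moved inside sum(), n()/sum(Container_Pick_Day) divides the row count by the sum of the day values, which is not the share of containers per day, so counting per Service and Container_Pick_Day is still needed.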
Adding an answer based on all the comments above from @nicola, @akrun, and myself:
library(dplyr)
#nicola
df %>%
filter(!is.na(Container_Pick_Day)) %>%
group_by(Service,Container_Pick_Day) %>%
summarise(Percentage=n()) %>%
group_by(Service) %>%
mutate(Percentage=Percentage/sum(Percentage)*100)
#akrun
df %>%
filter(complete.cases(Container_Pick_Day)) %>%
count(Service, Container_Pick_Day) %>%
group_by(Service) %>%
transmute(Container_Pick_Day, Percentage=n/sum(n)*100)
#Sotos
df %>%
na.omit() %>%
group_by_all() %>%
summarise(ptg = n()) %>%
group_by(Service) %>%
mutate(ptg = prop.table(ptg)*100)
All of them give the following result:
Service Container_Pick_Day Percentage
<fctr> <int> <dbl>
1 ABC 0 33.33333
2 ABC 1 50.00000
3 ABC 2 16.66667
4 DEF 0 16.66667
5 DEF 1 66.66667
6 DEF 2 16.66667
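The same numbers can be cross-checked without dplyr. The following is a base R sketch, not one of the approaches from the comments; it rebuilds the example data itself (column types assumed) and relies on table() dropping NA by default.

```r
# Example data from the question (column types are assumed)
df <- data.frame(
  Service = rep(c("ABC", "DEF"), each = 7),
  Container_Pick_Day = c(0, 1, 1, 2, NA, 0, 1,
                         NA, 0, 1, 1, 1, 2, 1)
)

# Cross-tabulate; table() ignores NA unless useNA is set
counts <- table(Service = df$Service,
                Container_Pick_Day = df$Container_Pick_Day)

# Row-wise proportions (margin = 1), scaled to percentages
pct <- prop.table(counts, margin = 1) * 100
round(pct, 2)

# Long format, one row per Service / Container_Pick_Day pair
as.data.frame(pct, responseName = "Percentage")
```

Each row of pct sums to 100, and DEF on day 1 comes out as 66.67 (4 of 6), matching the output above.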