I have the following data frame of registered vehicles in Switzerland:
Fuel Year Region Count
Gasoline 2013 GE 169600
Diesel 2013 GE 46790
Hybrid 2013 GE 2268
Electric 2013 GE 85
Other 2013 GE 532
Gasoline 2013 VS 149232
Diesel 2013 VS 50591
Hybrid 2013 VS 1028
Electric 2013 VS 268
Other 2013 VS 261
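I would like to add an extra "Pct" column to the data frame containing the share of each fuel type within each year and region, but I am struggling to work out how to do it. The result should be: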
Fuel Year Region Count Pct
Gasoline 2013 GE 169600 0.7734
Diesel 2013 GE 46790 0.2134
Hybrid 2013 GE 2268 0.0103
Electric 2013 GE 85 0.0004
Other 2013 GE 532 0.0024
Gasoline 2013 VS 149232 0.7410
Diesel 2013 VS 50591 0.2512
Hybrid 2013 VS 1028 0.0051
Electric 2013 VS 268 0.0013
Other 2013 VS 261 0.0013
This is a good use case for `ave`, followed by simple vector division:
# load your data
d <- read.table(text="Fuel Year Region Count
Gasoline 2013 GE 169600
Diesel 2013 GE 46790
Hybrid 2013 GE 2268
Electric 2013 GE 85
Other 2013 GE 532
Gasoline 2013 VS 149232
Diesel 2013 VS 50591
Hybrid 2013 VS 1028
Electric 2013 VS 268
Other 2013 VS 261", header = TRUE)
# `ave` by groups and divide
d$Pct <- d$Count/with(d, ave(Count, list(Year, Region), FUN = sum))
# or, equivalently:
# d <- within(d, Pct <- Count/ave(Count, list(Year, Region), FUN = sum))
Result:
> d
Fuel Year Region Count Pct
1 Gasoline 2013 GE 169600 0.7734579865
2 Diesel 2013 GE 46790 0.2133850188
3 Hybrid 2013 GE 2268 0.0103431764
4 Electric 2013 GE 85 0.0003876411
5 Other 2013 GE 532 0.0024261772
6 Gasoline 2013 VS 149232 0.7410467772
7 Diesel 2013 VS 50591 0.2512215712
8 Hybrid 2013 VS 1028 0.0051047770
9 Electric 2013 VS 268 0.0013308174
10 Other 2013 VS 261 0.0012960572
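As a cross-check on the result above, here is a minimal sketch of a variant that has `ave` return the within-group shares directly by passing `prop.table` as `FUN`, then confirms that the shares sum to 1 in every Year/Region group. The data frame is rebuilt with `data.frame` only so the block runs on its own; the name `group_totals` is introduced purely for illustration.

```r
# Rebuild the example data so this block is self-contained
d <- data.frame(
  Fuel   = rep(c("Gasoline", "Diesel", "Hybrid", "Electric", "Other"), times = 2),
  Year   = 2013,
  Region = rep(c("GE", "VS"), each = 5),
  Count  = c(169600, 46790, 2268, 85, 532, 149232, 50591, 1028, 268, 261)
)

# prop.table(x) on a vector is x / sum(x), so ave() yields per-group shares
d$Pct <- with(d, ave(Count, Year, Region, FUN = prop.table))

# Sanity check: the shares should add up to 1 within each Year/Region group
group_totals <- aggregate(Pct ~ Year + Region, data = d, FUN = sum)
print(group_totals)
```

This should give the same `Pct` values as dividing `Count` by the grouped sum; which form to prefer is mostly a matter of taste.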