I have a DataFrame object that looks like this:
| Row | timestamp | price | volume |
|-----|---------------------|-------|--------|
| 1 | 2011-08-14T14:14:40 | 10.40 | 0.779 |
| 2 | 2011-08-14T15:15:17 | 10.40 | 0.101 |
| 3 | 2011-08-14T15:15:17 | 10.40 | 0.316 |
| ... | ................... | ..... | ..... |
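The timestamps are not unique, so I cannot convert the DataFrame to a TimeArray until that is resolved. How can I collapse the rows with duplicate timestamps, taking the mean of the price and the sum of the volume?
Thanks for any pointers!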
You can use by (see https://dataframesjl.readthedocs.io/en/latest/split_apply_combine.html):
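using DataFrames
# On Julia 0.7 and later, mean also needs `using Statistics`.

# Toy data: the key "a" appears twice, like a repeated timestamp.
df = DataFrame(
    cat = ["a", "b", "c", "a"],
    prices = [1, 2, 3, 4],
    vol = [10, 20, 30, 40],
)
# For each group of rows sharing a cat value, take the mean price and the total volume.
df2 = by(df, :cat) do sub
    DataFrame(prices = mean(sub[:prices]), vol = sum(sub[:vol]))
end
df2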
3×3 DataFrames.DataFrame
│ Row │ cat │ prices │ vol │
├─────┼─────┼────────┼─────┤
│ 1 │ "a" │ 2.5 │ 50 │
│ 2 │ "b" │ 2.0 │ 20 │
│ 3 │ "c" │ 3.0 │ 30 │
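As far as I know, by was deprecated and then removed in DataFrames.jl 1.0, and the sub[:prices] indexing style is also gone in current releases. The same split-apply-combine step can be written with groupby and combine. Below is a minimal sketch applied to the three sample rows from the question. It assumes DataFrames.jl 1.x, and the variable name collapsed is only illustrative:

```julia
using DataFrames
using Statistics  # provides mean
using Dates       # provides DateTime

# The three sample rows from the question; rows 2 and 3 share a timestamp.
df = DataFrame(
    timestamp = DateTime.(["2011-08-14T14:14:40",
                           "2011-08-14T15:15:17",
                           "2011-08-14T15:15:17"]),
    price  = [10.40, 10.40, 10.40],
    volume = [0.779, 0.101, 0.316],
)

# Group rows by timestamp, then average the price and sum the volume
# within each group. The result has one row per distinct timestamp.
collapsed = combine(groupby(df, :timestamp),
                    :price  => mean => :price,
                    :volume => sum  => :volume)
```

After this step every timestamp in collapsed is unique, which is the precondition the question mentions for converting to a TimeArray.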
If you need to compute totals by day, month, etc., you may also be interested in this answer: https://stackoverflow.com/questions/43591630/how-to-add-row-grandtotals-substotals-in-a-julia-dataframe.