You can try the devel version of data.table, i.e. v1.9.5.
Instructions for installing the devel version are here: https://github.com/Rdatatable/data.table/wiki/Installation
library(data.table)#v1.9.5+
setDT(df1)[, daily_v_ID:= ifelse((1:.N)==1L, uniqueN(v), NA) , by = .(ID, Date)]
Or
setDT(df1)[, daily_v_ID := c(uniqueN(v), rep(NA, .N-1)), by = .(ID, Date)]
Or, as suggested by @David Arenburg
indx <- setDT(df1)[, .(.I[1L], uniqueN(v)), by = .(ID, Date)]
df1[indx$V1, daily_v_ID := indx$V2]
Or using dplyr
library(dplyr)
df1 %>%
  group_by(ID, Date) %>%
  mutate(daily_v_ID = ifelse(row_number() == 1, n_distinct(v), NA))
Or with base R
df1$daily_v_ID <- with(df1, ave(as.numeric(factor(v)), Date, ID,
    FUN = function(x) NA^(seq_along(x) != 1) * length(unique(x))))
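The base R version relies on the fact that NA^0 is 1 while NA^1 is NA, so NA^(seq_along(x) != 1) acts as a mask that keeps only the first row of each group. A minimal sketch of that one step, using a made-up vector x that stands in for the v values of a single ID/Date group:

```r
# Hypothetical values for one ID/Date group (not taken from df1)
x <- c("v1", "v1", "v8")

# seq_along(x) != 1 is FALSE, TRUE, TRUE
# NA^FALSE is 1 and NA^TRUE is NA, so the mask is 1, NA, NA
mask <- NA^(seq_along(x) != 1)

# Multiply the mask by the number of distinct values in the group
result <- mask * length(unique(x))
print(result)
```

The first element holds the distinct count and the remaining elements are NA, which is the same shape that the ifelse() and rep(NA, .N-1) variants produce.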
Update
For the edited post, we create the variable 'daily_v_ID' by taking length(v) within each group or, in data.table, by using .N
setDT(df1)[, c('daily_v_distinguish_ID', 'daily_v_ID'):= list( c(uniqueN(v),
rep(NA, .N-1)), .N), by = .(ID, Date)]
df1
#      ID Date   v daily_v_distinguish_ID daily_v_ID
#  1: ID1    1  v1                      2          3
#  2: ID1    1  v1                     NA          3
#  3: ID1    1  v8                     NA          3
#  4: ID1    2  v5                      2          2
#  5: ID1    2  v3                     NA          2
#  6: ID1    3  v3                      1          1
#  7: ID2    1  v7                      1          1
#  8: ID2    2 v15                      1          2
#  9: ID2    2 v15                     NA          2
# 10: ID2    3  v3                      1          1
NOTE: uniqueN was introduced in v1.9.5. For earlier versions, we can use length(unique(v))
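For data.table versions without uniqueN, the distinct count needs length() wrapped around unique(), not the other way round. A small sketch of the difference, using a made-up vector v that stands in for one ID/Date group:

```r
# Hypothetical values for one ID/Date group (not taken from df1)
v <- c("v1", "v1", "v8")

# Number of distinct values: drop duplicates first, then count
n_distinct_v <- length(unique(v))

# Reversed order: length(v) is a single number, so unique() leaves it
# unchanged and the result is just the group size
n_rows <- unique(length(v))

print(c(n_distinct_v, n_rows))
```

Here n_distinct_v matches what uniqueN(v) is used for above, while n_rows is only the row count of the group, which is what .N already gives.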
Or using dplyr
df1 %>%
  group_by(ID, Date) %>%
  mutate(daily_v_distinguish_ID = ifelse(row_number() == 1,
                                         n_distinct(v), NA),
         daily_v_ID = n())