我想根据比率计算为我的 data.table 创建新列。我的变量的名称有点标准,所以我认为必须有一种方法可以轻松实现这一点数据表。但是我不知道如何实现这一点。以下是我的示例数据和代码 -
set.seed(1200)
ID <- seq(1001,1100)
region <- sample(1:10,100,replace = T)
Q21 <- sample(1:5,100,replace = T)
Q22 <- sample(1:15,100,replace = T)
Q24_LOC_1 <- sample(1:8,100,replace = T)
Q24_LOC_2 <- sample(1:8,100,replace = T)
Q24_LOC_3 <- sample(1:8,100,replace = T)
Q24_LOC_4 <- sample(1:8,100,replace = T)
Q21_PAN <- sample(1:5,100,replace = T)
Q22_PAN <- sample(1:15,100,replace = T)
Q24_LOC_1_PAN <- sample(1:8,100,replace = T)
Q24_LOC_2_PAN <- sample(1:8,100,replace = T)
Q24_LOC_3_PAN <- sample(1:8,100,replace = T)
Q24_LOC_4_PAN <- sample(1:8,100,replace = T)
df1 <- as.data.table(data.frame(ID,region,Q21,Q22,Q24_LOC_1,Q24_LOC_2,Q24_LOC_3,Q24_LOC_4,Q21_PAN,Q22_PAN,Q24_LOC_1_PAN,Q24_LOC_2_PAN,Q24_LOC_3_PAN,Q24_LOC_4_PAN))
col_needed <- c("Q21","Q22","Q24_LOC_1","Q24_LOC_2","Q24_LOC_3","Q24_LOC_4")
check1 <- df1[,Q21_R := mean(Q21,na.rm = T)/mean(Q21_PAN,na.rm = T),by=region]
check1 适用于一个变量。我一直在寻找一种解决方案,可以传递所有需要的变量并在一行中计算新变量。所以在这种情况下就像通过col_needed。我也尝试了下面的代码 -
check2 <- df1[,`:=`(paste0(col_needed,"_R"),(mean(col_needed,na.rm = T)/mean(paste0(col_needed,"_PAN"),na.rm = T))),by=region][]
然而,这给了我多个警告,结果是所有 NA 都为 NA。警告是 -在mean(col_needed, na.rm = T)中:参数不是数字或逻辑:返回NA
你能建议我哪里出错了吗?