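I am writing an R package that extends the survey package. My function estimates kurtosis from complex survey data, but I am not sure how to obtain a standard error for this statistic, like the one reported by existing functions such as svymean and svytotal.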
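svykurt <- function(
  x,
  design,
  na.rm = FALSE,
  excess = TRUE
) {
  if (!inherits(design, "survey.design"))
    stop("design is not a survey design")
  x <- model.frame(x, design$variables, na.action = na.pass)
  x <- as.matrix(x)
  if (ncol(x) > 1)
    stop("Only calculate kurtosis one variable at a time")
  if (na.rm) {
    # Drop incomplete rows from the data and the design together, so that
    # x stays aligned with the weights (the pattern svymean uses)
    nas <- is.na(x[, 1])
    design <- design[!nas, ]
    if (length(nas) > length(design$prob)) {
      x <- x[!nas, , drop = FALSE]
    } else {
      x[nas, ] <- 0  # rows kept with zero weight; value is irrelevant
    }
  }
  pweights <- 1 / design$prob
  psum <- sum(pweights)
  # coef() strips the svystat attributes and leaves plain numbers
  mean_x <- coef(svymean(x, design))
  var_x <- coef(svyvar(x, design))
  m4 <- sum(pweights * (x[, 1] - mean_x)^4) / psum
  kurt <- m4 / var_x^2
  if (excess) {
    kurt <- kurt - 3
  }
  names(kurt) <- colnames(x)
  class(kurt) <- "svykurt"
  return(kurt)
}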
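print.svykurt <- function(x, ...) {
  # One-column matrix: variable name as row label, "kurtosis" as column label
  m <- matrix(unclass(x), ncol = 1)
  rownames(m) <- names(x)
  colnames(m) <- "kurtosis"
  print(m)
  invisible(x)
}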
For now, I just print the kurtosis without a standard error.
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
svykurt(x = ~api00, dclus1, na.rm = TRUE)
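However, I would ultimately like to give the output the class svystat and print a matrix containing both values (the estimate and its standard error), as svymean(x = ~api00, dclus1, na.rm = TRUE) does.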
Looking at the survey source code (https://rdrr.io/cran/survey/src/R/multistage.R), I understand that svymean and svytotal rely on a lower-level helper function called svyrecvar, e.g.,
attr(total, "var")<-v<-svyrecvar(x/design$prob,design$cluster,
design$strata, design$fpc,
postStrata=design$postStrata)
I am just not sure how to apply this to my case.
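To make the question concrete, here is a minimal sketch of what applying that pattern might look like. It assumes a Taylor-linearization (influence-function) approach, and I have not confirmed that this is how the survey package would do it. The function name svykurt_lin is hypothetical, the sketch assumes a single variable with no missing values, and it uses weighted central moments with the estimated population size as the denominator, so the point estimate can differ slightly from one built on svyvar.

```r
library(survey)

# Hypothetical helper, not part of the survey package.
# Assumes one variable and no missing values.
svykurt_lin <- function(formula, design, excess = TRUE) {
  mf <- model.frame(formula, design$variables, na.action = na.pass)
  x <- as.matrix(mf)[, 1]
  w <- 1 / design$prob           # sampling weights
  N <- sum(w)                    # estimated population size

  mu <- sum(w * x) / N
  d  <- x - mu
  m2 <- sum(w * d^2) / N         # weighted central moments
  m3 <- sum(w * d^3) / N
  m4 <- sum(w * d^4) / N
  kurt <- m4 / m2^2

  # Influence values of m2 and m4, combined by the delta method
  # for the ratio m4 / m2^2
  z2 <- d^2 - m2
  z4 <- d^4 - m4 - 4 * m3 * d
  z  <- z4 / m2^2 - 2 * m4 * z2 / m2^3

  # Same call pattern as the svyrecvar call quoted from the survey source
  v <- svyrecvar(as.matrix(z * w / N), design$cluster, design$strata,
                 design$fpc, postStrata = design$postStrata)

  est <- if (excess) kurt - 3 else kurt
  names(est) <- names(mf)[1]
  attr(est, "var") <- v
  attr(est, "statistic") <- "kurtosis"
  class(est) <- "svystat"        # reuses survey's print, coef and SE methods
  est
}

data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
svykurt_lin(~api00, dclus1)
```

Subtracting 3 for excess kurtosis shifts the estimate by a constant, so it leaves the variance unchanged.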