The survey package in R: how to set the fpc argument (finite population correction)

2023-12-21

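I drew a sample from a sampling frame using a probability-proportional-to-size (PPS) plan, so that I sampled 6 strata formed by the combinations of two variables, gender and pre, with the following proportions: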

      pre
gender  High   Low Medium
     F 0.155 0.155  0.195
     M 0.155 0.155  0.185
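
For readers who want to reproduce a table like this one, here is a minimal base R sketch. The data frame `toy` is a hypothetical stand-in for the sampled data: its cell counts (31, 31, 39, 31, 31, 37, giving 200 rows) are inferred from the proportions above and the 200-row total reported in the output at the end of this thread, not read from the actual file.

```r
# Hypothetical stand-in for the sampled data (cell counts inferred, not read from out.csv)
cells <- expand.grid(pre = c("High", "Low", "Medium"),
                     gender = c("F", "M"),
                     stringsAsFactors = FALSE)
cells$n <- c(31, 31, 39, 31, 31, 37)

# Expand to one row per sampled unit
toy <- cells[rep(seq_len(nrow(cells)), cells$n), c("gender", "pre")]

# Share of the whole sample falling in each gender x pre stratum
props <- prop.table(table(toy$gender, toy$pre))
print(round(props, 3))
```

Each entry is a stratum count divided by the total sample size, so the six cells sum to 1.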

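Now I want to specify the design of my sampled data using svydesign from the R package "survey" (https://cran.r-project.org/web/packages/survey/survey.pdf). How should I define the fpc (finite population correction) argument?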

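The documentation says:

For PPS sampling without replacement it is necessary to specify the probabilities for each stage of sampling using the fpc arguments, and an overall weight argument should not be given.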

library(survey)

out <- read.csv('https://raw.githubusercontent.com/rnorouzian/d/master/out.csv')

dstrat <- svydesign(id=~1,strata=~gender+pre, data=out, pps = "brewer", fpc = ????)
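
For comparison, here is a sketch of what a PPS design call looks like when the selection probabilities are already stored in the data. It uses the `election_pps` example data that ships with the survey package, where the column `p` holds each sampled county's sampling probability. It is not the asker's data, and whether the stratum proportions in the question can play the same role as `p` depends on how that sample was actually drawn.

```r
library(survey)

# Example data shipped with the survey package; election_pps is a PPS sample of counties
data(election)

# fpc = ~p supplies the per-unit sampling probabilities; no weights argument is given
dpps_br <- svydesign(id = ~1, fpc = ~p, data = election_pps, pps = "brewer")

# Estimated vote totals, with variances based on Brewer's approximation
svytotal(~Bush + Kerry + Nader, dpps_br)
```

The point to note is that `fpc` refers to a column with one probability per sampled row, which is why the approaches discussed here attach an `fpc` column to the data before calling svydesign.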

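If we want to add the proportion as a column, we group by 'gender' and 'pre', compute the proportion by dividing each group's count by the sum of the counts, and then join the result back onto the original rows: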

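library(dplyr)

# Count rows per gender x pre stratum, convert to a share of the sample, join back
out1 <-  out %>%
           group_by(gender, pre) %>% 
           summarise(n = n(), .groups = 'drop') %>%
           mutate(fpc = n/sum(n)) %>% 
           right_join(out, by = c("gender", "pre"))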

Or use adorn_percentages from janitor:

library(dplyr)
library(janitor)
library(tidyr)
out1 <- out %>% 
         tabyl(gender, pre) %>% 
         adorn_percentages(denominator = "all") %>% 
         pivot_longer(cols = -gender, names_to = 'pre', 
             values_to = 'fpc') %>%
        right_join(out)
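
The same per-row proportion can also be computed without extra packages. The sketch below uses base R's `ave()` on a hypothetical stand-in data frame `toy`, whose cell counts are inferred from the proportions in the question and the 200-row total, not read from the actual file. With the real data, `toy` would simply be replaced by `out`.

```r
# Hypothetical stand-in for the sampled data (cell counts inferred, not read from out.csv)
cells <- expand.grid(pre = c("High", "Low", "Medium"),
                     gender = c("F", "M"),
                     stringsAsFactors = FALSE)
cells$n <- c(31, 31, 39, 31, 31, 37)
toy <- cells[rep(seq_len(nrow(cells)), cells$n), c("gender", "pre")]

# Stratum size repeated on every row of that stratum
toy$n <- ave(rep(1, nrow(toy)), toy$gender, toy$pre, FUN = sum)

# Stratum share of the total sample, matching mutate(fpc = n/sum(n)) after the join
toy$fpc <- toy$n / nrow(toy)

# One line per stratum, to check the values against the table in the question
print(unique(toy[, c("gender", "pre", "n", "fpc")]))
```

Because `ave()` returns a vector as long as its input, no join is needed: every row already carries its stratum's `n` and `fpc`.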

If we need a reusable function:

f1 <- function(dat, grp_cols) {
          dat %>%
             group_by(across(all_of(grp_cols))) %>%
              summarise(n = n(), .groups = 'drop') %>%
              mutate(fpc = n/sum(n)) %>% 
              right_join(dat)
  }

f1(out, c("gender", "pre"))
#Joining, by = c("gender", "pre")
# A tibble: 200 x 11
#   gender pre       n   fpc   no. fake.name sector   pretest state email             phone
#   <chr>  <chr> <int> <dbl> <int> <chr>     <chr>      <int> <chr> <chr>             <chr>
# 1 F      High     31 0.155     1 Pont      Private     1352 NY    [email protected] xxx-xx-6216
# 2 F      High     31 0.155     2 Street    NGO         1438 CA    [email protected] xxx-xx-6405
# 3 F      High     31 0.155     3 Galvan    Private     1389 NY    [email protected] xxx-xx-9195
# 4 F      High     31 0.155     4 Gorman    NGO         1375 CA    [email protected] xxx-xx-1845
# 5 F      High     31 0.155     5 Jacinto   Private     1386 CA    [email protected] xxx-xx-6237
# 6 F      High     31 0.155     6 Shah      Public      1384 CA    [email protected] xxx-xx-5723
# 7 F      High     31 0.155     7 Randon    Private     1360 TX    [email protected] xxx-xx-7542
# 8 F      High     31 0.155     8 Koucherik NGO         1439 NY    [email protected] xxx-xx-9137
# 9 F      High     31 0.155     9 Waters    Industry    1414 TX    [email protected] xxx-xx-7560
#10 F      High     31 0.155    10 David     Industry    1396 CA    [email protected] xxx-xx-6498
# … with 190 more rows