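I am trying to find any variables in my data that have zero variance (i.e. constant continuous variables). I figured out how to do this with lapply, but I would like to use dplyr because I am trying to follow tidy data principles. I can create a vector containing just the variances with dplyr, but the last step, where I pick out the values equal to zero and return the corresponding variable names, is what confuses me.
Here is the code: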
library(PReMiuM)
library(tidyverse)
#> ── Attaching packages ───────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
#> ✔ tibble 1.4.2 ✔ dplyr 0.7.4
#> ✔ tidyr 0.7.2 ✔ stringr 1.2.0
#> ✔ readr 1.2.0 ✔ forcats 0.2.0
#> ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
setwd("~/Stapleton_Lab/Projects/Premium/hybridAnalysis/")
# read in data from analysis script
df <- read_csv("./hybrid.csv")
#> Parsed with column specification:
#> cols(
#> .default = col_double(),
#> Exp = col_character(),
#> Pedi = col_character(),
#> Harvest = col_character()
#> )
#> See spec(...) for full column specifications.
# checking for missing variable
# df %>%
# select_if(function(x) any(is.na(x))) %>%
# summarise_all(funs(sum(is.na(.))))
# grab month for analysis
may <- df %>%
  filter(Month == 5)
june <- df %>%
  filter(Month == 6)
july <- df %>%
  filter(Month == 7)
aug <- df %>%
  filter(Month == 8)
sept <- df %>%
  filter(Month == 9)
oct <- df %>%
  filter(Month == 10)
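# check for zero variance in continuous covariates
numericVars <- grep("Min|Max", names(june))
# lapply approach (this one works): positions and names of the zero-variance columns
zero <- which(lapply(june[numericVars], var) == 0, useNames = TRUE)
# dplyr attempt: the last step is the one that confuses me and triggers the warnings below
noVar <- june %>%
  select(numericVars) %>%
  summarise_all(var) %>%
  filter_if(all, all_vars(. != 0))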
#> Warning in .p(.tbl[[vars[[i]]]], ...): coercing argument of type 'double'
#> to logical
#> (the same warning repeats 19 more times, 20 in total)
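For reference, the warnings appear to come from the first argument of filter_if(): it is a predicate that gets called on each column, and all() applied to a double column coerces it to logical. Since the variable names are wanted rather than rows, one possible dplyr route is to pick columns with select_if() after summarising. The sketch below is only an illustration on a made-up data frame: `toy`, its column names, and `zeroVarNames` are hypothetical stand-ins, because hybrid.csv is not available here. It also swaps grep() for the matches() helper, which is an assumption about what the column selection is meant to do.

```r
library(dplyr)

# Hypothetical stand-in for `june`; the real hybrid.csv is not available here.
toy <- data.frame(
  Pedi    = c("a", "b", "c"),
  tempMin = c(10, 10, 10),  # constant, so its variance is 0
  tempMax = c(20, 25, 30),
  rhMin   = c(5, 5, 5),     # constant, so its variance is 0
  stringsAsFactors = FALSE
)

zeroVarNames <- toy %>%
  # same pattern as the grep() call; ignore.case = FALSE keeps it case sensitive
  select(matches("Min|Max", ignore.case = FALSE)) %>%
  # one row holding the variance of each selected column
  summarise_all(var) %>%
  # keep only the columns whose variance is exactly 0 (NA variances are dropped)
  select_if(function(v) isTRUE(v == 0)) %>%
  names()

zeroVarNames
```

With this toy data, zeroVarNames holds "tempMin" and "rhMin". An exact comparison with 0 can be fragile for floating-point columns, so a small tolerance may be safer on real data.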