Check whether a value falls within the range of other columns

2023-12-27

I have a data frame df that looks like this:

Input:

df <- read.table(text =
"ID  Q1_PM Q1_TP Q1_overall  Q2_PM  Q2_LS  Q2_overall
 1   1     2     3           1       2     2
 2   0     NA    NA          2       1     1
 3   2     1     1           3       4     0
 4   1     0     2           4       0     2
 5   NA    1     NA          0       NA    0
 6   2     0     1           1       NA    NA",
header = TRUE)

Expected output:

To explain a bit further, the output I want looks like this:

 ID  Q1_PM Q1_TP Q1_overall  Q2_PM  Q2_LS  Q2_overall Q1_check  Q2_check
 1   1     2     3           1       2     2          "above"   "within"
 2   0     NA    NA          2       1     1           NA       "within"
 3   2     1     1           3       4     0          "within"  "below"
 4   1     0     2           4       0     2          "above"   "within"
 5   NA    1     NA          0       NA    0           NA       "within"
 6   2     0     1           1       NA    NA         "within"   NA

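Explanation:

Example 1:

Based on the values in columns Q1_PM and Q1_TP, I want to see whether the value in column Q1_overall is within their range or not. If it is not within the range, is the value above or below the range? To keep track of this, I want to add an additional column Q1_check.

Example 2:

Similarly, based on the values in Q2_PM and Q2_LS, I want to check whether the value in Q2_overall is within their range or not. If it is not within the range, is it above or below the range? Again, to keep track of this, I want to add an additional column Q2_check.

Requirements:

1- For this, I want to add the extra columns Q1_check and Q2_check, where the first column covers the comparison involving the Q1 items and the second covers the comparison involving the Q2 items.

2- The columns can contain the following values: above, below and within.

3- Where the columns named overall have NAs, the extra columns can also have NAs.

I have looked at related posts, such as Add column with values depending on another column to a dataframe https://stackoverflow.com/questions/50988447/add-column-with-values-depending-on-another-column-to-a-dataframe and Create categories by comparing a numeric column with a fixed value https://stackoverflow.com/questions/15016723/create-categories-by-comparing-a-numeric-column-with-a-fixed-value but I ran into the errors described below.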

Partial solution:

The only solution I could think of is:

df$Q1_check <- ifelse(data$Q1_overall < data$Q1_PM, 'below',
                        ifelse(data$Q1_overall > data$Q1_TP, 'above', 
                               ifelse(is.na(data$Q1_overall), NA, 'within')))

But it results in the following error: Error in data$Q1_overall : object of type 'closure' is not subsettable. I don't understand what the problem might be.
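A note on that error: the code refers to `data`, but the data frame is called `df`. Since no object named `data` exists in the session, R resolves the name to the built-in function `data()`, and a function (a "closure") cannot be subset with `$`. The following is a minimal sketch of one possible fix, assuming the range should run from the smaller to the larger of the two bound columns regardless of which column holds which, and that a missing bound should simply be ignored. The names `lo` and `hi` are illustrative, not part of the original code.

```r
df <- read.table(text = "
ID Q1_PM Q1_TP Q1_overall Q2_PM Q2_LS Q2_overall
1  1  2  3  1  2  2
2  0  NA NA 2  1  1
3  2  1  1  3  4  0
4  1  0  2  4  0  2
5  NA 1  NA 0  NA 0
6  2  0  1  1  NA NA
", header = TRUE)

# Row-wise lower and upper bounds; na.rm = TRUE ignores a missing bound
lo <- pmin(df$Q1_PM, df$Q1_TP, na.rm = TRUE)
hi <- pmax(df$Q1_PM, df$Q1_TP, na.rm = TRUE)

# Test for NA first so a missing overall value yields NA
df$Q1_check <- ifelse(is.na(df$Q1_overall), NA,
               ifelse(df$Q1_overall < lo, "below",
               ifelse(df$Q1_overall > hi, "above", "within")))
```

Checking for `NA` first also makes the intent explicit, rather than relying on `NA` propagating through the comparisons.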

df %>%
  mutate(Regulation = case_when(Q1_overall < Q1_PM ~ 'below',
                                Q1_overall > Q1_TP ~ 'above', 
                                Q1_PM < Q1_overall < Q1_TP, 'within'))

This also results in an error: Error: unexpected '<' in: "Q1_overall > Q1_TP ~ 'above', Q1_PM < Q1_overall <"
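Two things appear to be wrong in that attempt. R has no chained comparison, so `a < b < c` has to be written as two tests joined with `&`, and each `case_when()` branch needs `~` between the condition and the value rather than a comma. Below is a minimal sketch of the corrected syntax on a hypothetical toy data frame (`toy` is not from the question). It assumes `Q1_PM` is always the lower bound and `Q1_TP` the upper bound, which does not hold for every row of the real data.

```r
library(dplyr)

# Hypothetical toy data with the same column layout as the question
toy <- data.frame(Q1_PM      = c(1, 2, 1, 0),
                  Q1_TP      = c(2, 3, 2, 1),
                  Q1_overall = c(3, 1, 2, NA))

toy <- toy %>%
  mutate(Q1_check = case_when(
    Q1_overall < Q1_PM ~ "below",
    Q1_overall > Q1_TP ~ "above",
    # a chained comparison must be spelled out as two tests joined by &
    Q1_PM <= Q1_overall & Q1_overall <= Q1_TP ~ "within"
  ))
```

Rows where every condition evaluates to `NA`, such as the last one here, match no branch, so `case_when()` returns `NA` for them.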

Edit 1:

If (say) the columns are as follows:

"Q1 Comm - 01 Scope Thesis"
"Q1 Comm - 02 Scope Project"
"Q1 Comm - 03 Learn Intern"
"Q1 Comm - 04 Biography"
"Q1 Comm - 05 Exhibit"
"Q1 Comm - 06 Social Act"
"Q1 Comm - 07 Post Project"
"Q1 Comm - 08 Learn Plant"
"Q1 Comm - 09 Study Narrate"
"Q1 Comm - 10 Learn Participate"
"Q1 Comm - 11 Write 1"
"Q1 Comm - 12 Read 2"
"Q1 Comm - Overall Study Plan"

How can I identify when the column Q1 Comm - Overall Study Plan is:

1 - Below the min() of all the other columns, or

2 - Above the max() of all the other columns, or

3 - Within the range of all the other columns
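One way this many-column case might be approached is to compute the row-wise minimum and maximum over all item columns with `pmin()` and `pmax()`, then compare the overall column against them. The sketch below uses a hypothetical stand-in data frame (`wide`, with made-up column names `item_01` to `item_03` and `overall`) rather than the real columns, and assumes missing items should be ignored.

```r
# Hypothetical stand-in: three item columns plus an overall column
wide <- data.frame(
  item_01 = c(NA, 2, 0, NA, 1),
  item_02 = c(4, NA, 3, 2, 1),
  item_03 = c(4, 2, NA, 1, 2),
  overall = c(4, 1, 2, NA, 3)
)

# Every column except the overall one is treated as an item column
item_cols <- setdiff(names(wide), "overall")

# Row-wise min and max across all item columns, ignoring NAs
lo <- do.call(pmin, c(wide[item_cols], na.rm = TRUE))
hi <- do.call(pmax, c(wide[item_cols], na.rm = TRUE))

wide$check <- ifelse(is.na(wide$overall), NA,
              ifelse(wide$overall < lo, "below",
              ifelse(wide$overall > hi, "above", "within")))
```

With the real data, `item_cols` would be whichever vector of item column names applies. A row in which every item is `NA` gets `NA` bounds and therefore an `NA` check.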

Edit 2:

For the updated fields, I am also including dput(df):

dput(df)

structure(list(ï..ID = c(10L, 31L, 225L, 243L), Q1.Comm...01.Scope.Thesis = c(NA, 
2L, 0L, NA), Q1.Comm...02.Scope.Project = c(NA, NA, NA, 2L), 
    Q1.Comm...03.Learn.Intern = c(4L, NA, NA, NA), Q1.Comm...04.Biography = c(NA, 
    NA, NA, 1L), Q1.Comm...05.Exhibit = c(4L, 2L, NA, NA), Q1.Comm...06.Social.Act = c(NA, 
    NA, NA, 3L), Q1.Comm...07.Post.Project = c(NA, NA, 3L, NA
    ), Q1.Comm...08.Learn.Plant = c(NA, NA, NA, 4L), Q1.Comm...09.Study.Narrate = c(NA, 
    NA, 0L, NA), Q1.Comm...10.Learn.Participate = c(4L, NA, NA, 
    NA), Q1.Comm...11.Write.1 = c(NA, 2L, NA, NA), Q1.Comm...12.Read.2 = c(NA, 
    NA, 1L, NA), Q1.Comm...Overall.Study.Plan = c(4L, 1L, 2L, 
    NA), X = c(NA, NA, NA, NA), X.1 = c(NA, NA, NA, NA), X.2 = c(NA, 
    NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))

Any suggestions on how to achieve this would be greatly appreciated. Thank you!

This seems to be a rather lengthy approach:

library(dplyr)

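# x, y: the two bound columns (in either order); z: the value to classify
comparison <- function(x, y, z) {
  case_when(is.na(z) ~ NA_character_,
            # z lies between the bounds, or equals the only non-missing bound
            z >= x & z <= y |
              z >= y & z <= x |
              is.na(x) & y == z |
              is.na(y) & x == z ~ 'within',
            z > x & z > y ~ 'above',
            # everything else, including rows with a missing bound, is 'below'
            TRUE ~ 'below')
}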

df %>%
  mutate(Q1_check = comparison(Q1_PM, Q1_TP, Q1_overall),
         Q2_check = comparison(Q2_PM, Q2_LS, Q2_overall))

#  ID Q1_PM Q1_TP Q1_overall Q2_PM Q2_LS Q2_overall Q1_check Q2_check
#1  1     1     2          3     1     2          2    above   within
#2  2     0    NA         NA     2     1          1     <NA>   within
#3  3     2     1          1     3     4          0   within    below
#4  4     1     0          2     4     0          2    above   within
#5  5    NA     1         NA     0    NA          0     <NA>   within
#6  6     2     0          1     1    NA         NA   within     <NA>
