How to dynamically intercalate columns with a pattern in R?

2024-01-10

This is a follow-up to https://stackoverflow.com/questions/73938635/rename-a-column-based-on-the-original-columns-name-r/73938710?noredirect=1#comment130553500_73938710 . I would like to know how I can dynamically intercalate the columns of a larger dataset.

  • Rationale: I ran a for loop to import 16 data frames. After that, I merged all of them like this:
### Merge all dataframes: (ps: I got this code here in SO :)
 mergefun <- function(x, y) merge(x, y, by = "ID", all = TRUE)
 merged_DF <- Reduce(mergefun, dataList)

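Each data frame has an "ID" column (identical across all data frames), but the other column names differ (I created them based on the answer to the other post). So,

  • In total I have (head() of each data frame):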
ID NARR_G1_50_AAA NARR_G1_50_AAC NARR_G1_50_AC NARR_G1_50_AB
ID NARR_G1_100_AAA NARR_G1_100_AAC NARR_G1_100_AC NARR_G1_100_AB
ID NARR_G1_150_AAA NARR_G1_150_AAC NARR_G1_150_AC NARR_G1_150_AB
ID NARR_G1_200_AAA NARR_G1_200_AAC NARR_G1_200_AC NARR_G1_200_AB

ID NARR_G2_50_AAA NARR_G2_50_AAC NARR_G2_50_AC NARR_G2_50_AB
ID NARR_G2_100_AAA NARR_G2_100_AAC NARR_G2_100_AC NARR_G2_100_AB
ID NARR_G2_150_AAA NARR_G2_150_AAC NARR_G2_150_AC NARR_G2_150_AB
ID NARR_G2_200_AAA NARR_G2_200_AAC NARR_G2_200_AC NARR_G2_200_AB

ID ARG_G1_50_AAA ARG_G1_50_AAC ARG_G1_50_AC ARG_G1_50_AB
ID ARG_G1_100_AAA ARG_G1_100_AAC ARG_G1_100_AC ARG_G1_100_AB
ID ARG_G1_150_AAA ARG_G1_150_AAC ARG_G1_150_AC ARG_G1_150_AB
ID ARG_G1_200_AAA ARG_G1_200_AAC ARG_G1_200_AC ARG_G1_200_AB

ID ARG_G2_50_AAA ARG_G2_50_AAC ARG_G2_50_AC ARG_G2_50_AB
ID ARG_G2_100_AAA ARG_G2_100_AAC ARG_G2_100_AC ARG_G2_100_AB
ID ARG_G2_150_AAA ARG_G2_150_AAC ARG_G2_150_AC ARG_G2_150_AB
ID ARG_G2_200_AAA ARG_G2_200_AAC ARG_G2_200_AC ARG_G2_200_AB

I need to arrange the columns of the merged data frame in these two orders:

SET 1 :

###Desired output 1:
NARR_G1_50_AAA, NARR_G2_50_AAA,
NARR_G1_50_AAC, NARR_G2_50_AAC,  
NARR_G1_50_AC, NARR_G2_50_AC, 
NARR_G1_50_AB, NARR_G2_50_AB,
ARG_G1_50_AAA, ARG_G2_50_AAA,
ARG_G1_50_AAC, ARG_G2_50_AAC,  
ARG_G1_50_AC, ARG_G2_50_AC, 
ARG_G1_50_AB, ARG_G2_50_AB........then with 100,150 and 200

SET 2 :

###Desired output 2:
NARR_G1_50_AAA, ARG_G1_50_AAA, 
NARR_G2_50_AAA, ARG_G2_50_AAA,  
NARR_G1_50_AAC, ARG_G1_50_AAC, 
NARR_G2_50_AAC, ARG_G2_50_AAC,
NARR_G1_50_AC, ARG_G1_50_AC, 
NARR_G2_50_AC, ARG_G2_50_AC,
NARR_G1_50_AB, ARG_G1_50_AB, 
NARR_G2_50_AB, ARG_G2_50_AB,........then with 100,150 and 200
  • I tried many things but could not get the desired order... the closest I got is:
dfPaired <- merged_DF %>%   ###still doesn't produce the desired output
  # dplyr::select(sort(names(.))) %>% 
    dplyr::select(order(gsub("G1", "G2", names(.))))

Question:

  • How can I get the desired orders (Set 1 and Set 2) without listing the columns manually in select()?

  • Further clarification:

SET 1:

I need to intercalate "G1" and "G2" within each variable (in increasing order: 50, then 100, then 150, then 200). For example: NARR_G1_50_AAA, NARR_G2_50_AAA... There are 4 per number (AAA, AAC, AC and AB).

SET 2:

I need to intercalate "NARR" and "ARG" for G1 and then for G2 (in increasing order: 50, then 100, then 150, then 200). For example: NARR_G1_50_AAA, ARG_G1_50_AAA... Thanks in advance :)


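If it should be a custom order, one option is to split the column names on _, convert the resulting parts to factor with levels specified in the order we want, and then sort on those parts.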

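# desired level order for each non-numeric part of the column names
lvls1 <- c("NARR", "ARG")
lvls2 <- c("G1", "G2")
lvls3 <- c("AAA", "AAC", "AC", "AB")
# v1 <- names(merged_DF)[-1] # assuming 'ID' is the first column; here v1 comes from the 'data' section below
# split each name on "_": V1 = NARR/ARG, V2 = G1/G2, V3 = number, V4 = suffix
d1 <- read.table(text = v1, header = FALSE, sep = "_")
# convert the non-numeric parts to factors with the custom level order
i1 <- !sapply(d1, is.numeric)
d1[i1] <- Map(factor, d1[i1], levels = list(lvls1, lvls2, lvls3))
# sort by number, then NARR/ARG, then suffix, then G1/G2 (desired output 1)
v2 <- v1[do.call(order, d1[c(3, 1, 4, 2)])]
library(dplyr)
merged_DF %>%
   select(ID, all_of(v2))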

where v2 is

> v2
 [1] "NARR_G1_50_AAA"  "NARR_G2_50_AAA"  "NARR_G1_50_AAC"  "NARR_G2_50_AAC"  "NARR_G1_50_AC"   "NARR_G2_50_AC"   "NARR_G1_50_AB"   "NARR_G2_50_AB"  
 [9] "ARG_G1_50_AAA"   "ARG_G2_50_AAA"   "ARG_G1_50_AAC"   "ARG_G2_50_AAC"   "ARG_G1_50_AC"    "ARG_G2_50_AC"    "ARG_G1_50_AB"    "ARG_G2_50_AB"   
[17] "NARR_G1_100_AAA" "NARR_G2_100_AAA" "NARR_G1_100_AAC" "NARR_G2_100_AAC" "NARR_G1_100_AC"  "NARR_G2_100_AC"  "NARR_G1_100_AB"  "NARR_G2_100_AB" 
[25] "ARG_G1_100_AAA"  "ARG_G2_100_AAA"  "ARG_G1_100_AAC"  "ARG_G2_100_AAC"  "ARG_G1_100_AC"   "ARG_G2_100_AC"   "ARG_G1_100_AB"   "ARG_G2_100_AB"  
[33] "NARR_G1_150_AAA" "NARR_G2_150_AAA" "NARR_G1_150_AAC" "NARR_G2_150_AAC" "NARR_G1_150_AC"  "NARR_G2_150_AC"  "NARR_G1_150_AB"  "NARR_G2_150_AB" 
[41] "ARG_G1_150_AAA"  "ARG_G2_150_AAA"  "ARG_G1_150_AAC"  "ARG_G2_150_AAC"  "ARG_G1_150_AC"   "ARG_G2_150_AC"   "ARG_G1_150_AB"   "ARG_G2_150_AB" 
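The ordering shown here corresponds to desired output 1. For desired output 2, a minimal sketch under the same approach would change only the priority of the sort keys: number first, then suffix, then G1/G2, then NARR/ARG. The name vector below is generated with expand.grid() as a hypothetical stand-in for names(merged_DF)[-1], and v2_set1 / v2_set2 are illustrative names that do not appear in the original answer.

```r
# Assumption: column names follow TYPE_GROUP_NUMBER_SUFFIX, as in the question.
lvls1 <- c("NARR", "ARG")
lvls2 <- c("G1", "G2")
lvls3 <- c("AAA", "AAC", "AC", "AB")

# Hypothetical stand-in for names(merged_DF)[-1]: every combination, shuffled.
grid <- expand.grid(type = lvls1, grp = lvls2, num = c(50, 100, 150, 200),
                    suffix = lvls3, stringsAsFactors = FALSE)
set.seed(1)
v1 <- sample(with(grid, paste(type, grp, num, suffix, sep = "_")))

# Split the names on "_" into four columns (V1..V4); V3 is read as numeric.
d1 <- read.table(text = v1, header = FALSE, sep = "_")
i1 <- !sapply(d1, is.numeric)
d1[i1] <- Map(factor, d1[i1], levels = list(lvls1, lvls2, lvls3))

# Set 1: sort by number, type, suffix, group.
v2_set1 <- v1[do.call(order, d1[c(3, 1, 4, 2)])]
# Set 2: sort by number, suffix, group, type.
v2_set2 <- v1[do.call(order, d1[c(3, 4, 2, 1)])]

head(v2_set2, 4)
# "NARR_G1_50_AAA" "ARG_G1_50_AAA" "NARR_G2_50_AAA" "ARG_G2_50_AAA"
```

Either vector can then be passed to select(ID, all_of(...)) in the same way as v2.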

data

# column names in random order; the code above sorts them (define v1 before running it)
v1 <- c("NARR_G1_100_AB", "NARR_G1_150_AAC", "NARR_G2_50_AB", "NARR_G1_150_AB", 
"NARR_G2_100_AAA", "NARR_G1_100_AAC", "ARG_G1_150_AC", "ARG_G2_50_AAA", 
"ARG_G2_150_AAA", "ARG_G1_50_AAA", "ARG_G2_100_AC", "NARR_G1_150_AAA", 
"NARR_G2_100_AC", "ARG_G1_50_AC", "NARR_G1_100_AAA", "ARG_G2_50_AB", 
"NARR_G1_150_AC", "ARG_G2_50_AAC", "ARG_G2_150_AB", "NARR_G2_100_AAC", 
"NARR_G2_150_AAA", "NARR_G1_100_AC", "ARG_G1_150_AB", "ARG_G1_50_AAC", 
"NARR_G1_50_AC", "ARG_G2_150_AAC", "NARR_G1_50_AAA", "NARR_G2_150_AB", 
"NARR_G2_150_AAC", "ARG_G1_150_AAA", "ARG_G2_50_AC", "NARR_G2_50_AC", 
"ARG_G1_150_AAC", "ARG_G1_100_AC", "ARG_G1_100_AAA", "NARR_G1_50_AAC", 
"NARR_G2_150_AC", "ARG_G1_100_AAC", "ARG_G2_100_AAA", "ARG_G2_100_AAC", 
"NARR_G1_50_AB", "NARR_G2_100_AB", "ARG_G2_100_AB", "ARG_G1_50_AB", 
"NARR_G2_50_AAA", "ARG_G1_100_AB", "ARG_G2_150_AC", "NARR_G2_50_AAC"
)
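To check the reordering end to end on an actual data frame, the following sketch may help. The data frame toy_DF and its values are made up for illustration; only the column-name pattern comes from the question. Base R subsetting is used in place of select() so the snippet has no package dependencies.

```r
lvls1 <- c("NARR", "ARG")
lvls2 <- c("G1", "G2")
lvls3 <- c("AAA", "AAC", "AC", "AB")

# Hypothetical toy data: 3 rows, numbers 50 and 100 only, columns scrambled.
grid <- expand.grid(type = lvls1, grp = lvls2, num = c(50, 100),
                    suffix = lvls3, stringsAsFactors = FALSE)
set.seed(42)
nms <- sample(with(grid, paste(type, grp, num, suffix, sep = "_")))
toy_DF <- data.frame(ID = 1:3,
                     matrix(0, nrow = 3, ncol = length(nms),
                            dimnames = list(NULL, nms)))

# Same steps as in the answer, applied to the toy data frame.
v1 <- names(toy_DF)[-1]  # assuming 'ID' is the first column
d1 <- read.table(text = v1, header = FALSE, sep = "_")
i1 <- !sapply(d1, is.numeric)
d1[i1] <- Map(factor, d1[i1], levels = list(lvls1, lvls2, lvls3))
v2 <- v1[do.call(order, d1[c(3, 1, 4, 2)])]

# Keep ID first, then the remaining columns in the sorted order.
reordered <- toy_DF[c("ID", v2)]
names(reordered)[1:5]
# "ID" "NARR_G1_50_AAA" "NARR_G2_50_AAA" "NARR_G1_50_AAC" "NARR_G2_50_AAC"
```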