(The related question https://stackoverflow.com/questions/6308933/r-concatenate-row-wise-across-specific-columns-of-dataframe does not cover sorting. When you don't need sorting, you can simply use paste.)
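I have a suboptimally structured table whose character columns are generically named "item1", "item2", etc. I want to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. For example, in row 5, if item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 would be "butter, eggs, milk".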
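I wrote a function f() below that works on two character variables. However, I am having trouble with:
- Using mapply or other "vectorization" (I know that's really just a for loop)
- Generalizing the function to an arbitrary number of columns
Any help is much appreciated.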
df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"))
paste(df$a, df$b, sep = ", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"
f <- function(a, b) paste(c(a, b)[order(c(a, b))], collapse = ", ")
f("foo","baz")
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?
df$new_var <- mapply(f, df$a, df$b)
df
#     a   b new_var    <- new_var is not what I want
# 1 foo baz    1, 2
# 2 bar qux    1, 2
# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a = c("foo", "bar"),
                 b = c("baz", "qux"))
dt[, new_var := mapply(f, a, b)]
dt
#      a   b  new_var    <- new_var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
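
A possible sketch of the generalization, offered under some assumptions rather than as a definitive answer. The "1, 2" output above is consistent with data.frame() having converted the strings to factors (the default before R 4.0), so that c(a, b) combined integer codes instead of text; data.table() keeps strings as character, which would explain why it worked there. The helper name sort_paste and the cols vector are hypothetical, introduced only for illustration. Note that sort() drops NA values by default and orders according to the current locale.

```r
# Assumption: the item columns should be character, not factor.
df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: sort any number of values and join them with ", ".
sort_paste <- function(...) paste(sort(c(...)), collapse = ", ")

# Columns to combine; extend this vector for item3, item4, and so on.
cols <- c("a", "b")

# do.call() hands every selected column to mapply(), so the call does not
# need to change with the number of columns. unname() avoids clashes
# between column names and mapply()'s own argument names.
df$new_var <- do.call(
  mapply,
  c(list(FUN = sort_paste, USE.NAMES = FALSE), unname(as.list(df[cols])))
)

# A row-wise alternative with apply(); it converts the selected columns
# to a character matrix first.
new_var_apply <- unname(apply(df[cols], 1, function(x) paste(sort(x), collapse = ", ")))

df$new_var
# [1] "baz, foo" "bar, qux"
```

Both variants still loop over rows internally, so they are conveniences rather than true vectorization.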