If temp is your data set, then do:
library(data.table)
setDT(temp)[order(-prob), list(paper_id = paste0(paper_id, collapse=", ")), by = author_id]
## author_id paper_id
## 1: 731 24943, 24943, 688974, 1267992, 1201905, 964345
## 2: 736 6889, 1201905, 126992, 94345, 249
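To try this without the original data, here is a minimal sketch on a toy data set. The column names follow the answer; the values are made up purely for illustration.

```r
library(data.table)

# Hypothetical toy data: column names as in the answer, values invented.
temp <- data.frame(
  author_id = c(731, 731, 736, 736, 731),
  paper_id  = c(101, 102, 201, 202, 103),
  prob      = c(0.2, 0.9, 0.5, 0.7, 0.4)
)

# Sort rows by descending prob, then collapse paper_id within each author.
res <- setDT(temp)[order(-prob),
                   list(paper_id = paste0(paper_id, collapse = ", ")),
                   by = author_id]
print(res)
```

Within each author, the paper IDs come out in order of decreasing prob, because the rows are sorted before the grouping step runs.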
Edit: August 11, 2014
Since data.table v >= 1.9.4, you can use the very efficient setorder instead of order:
str(temp)
setorder(setDT(temp), -prob)[, list(paper_id = paste0(paper_id, collapse=", ")), by = author_id]
## author_id paper_id
## 1: 731 24943, 24943, 688974, 1267992, 1201905, 964345
## 2: 736 6889, 1201905, 126992, 94345, 249
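One thing worth knowing about this version: setDT and setorder both work by reference, so temp itself is converted and reordered. If the original row order matters, a sketch like the following (made-up values again) sorts a copy instead.

```r
library(data.table)

# Hypothetical toy data, invented for illustration.
temp <- data.table(
  author_id = c(731, 736, 731),
  paper_id  = c(101, 201, 102),
  prob      = c(0.2, 0.5, 0.9)
)

# copy() makes a real copy, so setorder() reorders the copy, not temp.
sorted <- setorder(copy(temp), -prob)

res <- sorted[, list(paper_id = paste0(paper_id, collapse = ", ")),
              by = author_id]
print(res)
print(temp$prob)  # temp keeps its original row order
```

The copy costs extra memory, so on a large table sorting in place is usually the better choice when the original order is not needed.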
By the way, this whole thing can also be done easily in base R (though it is not recommended for big data sets):
aggregate(paper_id ~ author_id, temp[order(-temp$prob), ], paste, collapse = ", ")
# author_id paper_id
# 1 731 24943, 24943, 688974, 1267992, 1201905, 964345
# 2 736 6889, 1201905, 126992, 94345, 249
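The base R call can be checked the same way on a small made-up data frame. This sketch assumes only the three column names used in the answer.

```r
# Hypothetical toy data, invented for illustration.
temp <- data.frame(
  author_id = c(731, 731, 736, 736, 731),
  paper_id  = c(101, 102, 201, 202, 103),
  prob      = c(0.2, 0.9, 0.5, 0.7, 0.4)
)

# Reorder the rows by descending prob first; aggregate() then pastes
# paper_id together within each author_id, keeping that row order.
res <- aggregate(paper_id ~ author_id,
                 temp[order(-temp$prob), ],
                 paste, collapse = ", ")
print(res)
```

Unlike the data.table versions, aggregate returns the groups sorted by author_id rather than in order of first appearance.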