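I have a data frame containing participants' judgments of two texts. Assume that each text has a correct answer and an identifier, and that each text is judged multiple times.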
set.seed(123)
wide_df <- data.frame(
  participant_id = LETTERS[1:12],
  judgment_1 = round(rnorm(12) * 100),
  correct_1 = round(rnorm(12) * 100),
  text_id_1 = sample(1:12, 12, replace = FALSE),
  judgment_2 = round(rnorm(12) * 100),
  correct_2 = round(rnorm(12) * 100),
  text_id_2 = sample(13:24, 12, replace = FALSE)
)
This gives:
participant_id judgment_1 correct_1 text_id_1 judgment_2 correct_2 text_id_2
1 A -56 40 4 43 -127 17
2 B -23 11 10 -30 217 14
3 C 156 -56 1 90 121 22
4 D 7 179 12 88 -112 15
5 E 13 50 7 82 -40 13
...
I would like to convert it to long format with these columns:
participant_id text_id judgment correct
A 4 -56 40
A 17 43 -127
...
I found and followed the Stack Overflow suggestion here: https://stackoverflow.com/a/24151902/3421089
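library(dplyr)  # %>% and arrange()
library(tidyr)  # gather(), separate(), spread()

wide_df %>%
  gather(v, value, judgment_1:text_id_2) %>%  # stack all measure columns
  separate(v, c("var", "col")) %>%            # split the column name into two parts
  arrange(participant_id) %>%
  spread(col, value)                          # this is the step that fails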
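But reshaping this way returns the error: Error: Duplicate identifiers for rows (3, 6), (9, 12)
I think I am doing something conceptually wrong, but I cannot quite pin it down. Where is my mistake? Thanks!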
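
A possible diagnosis, offered as a sketch rather than a definitive answer: by default `separate()` splits on every non-alphanumeric character, so `text_id_1` and `text_id_2` would each break into three pieces (`text`, `id`, and the number), and with only two target columns the number is dropped. Both columns then collapse to the same `var`/`col` pair, which would explain the duplicate identifiers. The code below first inspects that intermediate result, then reshapes with `pivot_longer()` instead of the `gather()`/`spread()` approach from the linked answer. It assumes tidyr 1.0.0 or later; the names `pieces`, `long_df`, and `set` are illustrative and not part of the original code.

```r
library(dplyr)
library(tidyr)

set.seed(123)
wide_df <- data.frame(
  participant_id = LETTERS[1:12],
  judgment_1 = round(rnorm(12) * 100),
  correct_1 = round(rnorm(12) * 100),
  text_id_1 = sample(1:12, 12, replace = FALSE),
  judgment_2 = round(rnorm(12) * 100),
  correct_2 = round(rnorm(12) * 100),
  text_id_2 = sample(13:24, 12, replace = FALSE)
)

# Step 1: look at what separate() makes of the column names.
# Expect a warning about discarded pieces for the text_id_* rows,
# which both end up as var = "text", col = "id".
pieces <- wide_df %>%
  gather(v, value, -participant_id) %>%
  separate(v, c("var", "col")) %>%
  distinct(var, col)
print(pieces)

# Step 2: reshape by splitting only at the final "_<digits>" suffix.
# ".value" tells pivot_longer() to use the first capture group
# (judgment, correct, text_id) as the output column names;
# "set" (hypothetical name) holds the trailing 1 or 2.
long_df <- wide_df %>%
  pivot_longer(
    cols = -participant_id,
    names_to = c(".value", "set"),
    names_pattern = "(.*)_(\\d+)$"
  ) %>%
  select(participant_id, text_id, judgment, correct)

print(head(long_df))
```

Anchoring the pattern at the trailing digits keeps the underscore inside `text_id` intact, so each participant should end up with one row per text.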