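This problem is also known as "converting a 'start-end' dataset to a panel dataset".
I have a data frame containing the names of US presidents (the "name" column) and the years their terms started and ended (the "from" and "to" columns). Here is a sample: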
presidents <- data.frame(
name = c("Bill Clinton", "George W. Bush", "Barack Obama"),
from = c(1993, 2001, 2009),
to = c(2001, 2009, 2012)
)
presidents
#> name from to
#> 1 Bill Clinton 1993 2001
#> 2 George W. Bush 2001 2009
#> 3 Barack Obama 2009 2012
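I want to create a data frame with two columns ("name" and "year"), with one row for each year a president was in office. So for each president I need to create a regular sequence with every year from "from" to "to". Here is my expected output: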
name year
Bill Clinton 1993
Bill Clinton 1994
...
Bill Clinton 2000
Bill Clinton 2001
George W. Bush 2001
George W. Bush 2002
...
George W. Bush 2008
George W. Bush 2009
Barack Obama 2009
Barack Obama 2010
Barack Obama 2011
Barack Obama 2012
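I know I can use data.frame(name = "Bill Clinton", year = seq(1993, 2001)) to expand things for a single president, but I can't figure out how to iterate over every president.
How do I do this? I feel like I should know this, but I'm drawing a blank.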
Update 1
OK, I've tried both solutions, and I'm getting an error:
library(plyr)  # provides ddply() and summarise()
foo <- structure(list(name = c("Grover Cleveland", "Benjamin Harrison", "Grover Cleveland"), from = c(1885, 1889, 1893), to = c(1889, 1893, 1897)), .Names = c("name", "from", "to"), row.names = 22:24, class = "data.frame")
ddply(foo, "name", summarise, year = seq(from, to))
Error in seq.default(from, to) : 'from' must be of length 1
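A possible explanation, offered as a sketch rather than a confirmed diagnosis: "Grover Cleveland" appears twice in foo, so grouping by "name" puts two rows into one group, and seq() then receives "from" and "to" vectors of length 2. The base R snippet below rebuilds the same data with data.frame() for brevity, checks the group sizes, and expands one row at a time instead of one name at a time. The name `expanded` is only illustrative.

```r
# Same values as foo above, rebuilt with data.frame() for brevity.
foo <- data.frame(
  name = c("Grover Cleveland", "Benjamin Harrison", "Grover Cleveland"),
  from = c(1885, 1889, 1893),
  to   = c(1889, 1893, 1897),
  stringsAsFactors = FALSE
)

# Count rows per name: "Grover Cleveland" occurs twice, so a
# per-name group hands seq() two start years and two end years.
table(foo$name)

# Expand row by row, so each call to seq() sees exactly one
# start year and one end year, then stack the pieces.
expanded <- do.call(rbind, lapply(seq_len(nrow(foo)), function(i) {
  data.frame(name = foo$name[i], year = seq(foo$from[i], foo$to[i]))
}))

expanded
```

Iterating over rows rather than names keeps the two non-consecutive Cleveland terms separate, which grouping by "name" alone cannot do.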