I currently have some data that is basically a factor and a date. Here is a simplified version of it.
date <- c(1901,1901,1901,1902,1902,1902,1901,1903,1902,1904,1902,1903,1903,1904,1905, 1901,1903,1902,1904,1902,1902,1903,1904,1902,1902,1901,1903,1903,1904,1905, 1905,1906,1907,1908,1901,1908,1907,1905,1906,1902,1903,1903,1903,1904,1905,1901,1901,1901,1902,1902,1902,1901,1903,1902,1904,1902,1903,1903,1904,1905,
1901,1903,1902,1904,1902,1902,1903,1904,1902,1902,1901,1903,1903,1904,1905,
1905,1906,1907,1908,1901,1908,1907,1905,1906,1902,1903,1903,1903,1904,1905,
1905,1906,1907,1908,1901,1908,1907,1920,1920,1920,1921,1921,1921,1921,1921)
genre <- sample(c("fiction","nonfiction"),105,replace=TRUE)
data <- as.data.frame(cbind(date,genre))
# I know this is not an ideal way to coerce to a numeric
data$date <- as.numeric(as.character(data$date))
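So far, so good. If you plot it, though, you will notice that there is a large gap in the data and that the line is drawn straight across that gap, hiding it. This plot illustrates the problem.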
library(ggplot2)
ggplot(data,aes(x=date,color=genre)) + geom_line(stat='count')
I have seen this post, https://stackoverflow.com/questions/14821064/line-break-when-no-data-in-ggplot2, which suggests adding a group, and I can do that.
data$group <- ifelse(data$date < 1910,1,2)
ggplot(data,aes(x=date,color=genre,group=group)) + geom_line(stat='count')
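So it seems impossible to keep the color aesthetic I want and also specify a group while using stat='count'. For example, this plot shows the gap in the data nicely, but it loses the color distinction based on the genre factor: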
ggplot(data,aes(x=date,color=genre,group=group)) + geom_line(stat='count')
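So, is this impossible? Am I missing something? Is there a better way to do this, or do I need to summarize or otherwise transform my dates so that I am not relying on stat='count' at the plotting stage?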
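One direction that might be worth trying, offered as a minimal sketch rather than a confirmed fix: group on the combination of genre and the gap group, so that color stays constant within each group and stat='count' has no reason to drop it. The data below is a small deterministic stand-in for the sample above (an assumption made only to keep the example reproducible), and p is just an illustrative name.

```r
library(ggplot2)

# Hypothetical stand-in for the data above: five rows per year,
# with nothing recorded between 1908 and 1920.
date  <- c(rep(1901:1908, each = 5), rep(1920:1921, each = 5))
genre <- rep(c("fiction", "nonfiction"), length.out = length(date))
data  <- data.frame(date = date, genre = genre)

# Same split as in the question: one group per continuous stretch of years.
data$group <- ifelse(data$date < 1910, 1, 2)

# Group on genre AND stretch. Each genre then gets its own line within
# each stretch, so the line breaks at the gap and the color is kept.
p <- ggplot(data, aes(x = date, color = genre,
                      group = interaction(genre, group))) +
  geom_line(stat = "count")

print(p)
```

If this behaves as expected, the same interaction(genre, group) grouping should also carry over to a pre-summarized table of counts drawn with a plain geom_line(), should you prefer not to rely on stat='count'.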