I am using tweepy to mine the public tweet stream for keywords. This is pretty straightforward and has been described in several places:
http://runnable.com/Us9rrMiTWf9bAAW3/how-to-stream-data-from-twitter-with-tweepy-for-python
http://adilmoujahid.com/posts/2014/07/twitter-analytics/
Copying the code from the second link:
#Import the necessary methods from tweepy library
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

#Variables that contain the user credentials to access Twitter API
access_token = "ENTER YOUR ACCESS TOKEN"
access_token_secret = "ENTER YOUR ACCESS TOKEN SECRET"
consumer_key = "ENTER YOUR API KEY"
consumer_secret = "ENTER YOUR API SECRET"

#This is a basic listener that just prints received tweets to stdout.
class StdOutListener(StreamListener):

    def on_data(self, data):
        print(data)
        return True

    def on_error(self, status):
        print(status)

if __name__ == '__main__':

    #This handles Twitter authentication and the connection to Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)

    #This line filters Twitter Streams to capture data by the keywords: 'python', 'javascript', 'ruby'
    stream.filter(track=['python', 'javascript', 'ruby'])
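What I can't figure out is how to stream this data into a Python variable instead of printing it to the screen. I'm working in an IPython notebook and want to capture the stream in some variable, foo, after streaming for a minute or so. Furthermore, how do I get the stream to time out? It runs indefinitely in this manner.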
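To make the goal concrete, here is a minimal sketch of one way this might be approached. It assumes the pre-4.0 tweepy API used above, where (as far as I understand it) returning False from on_data tells the stream to disconnect. The names CollectingListener, time_limit and tweets are hypothetical and not part of tweepy, and the fallback to object exists only so the class can be exercised without tweepy installed.

```python
import time

try:
    # Available in tweepy releases before 4.0, as in the code above.
    from tweepy.streaming import StreamListener
except ImportError:
    # Assumption: fall back to a plain base class so the sketch still runs
    # (and can be tested) when that tweepy API is not available.
    StreamListener = object


class CollectingListener(StreamListener):
    """Hypothetical listener that stores raw tweets instead of printing them."""

    def __init__(self, time_limit=60):
        super(CollectingListener, self).__init__()
        # The clock starts when the listener is created, not when the
        # stream connects, so create it right before calling filter().
        self.start_time = time.time()
        self.time_limit = time_limit  # seconds
        self.tweets = []              # raw JSON strings, one per message

    def on_data(self, data):
        # Once the time limit has passed, return False to ask the
        # stream to stop (assumed pre-4.0 tweepy behaviour).
        if time.time() - self.start_time > self.time_limit:
            return False
        self.tweets.append(data)
        return True

    def on_error(self, status):
        print(status)
        return False  # stop on errors rather than retrying forever
```

With the credentials and imports from the code above, usage would presumably look like this: create listener = CollectingListener(time_limit=60), pass it to Stream(auth, listener), call stream.filter(track=['python', 'javascript', 'ruby']), and once filter() returns, read foo = listener.tweets. Note that the time check only runs when a message arrives, so a very quiet stream could run past the limit.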
Related:
Using tweepy to access Twitter's Streaming API https://stackoverflow.com/questions/10970550/using-tweepy-to-access-twitters-streaming-api?rq=1