pipeline.py code:
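# Imports assumed for the old Scrapy API used here (not shown in the original post)
from datetime import datetime
from scrapy import log, signals
from scrapy.xlib.pydispatch import dispatcher

class Examplepipeline(object):
    def __init__(self):
        # Register handlers for the spider lifecycle signals
        dispatcher.connect(self.spider_opened, signal=signals.spider_opened)
        dispatcher.connect(self.spider_closed, signal=signals.spider_closed)

    def spider_opened(self, spider):
        log.msg("opened spider %s at time %s" % (spider.name, datetime.now().strftime('%H-%M-%S')))

    def process_item(self, item, spider):
        log.msg("Processing item " + item['title'], level=log.DEBUG)
        return item  # pass the item on to any later pipeline

    def spider_closed(self, spider):
        log.msg("closed spider %s at %s" % (spider.name, datetime.now().strftime('%H-%M-%S')))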
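The pipeline code above logs the time the spider starts and the time it finishes. Now, once the spider has finished, I would like to receive a "scraping completed" email from Scrapy. Is that possible? If so, can the code go in the spider_closed method? Could anyone share some sample code showing how to do this?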
Have you looked at the documentation?
http://doc.scrapy.org/en/latest/topics/email.html
Basic usage from the docs:
from scrapy.mail import MailSender
mailer = MailSender()
mailer.send(to=["someone@example.com"], subject="Some subject", body="Some body", cc=["another@example.com"])
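To tie this back to the question, the same send() call can be made from a spider_closed handler. The following is only a minimal sketch, not the asker's code: it assumes a more recent Scrapy version in which components are created through from_crawler and signals are connected via crawler.signals, and the class name MailOnClosePipeline, the recipient address and the message wording are all made up for illustration.

```python
from datetime import datetime


class MailOnClosePipeline(object):
    """Item pipeline that mails a short notice when the spider closes.

    Hypothetical example: the class name, recipient and wording are
    illustrative and do not come from the original post.
    """

    def __init__(self, mailer, recipients):
        self.mailer = mailer          # any object with a MailSender-like send()
        self.recipients = recipients  # list of addresses to notify

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the class has no import-time dependency on Scrapy.
        from scrapy import signals
        from scrapy.mail import MailSender

        # MailSender() with no arguments, as in the docs snippet above;
        # a real project would normally configure the SMTP host and login.
        pipeline = cls(MailSender(), ["someone@example.com"])
        crawler.signals.connect(pipeline.spider_closed,
                                signal=signals.spider_closed)
        return pipeline

    def process_item(self, item, spider):
        # Nothing to do per item; hand it on unchanged.
        return item

    def spider_closed(self, spider):
        subject = "Scraping completed: %s" % spider.name
        body = "Spider %s closed at %s" % (
            spider.name, datetime.now().strftime("%H-%M-%S"))
        # Return whatever send() gives back, so the caller can wait on it
        # if it is a Deferred.
        return self.mailer.send(to=self.recipients, subject=subject, body=body)
```

The pipeline would still have to be enabled in the project's ITEM_PIPELINES setting. Passing the mailer into the constructor is a design choice that keeps the handler easy to exercise with a stand-in object instead of a real SMTP server.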
You can also implement something custom yourself. For example, if you want to use Gmail:
def send_mail(self, message, title):
    # Send a plain-text message through Gmail's SMTP server
    print("Sending mail...........")
    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    gmailUser = 'your_account@gmail.com'  # placeholder: the address was lost in the original post
    gmailPassword = 'password'
    recipient = 'mail_to_send_to'
    msg = MIMEMultipart()
    msg['From'] = gmailUser
    msg['To'] = recipient
    msg['Subject'] = title
    msg.attach(MIMEText(message))
    mailServer = smtplib.SMTP('smtp.gmail.com', 587)
    mailServer.ehlo()
    mailServer.starttls()  # upgrade the connection to TLS before logging in
    mailServer.ehlo()
    mailServer.login(gmailUser, gmailPassword)
    mailServer.sendmail(gmailUser, recipient, msg.as_string())
    mailServer.close()
    print("Mail sent")
And call it like this (from inside the same class, for example in spider_closed):
self.send_mail("some message", "Scraper Report")
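One possible refinement of the Gmail approach is to separate building the message from sending it, so the message can be inspected without a network connection. This is a sketch under assumptions, not part of the original answer: it assumes Python 3, the function names build_report and send_report are hypothetical, and the addresses and password are placeholders.

```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report(sender, recipient, title, message):
    """Assemble the e-mail without touching the network."""
    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = title
    msg.attach(MIMEText(message))
    return msg


def send_report(msg, user, password, host="smtp.gmail.com", port=587):
    """Send an already built message over SMTP, upgrading to TLS first."""
    # The with-block closes the connection even if a step fails.
    with smtplib.SMTP(host, port) as server:
        server.ehlo()
        server.starttls()
        server.ehlo()
        server.login(user, password)
        server.sendmail(msg["From"], [msg["To"]], msg.as_string())


# Possible use inside spider_closed (placeholder credentials):
#   msg = build_report("your_account@gmail.com", "mail_to_send_to",
#                      "Scraper Report", "closed spider %s" % spider.name)
#   send_report(msg, "your_account@gmail.com", "password")
```

Keeping the SMTP step in its own function also means the host and port can be swapped for another provider without changing how the report is built.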