Two things come to mind with your code: first, whether the mailer code is executing at all, and second, the smtpuser parameter should be populated.
Below is working code for sending email from Scrapy through Gmail. This answer has four parts: the email code, a complete example, logging, and the Gmail configuration. The complete example is included because a few pieces need to be coordinated for this to work.
Email code
To have Scrapy send an email, you can add the following to your Spider class (a complete example follows in the next section). These additions make Scrapy send an email once the crawl has finished.
There are two pieces of code to add: the first imports the modules, and the second sends the email.
Import the modules:
from scrapy import signals
from scrapy.mail import MailSender
In your Spider class definition:
class MySpider(Spider):

    <SPIDER CODE>

    @classmethod
    def from_crawler(cls, crawler):
        spider = cls()
        crawler.signals.connect(spider.spider_closed, signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        # Replace the placeholder addresses and password with your own details.
        mailer = MailSender(
            mailfrom="your.account@gmail.com",
            smtphost="smtp.gmail.com",
            smtpport=587,
            smtpuser="your.account@gmail.com",
            smtppass="MySecretPassword",
        )
        return mailer.send(
            to=["another.address@gmail.com"],
            subject="Some subject",
            body="Some body",
        )
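As an alternative to hard-coding the credentials in the spider, MailSender also provides a from_settings() classmethod, so the SMTP details can live in the project's settings.py instead. A minimal sketch of the relevant settings (all values shown are placeholders):

```python
# settings.py -- mail settings read by MailSender.from_settings()
MAIL_FROM = "your.account@gmail.com"   # placeholder address
MAIL_HOST = "smtp.gmail.com"
MAIL_PORT = 587
MAIL_USER = "your.account@gmail.com"   # placeholder address
MAIL_PASS = "MySecretPassword"         # placeholder password
MAIL_TLS = True                        # Gmail on port 587 requires STARTTLS
```

The handler would then build the mailer with MailSender.from_settings(crawler.settings) (the crawler object is available in from_crawler) instead of passing each argument explicitly.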
Complete example
Putting it all together, this example uses the dirbot example project located at:
https://github.com/scrapy/dirbot
Only one file needs to be edited:
./dirbot/spiders/dmoz.py
Here is the entire working file, with the imports near the top and the email code at the end of the spider class:
from scrapy.spider import Spider
from scrapy.selector import Selector

from dirbot.items import Website

from scrapy import signals
from scrapy.mail import MailSender


class DmozSpider(Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        """
        The lines below is a spider contract. For more info see:
        http://doc.scrapy.org/en/latest/topics/contracts.html

        @url http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/
        @scrapes name
        """
        sel = Selector(response)
        sites = sel.xpath('//ul[@class="directory-url"]/li')
        items = []

        for site in sites:
            item = Website()
            item['name'] = site.xpath('a/text()').extract()
            item['url'] = site.xpath('a/@href').extract()
            item['description'] = site.xpath('text()').re('-\s[^\n]*\\r')
            items.append(item)

        return items

    @classmethod
    def from_crawler(cls, crawler):
        spider = cls()
        crawler.signals.connect(spider.spider_closed, signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        # Replace the placeholder addresses and password with your own details.
        mailer = MailSender(
            mailfrom="your.account@gmail.com",
            smtphost="smtp.gmail.com",
            smtpport=587,
            smtpuser="your.account@gmail.com",
            smtppass="MySecretPassword",
        )
        return mailer.send(
            to=["another.address@gmail.com"],
            subject="Some subject",
            body="Some body",
        )
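One possible refinement to the spider_closed handler above: the spider_closed signal also passes a reason argument ("finished" for a normal shutdown, "cancelled" or "shutdown" otherwise), so the handler could skip the email for interrupted crawls. A minimal sketch of that check, kept free of Scrapy so it runs standalone:

```python
# Decide whether a crawl's close reason warrants the notification email.
# Scrapy passes reason="finished" when a crawl runs to completion.
def should_send_mail(reason):
    return reason == "finished"
```

To use it, the handler signature would become def spider_closed(self, spider, reason): and the send call would be wrapped in if should_send_mail(reason):.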
After updating this file, run the standard crawl command from the project directory to crawl and send the email:
$ scrapy crawl dmoz
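Since the example embeds the password in source, one common variation (my own suggestion, not something Scrapy requires) is to read the credentials from environment variables before constructing MailSender. The variable names GMAIL_USER and GMAIL_PASS below are made up for illustration:

```python
import os

# Hypothetical helper: pull the Gmail credentials from the environment so
# the password is not stored in the spider's source code. GMAIL_USER and
# GMAIL_PASS are illustrative names, not Scrapy settings.
def gmail_credentials():
    user = os.environ["GMAIL_USER"]
    password = os.environ["GMAIL_PASS"]
    return user, password
```

The returned values would then be passed as the smtpuser and smtppass arguments (and typically mailfrom as well) when building the MailSender.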
Logging
By returning the output of the mailer.send call from the spider_closed method, Scrapy automatically adds the result to its log. Here are examples of success and failure:
Success log message:
2015-03-22 23:24:30-0000 [scrapy] INFO: Mail sent OK: To=['another.address@gmail.com'] Cc=None Subject="Some subject" Attachs=0
Error log message - unable to connect:
2015-03-22 23:39:45-0000 [scrapy] ERROR: Unable to send mail: To=['another.address@gmail.com'] Cc=None Subject="Some subject" Attachs=0 - Unable to connect to server.
Error log message - authentication failure:
2015-03-22 23:38:29-0000 [scrapy] ERROR: Unable to send mail: To=['another.address@gmail.com'] Cc=None Subject="Some subject" Attachs=0 - 535 5.7.8 Username and Password not accepted. Learn more at 5.7.8 http://support.google.com/mail/bin/answer.py?answer=14257 sb4sm6116233pbb.5 - gsmtp
Gmail configuration
To configure Gmail to accept email this way, you need to enable "Access for less secure apps", which you can do while logged in to the account at the following URL:
https://www.google.com/settings/security/lesssecureapps