How can I batch-send a multipart HTTP POST that contains multiple URLs?

2023-11-21

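I am talking to the Gmail API and would like to batch my requests. They have a friendly guide for this here, https://developers.google.com/gmail/api/guides/batch, which suggests that I should be able to use multipart/mixed and include different URLs.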

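I am using Python and the Requests library, but I am not sure how to issue the different URLs. Answers like "How to send a 'multipart/form-data' with requests in Python?" do not mention an option for changing that part.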

How can I do this?

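Unfortunately, requests does not support multipart/mixed in its API. This has been suggested in several GitHub issues (#935 and #1081), but there have been no updates on it so far. It also becomes quite clear if you search the requests source for "mixed" and get zero results :(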

Now you have several options, depending on how much you want to rely on Python and third-party libraries.

Google API Client

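Now, the most obvious answer to this problem is to use the official Python API client provided by Google. It comes with a BatchHttpRequest class that can handle the batch requests you need. This is documented in detail in its guide.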

Essentially, you create a BatchHttpRequest object and add all of your requests to it. The library will then put everything together (taken from the guide above):

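from googleapiclient.http import BatchHttpRequest

# service, http and the two callbacks are assumed to exist already
batch = BatchHttpRequest()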
batch.add(service.animals().list(), callback=list_animals)
batch.add(service.farmers().list(), callback=list_farmers)
batch.execute(http=http)
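The callbacks passed to batch.add() are invoked once per sub-request. Below is a minimal sketch of what list_animals and list_farmers might look like, assuming the (request_id, response, exception) callback signature documented for google-api-python-client. The results dictionary and the sample calls at the bottom are made up for illustration and nothing here contacts Google:

```python
# Hypothetical callbacks for the batch above. The library is assumed to call
# each one with the request id, the parsed response, and an exception (or None).
results = {}

def list_animals(request_id, response, exception):
    if exception is not None:
        # record the failure for this sub-request only
        results[request_id] = ('error', str(exception))
    else:
        results[request_id] = ('animals', response)

def list_farmers(request_id, response, exception):
    if exception is not None:
        results[request_id] = ('error', str(exception))
    else:
        results[request_id] = ('farmers', response)

# Simulate the calls the library would make after batch.execute() (made-up data).
list_animals('1', {'items': ['pony', 'sheep']}, None)
list_farmers('2', None, RuntimeError('403 Forbidden'))
print(results)
```

Keeping the error handling inside each callback means one failing sub-request is recorded without hiding the results of the others.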

Now, if for some reason you cannot or will not use the official Google library, you will have to build parts of the request body yourself.

Requests + email.mime

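As I already mentioned, requests does not officially support multipart/mixed. But that does not mean we cannot "force" it. When creating a Request object, we can use the files parameter to provide multipart data.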
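To see what the files parameter produces before bending it to multipart/mixed, here is a small sketch that only prepares a request and prints it. The host example.invalid, the field name part1 and the payload are placeholders chosen for this illustration, and the request is never sent:

```python
import requests

# example.invalid is a placeholder host; the request is prepared, not sent.
req = requests.Request(
    'POST',
    'http://example.invalid/upload',
    files={
        # name -> (filename, content or file object, content_type, extra headers)
        'part1': ('', 'hello world', 'text/plain', {}),
    },
)
prepared = req.prepare()

# requests chooses multipart/form-data and a random boundary on its own
print(prepared.headers['Content-Type'])
# the encoded body is bytes; every part carries its own header lines
print(prepared.body.decode('utf-8'))
```

The printed body shows the Content-Disposition and Content-Type lines that requests writes for each part, which is what has to be dealt with later to get a multipart/mixed body.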

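files is a dictionary whose values are 4-tuples of the form (filename, file_object, content_type, headers). The filename can be empty. Now we need to convert a Request object into a file(-like) object. I wrote a small method that covers the basic examples from the Google guide. It is partly inspired by the internal methods that Google uses in its Python library: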

import requests
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart

BASE_URL = 'http://www.googleapis.com/batch'

def serialize_request(request):
    '''Returns the UTF-8 encoded string representation of the request'''
    prepared = request.prepare()

    # write first line (method + uri); drop the batch base URL if present
    if request.url.startswith(BASE_URL):
        mime_body = '%s %s\r\n' % (request.method, request.url[len(BASE_URL):])
    else:
        mime_body = '%s %s\r\n' % (request.method, request.url)

    # write headers (if there are any); items() works on Python 2 and 3
    for key, value in prepared.headers.items():
        mime_body += '%s: %s\r\n' % (key, value)

    # write the body; on Python 3 requests may return it as bytes (e.g. json=...)
    body = getattr(prepared, 'body', None)
    if body is not None:
        if isinstance(body, bytes):
            body = body.decode('utf-8')
        mime_body += '\r\n' + body + '\r\n'

    return mime_body.encode('utf-8').lstrip()

This method turns a requests.Request object into a UTF-8 encoded string that can later be used as the payload of a MIMENonMultipart object, i.e. one of the individual parts.

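Now, to generate the actual batch request, we first need to squeeze the list of (Google API) requests into a files dictionary for the requests library. The following method takes a list of requests.Request objects, converts each one into a MIMENonMultipart, and then returns a dictionary that matches the structure of the files dictionary: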

import uuid

def prepare_requests(request_list):
    output = {}

    for request in request_list:
        message_id = new_id()
        sub_message = MIMENonMultipart('application', 'http')
        sub_message['Content-ID'] = message_id
        del sub_message['MIME-Version']

        # set_payload() followed by as_string() needs text, not bytes, on Python 3
        payload = serialize_request(request)
        if isinstance(payload, bytes):
            payload = payload.decode('utf-8')
        sub_message.set_payload(payload)

        # Flatten with CRLF line endings. On Python 3 as_string() writes no
        # "From ..." envelope line, so nothing has to be cut off; the leading
        # blank line that the old slicing left behind is added explicitly.
        crlf_policy = sub_message.policy.clone(linesep='\r\n')
        part = '\r\n' + sub_message.as_string(policy=crlf_policy)

        output[message_id] = ('', part, 'application/http', {})

    return output

def new_id():
    # I am not sure how these work exactly, so you will have to adapt this code
    return '<item%s:[email protected]>' % str(uuid.uuid4())[-4:]
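If hand-rolling the Content-ID feels too ad hoc, the standard library can generate identifiers in the same angle-bracket form. This is only a sketch: new_content_id and the domain batch.example.com are hypothetical names, and whether the Google endpoint has any expectations about the format is an assumption I have not verified:

```python
from email.utils import make_msgid

def new_content_id(index):
    # make_msgid() returns a string shaped like
    # '<timestamp.pid.random.item1@batch.example.com>'
    return make_msgid(idstring='item%d' % index, domain='batch.example.com')

ids = [new_content_id(i) for i in range(3)]
for content_id in ids:
    print(content_id)
```

Because the value embeds a timestamp and random bits, collisions within one batch are very unlikely, which matters when each response part has to be matched back to its request.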

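Finally, we need to change the Content-Type from multipart/form-data to multipart/mixed and remove the Content-Disposition and Content-Type headers from each request part. These are generated by requests and cannot be overridden through the files dictionary.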

import re

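def finalize_request(prepared):
    # change to multipart/mixed
    old = prepared.headers['Content-Type']
    prepared.headers['Content-Type'] = old.replace('multipart/form-data', 'multipart/mixed')

    # remove the headers requests writes at the start of each part
    # (on Python 3 the multipart body is bytes, so the pattern must be bytes too)
    pattern = rb'\r\nContent-Disposition: form-data; name=.+\r\nContent-Type: application/http\r\n'
    prepared.body = re.sub(pattern, b'', prepared.body)

    # the body is shorter now, so the Content-Length header has to be refreshed
    prepared.headers['Content-Length'] = str(len(prepared.body))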
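To make the effect of that substitution visible without involving requests at all, here is a small sketch that applies the same pattern to a hand-written body. The boundary XYZ, the field name item1 and the Content-ID are made up, and the sample only imitates what an encoded part might look like:

```python
import re

# Hand-written stand-in for one encoded part (all names here are made up).
body = (
    b'--XYZ\r\n'
    b'Content-Disposition: form-data; name="item1"; filename=""\r\n'
    b'Content-Type: application/http\r\n'
    b'\r\n'
    b'Content-Type: application/http\n'
    b'Content-ID: <item1>\n'
    b'\n'
    b'GET /farm/v1/animals/pony\r\n'
    b'\r\n'
    b'--XYZ--\r\n'
)

pattern = rb'\r\nContent-Disposition: form-data; name=.+\r\nContent-Type: application/http\r\n'
stripped = re.sub(pattern, b'', body)

# only the two header lines written by the multipart encoder are gone
print(stripped.decode('utf-8'))
```

The part's own Content-Type and Content-ID lines survive because they are not preceded by a Content-Disposition line, so the pattern cannot match them.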

I did my best to test this with the Google example from the batching guide:

sheep = {
  "animalName": "sheep",
  "animalAge": "5",
  "peltColor": "green"
}

commands = []
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals/pony'))
commands.append(requests.Request('PUT', 'http://www.googleapis.com/batch/farm/v1/animals/sheep', json=sheep, headers={'If-Match': '"etag/sheep"'}))
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals', headers={'If-None-Match': '"etag/animals"'}))

files = prepare_requests(commands)

r = requests.Request('POST', 'http://www.googleapis.com/batch', files=files)
prepared = r.prepare()

finalize_request(prepared)

s = requests.Session()
s.send(prepared)

The resulting request should be close enough to what Google provides in its batching guide:

POST http://www.googleapis.com/batch
Content-Length: 1006
Content-Type: multipart/mixed; boundary=a21beebd15b74be89539b137bbbc7293

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item8065:[email protected]>

GET /farm/v1/animals
If-None-Match: "etag/animals"

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item5158:[email protected]>

GET /farm/v1/animals/pony

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item0ec9:[email protected]>

PUT /farm/v1/animals/sheep
Content-Length: 63
Content-Type: application/json
If-Match: "etag/sheep"

{"animalAge": "5", "animalName": "sheep", "peltColor": "green"}

--a21beebd15b74be89539b137bbbc7293--

In the end, I strongly recommend the official Google library, but if you cannot use it, you will have to improvise a bit :)

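Disclaimer: I did not actually try sending this request to the Google API endpoint, because the authentication process is too much of a hassle. I just tried to get as close as possible to the HTTP request described in the batching guide. There may be some problems with the \r and \n line endings, depending on how strict the Google endpoint is.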
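If the line endings do turn out to matter, one way around the problem is to skip the multipart encoder and assemble the body by hand with CRLF everywhere. The following is a sketch under that assumption, using only the standard library. build_batch_body, the tuple layout and the boundary batch_demo are names invented for this example, and it has not been tested against a Google endpoint either:

```python
import uuid

CRLF = '\r\n'

def build_batch_body(parts, boundary=None):
    """parts: list of (content_id, method, path, headers, body) tuples."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for content_id, method, path, headers, body in parts:
        lines.append('--' + boundary)
        lines.append('Content-Type: application/http')
        lines.append('Content-ID: ' + content_id)
        lines.append('')  # blank line ends the part headers
        lines.append('%s %s' % (method, path))
        for key, value in headers.items():
            lines.append('%s: %s' % (key, value))
        lines.append('')  # blank line ends the inner request headers
        if body:
            lines.append(body)
            lines.append('')
    lines.append('--' + boundary + '--')
    lines.append('')
    payload = CRLF.join(lines).encode('utf-8')
    return 'multipart/mixed; boundary=' + boundary, payload

content_type, payload = build_batch_body(
    [
        ('<item1>', 'GET', '/farm/v1/animals/pony', {}, None),
        ('<item2>', 'GET', '/farm/v1/animals', {'If-None-Match': '"etag/animals"'}, None),
    ],
    boundary='batch_demo',
)
print(content_type)
print(payload.decode('utf-8'))
```

The returned pair could then be handed to requests as data=payload together with a Content-Type header set to content_type, which sidesteps the files parameter and the header stripping entirely.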

Sources:

requests on GitHub (especially issues #935 and #1081)
requests API documentation
Google APIs for Python
