Slow upload to Azure Blob Storage using Python

2023-12-02

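My API receives a file and then tries to create a unique blob name. I then upload the file to the blob in 4 MB chunks. Each chunk takes about 8 seconds; is that normal? My upload speed is 110 Mbps. I tried uploading a 50 MB file and it took almost 2 minutes. I don't know whether the azure_blob_storage version is related to this; I am using azure-storage-blob==12.14.1.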

import uuid
import os
from azure.storage.blob import BlobClient, BlobBlock, BlobServiceClient
from flask import request  # assumed: the route below uses Flask's request object
import time

@catalog_api.route("/catalog", methods=['POST'])
def catalog():
    file = request.files['file']

    url_bucket, file_name, file_type = upload_to_blob(file)


def upload_to_blob(self, file):
    file_name = file.filename
    file_type = file.content_type

    blob_client = self.generate_blob_client(file_name)
    blob_url = self.upload_chunks(blob_client, file)
    return blob_url, file_name, file_type


def generate_blob_client(self, file_name: str):
    blob_service_client = BlobServiceClient.from_connection_string(self.connection_string)
    container_client = blob_service_client.get_container_client(self.container_name)

    for _ in range(self.max_blob_name_tries):
        blob_name = self.generate_blob_name(file_name)

        blob_client = container_client.get_blob_client(blob_name)
        if not blob_client.exists():
            return blob_client
    raise Exception("Couldn't create the blob")


def upload_chunks(self, blob_client: BlobClient, file):
    block_list=[]
    chunk_size = self.chunk_size
    while True:
        read_data = file.read(chunk_size)

        if not read_data:
            print("uploaded")
            break

        print("uploading")
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

    blob_client.commit_block_list(block_list)

    return blob_client.url
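As a rough sanity check on the numbers in the question, the ideal transfer time for one block can be estimated from the link speed. This is only a sketch: it assumes the 110 Mbps figure is the sustained upstream rate all the way to the storage account, that "MB" means binary megabytes, and it ignores latency and protocol overhead. The helper name `transfer_seconds` is made up for this illustration.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Ideal time to send size_bytes over a link of link_mbps megabits per second."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


chunk = 4 * 1024 * 1024         # one 4 MB block, as in the question
whole_file = 50 * 1024 * 1024   # the 50 MB test file

chunk_time = transfer_seconds(chunk, 110)
file_time = transfer_seconds(whole_file, 110)

# prints: 0.31 s per block, 3.81 s for the file
print(f"{chunk_time:.2f} s per block, {file_time:.2f} s for the file")
```

Under those assumptions, 8 seconds per block is far slower than the link alone would explain, which suggests the time is going somewhere other than raw bandwidth (or that the real upstream rate to Azure is much lower than 110 Mbps).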

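I tried this in my environment and got the following results:

I uploaded a 50 MB file to the blob storage account with a block size of 4*1024*1024. From my local environment to the storage account, it took 45 seconds.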

Code:

import time
import uuid

from azure.storage.blob import BlobBlock, BlobServiceClient

connection_string = "<storage account connection string >"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client('test')
blob_client = container_client.get_blob_client("file.pdf")

start = time.time()

# Upload the file in 4 MB blocks, then commit the block list
block_list = []
chunk_size = 4 * 1024 * 1024
with open("C:\\file.pdf", 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

blob_client.commit_block_list(block_list)
end = time.time()
print("Time taken to upload blob:", end - start, "secs")

In the code above, I recorded start and end timestamps around the upload and used end - start to measure how long it takes to upload the file to blob storage.
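To see whether the time is spent evenly across blocks or in one particular call, the same start/end idea can be applied per block. The following is a minimal sketch, not part of the original answer: the `timed` helper and the `timings` list are hypothetical names, and `time.sleep` stands in for the real `blob_client.stage_block(...)` call so the snippet runs without a storage account.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Append (label, elapsed_seconds) to results when the with-block exits."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results.append((label, time.perf_counter() - start))


timings = []
for index in range(3):
    with timed(f"block {index}", timings):
        # Stand-in for blob_client.stage_block(block_id=..., data=...)
        time.sleep(0.01)

for label, seconds in timings:
    print(f"{label}: {seconds:.3f} s")
```

Wrapping `stage_block` and `commit_block_list` separately in this way would show whether each block is uniformly slow or whether a single step dominates.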

Console:

[screenshot of the console output]

Make sure your internet connection is fast. I also tried the upload on some other, slower connections, and it took at most 78 seconds.

Portal:

[screenshot of the Azure portal]
