How to handle the maximum export size limit for files in the Drive API

2024-03-21

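I'm trying to download some Google Docs files, which I then need to convert to the Microsoft Word MIME type using the export method. This works fine until it hits a file larger than 10 MB. The API documentation says this is the size limit for exported documents, but I really need to download these files. Everything else in my script works; the only problem is the error that is thrown: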

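"This file is too large to be exported.". Details: "This file is too large to be exported." So, is there a way to get around this limit, or to otherwise export the documents inside the folder?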
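One way to frame the problem is as a fallback: keep using the API export for normal files and switch to another download path only when this specific error appears. The following is a rough sketch, not a tested integration. The `api_export` and `direct_download` callables are hypothetical placeholders for whatever functions return the exported bytes for a file ID, and it assumes the text of the raised exception contains the error message quoted above.

```python
TOO_LARGE_MESSAGE = "This file is too large to be exported."


def is_export_too_large(error):
    """Return True when the exception text carries the export-size message."""
    return TOO_LARGE_MESSAGE.lower() in str(error).lower()


def export_with_fallback(file_id, api_export, direct_download):
    """Try the API export first; fall back only for oversized files.

    api_export and direct_download are hypothetical callables that take a
    file ID and return the exported bytes.
    """
    try:
        return api_export(file_id)
    except Exception as error:
        if is_export_too_large(error):
            # Only the size-limit error triggers the alternative download.
            return direct_download(file_id)
        # Any other failure is re-raised unchanged.
        raise
```

Errors other than the size limit are re-raised, so permission or quota problems are not silently hidden by the fallback.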

EDIT: The documents I'm trying to download are not public, so I think I need to authenticate the request to get the content.

EDIT 2: Script:


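import os
import pickle

import requests
from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/drive.file','https://www.googleapis.com/auth/drive','https://www.googleapis.com/auth/spreadsheets']


def main():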
    #----------------------Google drive auth-----------------------------
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    # Call the Drive v3 API
    service = build('drive', 'v3', credentials=creds)
    sheets_service = build('sheets', 'v4', credentials=creds)

    # Call the Sheets API
    sheet = sheets_service.spreadsheets()

    # ID of the folder that contains the wanted files
    query = "'[ID OF THE FOLDER]' in parents"

    response = service.files().list(q=query,
                                spaces='drive',
                                fields='files(id, name, parents, webViewLink,exportLinks)').execute()
    
    baseURL="https://docs.google.com/document/d/"
    for document in response['files']:
        downloadURL=baseURL+document["id"]+"/export?format=doc"
        r = requests.get(downloadURL)
        with open('pathtosabe, 'wb') as f:
            f.write(r.content)


main()

From your following reply (https://stackoverflow.com/questions/66339632/how-to-handle-the-maximun-export-limit-size-file-for-drive-api/66344211#comment117290716_66339632):

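"OK, that's the thing: I don't know how to use the access token in the request that downloads the files, and the content appears corrupted. I tried with a public doc and the content is visible."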

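I thought that when your Google Document is not publicly shared, the request r = requests.get(downloadURL) in your script might work if it is sent with the access token. So in this answer, I would like to propose modifying your script to use the access token retrieved by the authorization part of your script.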

Modified script:

creds = None
# The file token.pickle stores the user's access and refresh tokens, and is
# created automatically when the authorization flow completes for the first
# time.
if os.path.exists('token.pickle'):
    with open('token.pickle', 'rb') as token:
        creds = pickle.load(token)

if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file(
            'credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open('token.pickle', 'wb') as token:
        pickle.dump(creds, token)

# Call the Drive v3 API
service = build('drive', 'v3', credentials=creds)
sheets_service = build('sheets', 'v4', credentials=creds)

# Call the Sheets API
sheet = sheets_service.spreadsheets()

# ID of the folder that contains the wanted files
query = "'[ID OF THE FOLDER]' in parents"
response = service.files().list(q=query,
                            spaces='drive',
                            fields='files(id, name, parents, webViewLink,exportLinks)').execute()

access_token = creds.token # Added
baseURL="https://docs.google.com/document/d/"
for document in response['files']:
    downloadURL=baseURL+document["id"]+"/export?format=doc"
    r = requests.get(downloadURL, headers={'Authorization': 'Bearer ' + access_token})  # Modified
    with open('pathtosabe', 'wb') as f:  # Modified
        f.write(r.content)
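  • In your script, 'pathtosabe in with open('pathtosabe, 'wb') as f: is not enclosed in single quotes. Please be careful about this. If you want to use pathtosabe as a variable, declare it and change the line to with open(pathtosabe, 'wb') as f:.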
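As written, the loop in the modified script writes every document to the same path and does not check whether the request succeeded. A minimal sketch of one way to handle both points is shown here. The helper names (`export_url`, `safe_filename`, `download_document`) and the `out_dir` parameter are illustrative and not part of the original script. The `http_get` argument is assumed to be a callable with the same interface as `requests.get`, returning an object with `status_code` and `content`.

```python
import os
import re


def export_url(file_id, fmt="doc"):
    """Build the direct export URL used in the script above."""
    return "https://docs.google.com/document/d/" + file_id + "/export?format=" + fmt


def safe_filename(name, ext="doc"):
    """Replace characters that are awkward in file names and append the extension."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', "_", name).strip() or "untitled"
    return cleaned + "." + ext


def download_document(document, access_token, http_get, out_dir=".", fmt="doc"):
    """Download one document and return the path that was written.

    http_get is assumed to behave like requests.get: it accepts a URL and a
    headers keyword, and its result exposes status_code and content.
    """
    response = http_get(
        export_url(document["id"], fmt),
        headers={"Authorization": "Bearer " + access_token},
    )
    if response.status_code != 200:
        # Stop early instead of saving an error page as a document.
        raise RuntimeError(
            "export of %s failed with HTTP %s" % (document["id"], response.status_code)
        )
    path = os.path.join(out_dir, safe_filename(document["name"], fmt))
    with open(path, "wb") as f:
        f.write(response.content)
    return path
```

Inside the loop, this could be called as `download_document(document, creds.token, requests.get)`, since the `files().list` call already requests the `id` and `name` fields. Passing the HTTP function as an argument is only a convenience that makes the helper easy to try without network access.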