Wagtail 文档:大文件(>2GB)上传失败

2024-03-19

我正在尝试使用 Wagtail 应用程序中内置的 wagtaildocs 应用程序上传文件。我已经使用 Nginx 的 Digital Ocean 教程方法设置了 Ubuntu 16.04 服务器 |鳐鱼 | Postgres

一些初步澄清:

  1. 在我的 Nginx 配置中我已经设置了client_max_body_size 10000M;
  2. 在我的生产设置中,我有以下几行:MAX_UPLOAD_SIZE = "5242880000" WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
  3. 我的文件类型是.zip
  4. 这是此时的生产测试。我只实现了一个基本的 wagtail 应用程序,没有附加模块。

因此,只要我的文件大小低于 10Gb,从配置的角度来看我应该没问题,除非我遗漏了某些内容或对拼写错误视而不见。

我已经尝试将所有配置值调整为不合理的大值。我尝试过使用其他文件扩展名,但没有改变我的错误。

我认为这与会话期间关闭 TCP 或 SSL 连接有关。我以前从未遇到过这个问题,所以我希望得到一些帮助。

这是我的错误消息:

Internal Server Error: /admin/documents/multiple/add/
Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/urls/__init__.py", line 102, in wrapper
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/decorators.py", line 34, in decorated_view
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/utils.py", line 151, in wrapped_view_func
    return view_func(request, *args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/vary.py", line 20, in inner_func
    response = func(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/documents/views/multiple.py", line 60, in add
    doc.save()
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 741, in save
    force_update=force_update, update_fields=update_fields)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 779, in save_base
    force_update, using, update_fields,
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 870, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 908, in _do_insert
    using=using, raw=raw)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
    cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
    return super().execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.DatabaseError: SSL SYSCALL error: Operation timed out

这是我的设置

### base.py ###
import os

PROJECT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
BASE_DIR = os.path.dirname(PROJECT_DIR)
SECRET_KEY = os.getenv('SECRET_KEY_WAGTAILDEV')

# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/


# Application definition

INSTALLED_APPS = [
    'home',
    'search',

    'wagtail.contrib.forms',
    'wagtail.contrib.redirects',
    'wagtail.embeds',
    'wagtail.sites',
    'wagtail.users',
    'wagtail.snippets',
    'wagtail.documents',
    'wagtail.images',
    'wagtail.search',
    'wagtail.admin',
    'wagtail.core',

    'modelcluster',
    'taggit',

    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'storages',
]

MIDDLEWARE = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'django.middleware.security.SecurityMiddleware',

    'wagtail.core.middleware.SiteMiddleware',
    'wagtail.contrib.redirects.middleware.RedirectMiddleware',
]

ROOT_URLCONF = 'wagtaildev.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [
            os.path.join(PROJECT_DIR, 'templates'),
        ],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'wagtaildev.wsgi.application'


# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'HOST': os.getenv('DATABASE_HOST_WAGTAILDEV'),
        'USER': os.getenv('DATABASE_USER_WAGTAILDEV'),
        'PASSWORD': os.getenv('DATABASE_PASSWORD_WAGTAILDEV') ,
        'NAME': os.getenv('DATABASE_NAME_WAGTAILDEV'),
        'PORT': '5432',
    }
}


# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]


# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'UTC'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATICFILES_FINDERS = [
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
]

STATICFILES_DIRS = [
    os.path.join(PROJECT_DIR, 'static'),
]

# ManifestStaticFilesStorage is recommended in production, to prevent outdated
# Javascript / CSS assets being served from cache (e.g. after a Wagtail upgrade).
# See https://docs.djangoproject.com/en/2.2/ref/contrib/staticfiles/#manifeststaticfilesstorage
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

STATIC_ROOT = os.path.join(BASE_DIR, 'static')
STATIC_URL = '/static/'

MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
MEDIA_URL = '/media/'


# Wagtail settings

WAGTAIL_SITE_NAME = "wagtaildev"

# Base URL to use when referring to full URLs within the Wagtail admin backend -
# e.g. in notification emails. Don't include '/admin' or a trailing slash
BASE_URL = 'http://example.com'

### production.py ###

from .base import *

DEBUG = True

ALLOWED_HOSTS = ['wagtaildev.wesgarlock.com', '127.0.0.1','134.209.230.125']

from wagtaildev.aws.conf import *

EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'

MAX_UPLOAD_SIZE = "5242880000"
WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
FILE_UPLOAD_TEMP_DIR = str(os.path.join(BASE_DIR, 'tmp'))

这是我的 Nginx 设置

server {
    listen 80;

    server_name wagtaildev.wesgarlock.com;
    client_max_body_size 10000M;

    location = /favicon.ico { access_log off; log_not_found off; }

    location / {
        include proxy_params;
        proxy_pass http://unix:/home/wesgarlock/run/wagtaildev.sock;
    }
}


我从来没能直接解决这个问题,但我确实想出了一个办法来解决它。

我不是 Wagtail 或 Django 专家,所以我确信这个答案有一个正确的解决方案,但无论如何,这就是我所做的。如果您有任何改进建议,请随时发表评论。

作为一个注释,这实际上是提醒我我做了什么的文档。此时(05-25-19)有很多冗余代码行,因为我弗兰肯斯坦将很多代码放在一起。我会加班编辑它。

以下是弗兰肯斯坦中共同创建此解决方案的教程。

  1. https://www.codingforentrepreneurs.com/blog/large-file-uploads-with-amazon-s3-django/ https://www.codingforentrepreneurs.com/blog/large-file-uploads-with-amazon-s3-django/
  2. http://docs.wagtail.io/en/v2.1.1/advanced_topics/documents/custom_document_model.html http://docs.wagtail.io/en/v2.1.1/advanced_topics/documents/custom_document_model.html
  3. https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
  4. https://medium.com/faun/summary-667d0fdbcdae https://medium.com/faun/summary-667d0fdbcdae
  5. http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-browser-credentials-federated-id.html http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-browser-credentials-federated-id.html
  6. https://kite.com/python/examples/454/threading-wait-for-a-thread-to-finish https://kite.com/python/examples/454/threading-wait-for-a-thread-to-finish
  7. http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#usage-systemd http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#usage-systemd

可能还有其他一些原则,但这些是原则。

好的,我们开始吧。

我创建一个名为“files”的应用程序,然后使用自定义文档对 models.py 文件进行建模。您需要在设置文件中指定 WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument'。我这样做的唯一原因是为了更明确地跟踪我正在改变的行为。这个自定义文档模型只是扩展了 Wagtail 中的标准文档模型。

#models.py

from django.db import models
from wagtail.documents.models import AbstractDocument
from wagtail.admin.edit_handlers import FieldPanel
# Create your models here.
class LargeDocument(AbstractDocument):

    admin_form_fields = (
        'file',
    )
    panels = [
        FieldPanel('file', classname='fn'),
    ]

接下来,您需要创建一个包含以下内容的 wagtail_hook.py 文件。

#wagtail_hook.py
from wagtail.contrib.modeladmin.options import (
    ModelAdmin, modeladmin_register)
from .models import LargeDocument
from .views import LargeDocumentAdminView


class LargeDocumentAdmin(ModelAdmin):
    model = LargeDocument

    menu_label = 'Large Documents'  # ditch this to use verbose_name_plural from model
    menu_icon = 'pilcrow'  # change as required
    menu_order = 200  # will put in 3rd place (000 being 1st, 100 2nd)
    add_to_settings_menu = False  # or True to add your model to the Settings sub-menu
    exclude_from_explorer = False # or True to exclude pages of this type from Wagtail's explorer view

    create_template_name ='large_document_index.html'

# Now you just need to register your customised ModelAdmin class with Wagtail
modeladmin_register(LargeDocumentAdmin)

这允许您做两件事:

  1. 创建一个新的菜单项用于上传大型文档,同时维护标准文档菜单项及其标准功能。
  2. 指定用于处理大型上传的自定义 html 文件。

这是html

{% extends "wagtailadmin/base.html" %}
{% load staticfiles cache %}
{% load static wagtailuserbar %}
{% load compress %}
{% load underscore_hyphan_to_space %}
{% load url_vars %}
{% load pagination_value %}

{% load static %}
{% load i18n %}

{% block titletag %}{{ view.page_title }}{% endblock %}

{% block content %}

    {% include "wagtailadmin/shared/header.html" with title=view.page_title icon=view.header_icon %}
          <!-- Google Signin Button -->
          <div class="g-signin2" data-onsuccess="onSignIn" data-theme="dark">
          </div>
          <!-- Select the file to upload -->

          <div class="input-group mb-3">
            <link rel="stylesheet" href="{% static 'css/input.css'%}"/>
            <div class="custom-file">
              <input type="file" class="custom-file-input" id="file" name="file">
              <label id="file_label" class="custom-file-label" style="width:auto!important;" for="inputGroupFile02" aria-describedby="inputGroupFileAddon02">Choose file</label>
            </div>
            <div class="input-group-append">
              <span class="input-group-text" id="file_submission_button">Upload</span>
            </div>
            <div id="start_progress"></div>
          </div>
          <div class="progress-upload">
            <div class="progress-upload-bar" role="progressbar" style="width: 100%;" aria-valuenow="100" aria-valuemin="0" aria-valuemax="100"></div>
          </div>
{% endblock %}

{% block extra_js %}
    {{ block.super }}
    {{ form.media.js }}
    <script src="https://apis.google.com/js/platform.js" async defer></script>
    <script src="https://sdk.amazonaws.com/js/aws-sdk-2.148.0.min.js"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
    <script src="{% static 'js/awsupload.js' %}"></script>
{% endblock %}

{% block extra_css %}
    {{ block.super }}
    {{ form.media.css }}
    <meta name="google-signin-client_id" content="847336061839-9h651ek1dv7u1i0t4edsk8pd20d0lkf3.apps.googleusercontent.com">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">

{% endblock %}

然后我在views.py中创建了一些对象

#views.py
from django.shortcuts import render

# Create your views here.
import base64
import hashlib
import hmac
import os
import time
from rest_framework import permissions, status, authentication
from rest_framework.response import Response
from rest_framework.views import APIView
from .config_aws import (
    AWS_UPLOAD_BUCKET,
    AWS_UPLOAD_REGION,
    AWS_UPLOAD_ACCESS_KEY_ID,
    AWS_UPLOAD_SECRET_KEY
)
from .models import LargeDocument
import datetime
from wagtail.contrib.modeladmin.views import WMABaseView
from django.db.models.fields.files import FieldFile
from django.core.files import File
import urllib.request
from django.core.mail import send_mail
from .tasks import file_creator

class FilePolicyAPI(APIView):
    """
    This view is to get the AWS Upload Policy for our s3 bucket.
    What we do here is first create a LargeDocument object instance in our
    Django backend. This is to include the LargeDocument instance in the path
    we will use within our bucket as you'll see below.
    """
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        """
        The initial post request includes the filename
        and auth credientails. In our case, we'll use
        Session Authentication but any auth should work.
        """
        filename_req = request.data.get('filename')
        if not filename_req:
                return Response({"message": "A filename is required"}, status=status.HTTP_400_BAD_REQUEST)
        policy_expires = int(time.time()+5000)
        user = request.user
        username_str = str(request.user.username)
        """
        Below we create the Django object. We'll use this
        in our upload path to AWS.

        Example:
        To-be-uploaded file's name: Some Random File.mp4
        Eventual Path on S3: <bucket>/username/2312/2312.mp4
        """
        doc_obj = LargeDocument.objects.create(uploaded_by_user=user, )
        doc_obj_id = doc_obj.id
        doc_obj.title=filename_req
        upload_start_path = "{location}".format(
                    location = "LargeDocuments/",
            )
        file_extension = os.path.splitext(filename_req)
        filename_final = "{title}".format(
                    title= filename_req,
                )
        """
        Eventual file_upload_path includes the renamed file to the
        Django-stored LargeDocument instance ID. Renaming the file is
        done to prevent issues with user generated formatted names.
        """
        final_upload_path = "{upload_start_path}/{filename_final}".format(
                                 upload_start_path=upload_start_path,
                                 filename_final=filename_final,
                            )
        if filename_req and file_extension:
            """
            Save the eventual path to the Django-stored LargeDocument instance
            """
            policy_document_context = {
                "expire": policy_expires,
                "bucket_name": AWS_UPLOAD_BUCKET,
                "key_name": "",
                "acl_name": "public-read",
                "content_name": "",
                "content_length": 524288000,
                "upload_start_path": upload_start_path,

                }
            policy_document = """
            {"expiration": "2020-01-01T00:00:00Z",
              "conditions": [
                {"bucket": "%(bucket_name)s"},
                ["starts-with", "$key", "%(upload_start_path)s"],
                {"acl": "public-read"},

                ["starts-with", "$Content-Type", "%(content_name)s"],
                ["starts-with", "$filename", ""],
                ["content-length-range", 0, %(content_length)d]
              ]
            }
            """ % policy_document_context
            aws_secret = str.encode(AWS_UPLOAD_SECRET_KEY)
            policy_document_str_encoded = str.encode(policy_document.replace(" ", ""))
            url = 'https://thearchmedia.s3.amazonaws.com/'
            policy = base64.b64encode(policy_document_str_encoded)
            signature = base64.b64encode(hmac.new(aws_secret, policy, hashlib.sha1).digest())
            doc_obj.file_hash = signature
            doc_obj.path = final_upload_path

            doc_obj.save()



        data = {
            "policy": policy,
            "signature": signature,
            "key": AWS_UPLOAD_ACCESS_KEY_ID,
            "file_bucket_path": upload_start_path,
            "file_id": doc_obj_id,
            "filename": filename_final,
            "url": url,
            "username": username_str,
        }
        return Response(data, status=status.HTTP_200_OK)

class FileUploadCompleteHandler(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        size = request.POST.get('fileSize')
        data = {}
        type_ = request.POST.get('fileType')
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            obj.size = int(size)
            obj.uploaded = True
            obj.type = type_
            obj.file_hash
            obj.save()
            data['id'] = obj.id
            data['saved'] = True
            data['url']=obj.url
        return Response(data, status=status.HTTP_200_OK)

class ModelFileCompletion(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        url = request.POST.get('aws_url')
        data = {}
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            file_creator.delay(obj.pk)
            data['test'] = 'process started'
        return Response(data, status=status.HTTP_200_OK)

def LargeDocumentAdminView(request):
    context = super(WMABaseView, self).get_context(request)
    render(request, 'modeladmin/files/index.html', context)

该视图围绕标准文件处理系统。我不想放弃标准文件处理系统或编写一个新的系统。这就是为什么我称此黑客为非理想解决方案的原因。

// javascript upload file "awsupload.js"
var id_token; //token we get upon Authentication with Web Identiy Provider
function onSignIn(googleUser) {
  var profile = googleUser.getBasicProfile();
  // The ID token you need to pass to your backend:
  id_token = googleUser.getAuthResponse().id_token;
}

$(document).ready(function(){

  // setup session cookie data. This is Django-related
  function getCookie(name) {
      var cookieValue = null;
      if (document.cookie && document.cookie !== '') {
          var cookies = document.cookie.split(';');
          for (var i = 0; i < cookies.length; i++) {
              var cookie = jQuery.trim(cookies[i]);
              // Does this cookie string begin with the name we want?
              if (cookie.substring(0, name.length + 1) === (name + '=')) {
                  cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
                  break;
              }
          }
      }
      return cookieValue;
  }
  var csrftoken = getCookie('csrftoken');
  function csrfSafeMethod(method) {
      // these HTTP methods do not require CSRF protection
      return (/^(GET|HEAD|OPTIONS|TRACE)$/.test(method));
  }
  $.ajaxSetup({
      beforeSend: function(xhr, settings) {
          if (!csrfSafeMethod(settings.type) && !this.crossDomain) {
              xhr.setRequestHeader("X-CSRFToken", csrftoken);
          }
      }
  });
  // end session cookie data setup.

  // declare an empty array for potential uploaded files
  var fileItemList = []

  $(document).on('click','#file_submission_button', function(event){
      var selectedFiles = $('#file').prop('files');
      formItem = $(this).parent()
      $.each(selectedFiles, function(index, item){
          uploadFile(item)
      })
      $(this).val('');
      $('.progress-upload-bar').attr('aria-valuenow',progress);
      $('.progress-upload-bar').attr('width',progress.toString()+'%');
      $('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%');
      $('.progress-upload-bar').text(progress.toString()+'%');
  })
  $(document).on('change','#file', function(event){
      var selectedFiles = $('#file').prop('files');
      $('#file_label').text(selectedFiles[0].name)
  })



  function constructFormPolicyData(policyData, fileItem) {
     var contentType = fileItem.type != '' ? fileItem.type : 'application/octet-stream'
      var url = policyData.url
      var filename = policyData.filename
      var repsonseUser = policyData.user
      // var keyPath = 'www/' + repsonseUser + '/' + filename
      var keyPath = policyData.file_bucket_path
      var fd = new FormData()
      fd.append('key', keyPath + filename);
      fd.append('acl','private');
      fd.append('Content-Type', contentType);
      fd.append("AWSAccessKeyId", policyData.key)
      fd.append('Policy', policyData.policy);
      fd.append('filename', filename);
      fd.append('Signature', policyData.signature);
      fd.append('file', fileItem);
      return fd
  }

  function fileUploadComplete(fileItem, policyData){
      data = {
          uploaded: true,
          fileSize: fileItem.size,
          file: policyData.file_id,

      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/complete/",
          success: function(data){
              displayItems(fileItemList)
          },
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occured, please refresh the page.")
          }
      })
  }

  function modelComplete(policyData, aws_url){
      data = {
          file: policyData.file_id,
          aws_url: aws_url
      }
      $.ajax({
          method:"POST",
          data: data,
          url: "/api/files/modelcomplete/",
          success:
          console.log('model complete success')  ,
          error: function(jqXHR, textStatus, errorThrown){
              alert("An error occured, please refresh the page.")
          }
      })
  }

  function displayItems(fileItemList){
      var itemList = $('.item-loading-queue')
      itemList.html("")
      $.each(fileItemList, function(index, obj){
          var item = obj.file
          var id_ = obj.id
          var order_ = obj.order
          var html_ = "<div class=\"progress\">" +
            "<div class=\"progress-bar\" role=\"progressbar\" style='width:" + item.progress + "%' aria-valuenow='" + item.progress + "' aria-valuemin=\"0\" aria-valuemax=\"100\"></div></div>"
          itemList.append("<div>" + order_ + ") " + item.name + "<a href='#' class='srvup-item-upload float-right' data-id='" + id_ + ")'>X</a> <br/>" + html_ + "</div><hr/>")

      })
  }

  function uploadFile(fileItem){
          var policyData;
          var newLoadingItem;
          // get AWS upload policy for each file uploaded through the POST method
          // Remember we're creating an instance in the backend so using POST is
          // needed.
          $.ajax({
              method:"POST",
              data: {
                  filename: fileItem.name
              },
              url: "/api/files/policy/",
              success: function(data){
                      policyData = data
              },
              error: function(data){
                  alert("An error occured, please try again later")
              }
          }).done(function(){
              // construct the needed data using the policy for AWS
              var file = fileItem;
              AWS.config.credentials = new AWS.WebIdentityCredentials({
                  RoleArn: 'arn:aws:iam::120974195102:role/thearchmedia-google-role',
                  ProviderId: null, // this is null for Google
                  WebIdentityToken: id_token // Access token from identity provider
              });
              var bucket = 'thearchmedia'
              var key = 'LargeDocuments/'+file.name
              var aws_url = 'https://'+bucket+'.s3.amazonaws.com/'+ key
              var s3bucket = new AWS.S3({params: {Bucket: bucket}});
              var params = {Key: key , ContentType: file.type, Body: file, ACL:'public-read', };
              s3bucket.upload(params, function (err, data) {
                  $('#results').html(err ? 'ERROR!' : 'UPLOADED :' + data.Location);
                }).on(
                  'httpUploadProgress', function(evt) {
                    progress = parseInt((evt.loaded * 100) / evt.total)
                    $('.progress-upload-bar').attr('aria-valuenow',progress)
                    $('.progress-upload-bar').attr('width',progress.toString()+'%')
                    $('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%')
                    $('.progress-upload-bar').text(progress.toString()+'%')

                  }).send(
                    function(err, data) {
                      alert("File uploaded successfully.")
                      fileUploadComplete(fileItem, policyData)
                      modelComplete(policyData, aws_url)
                    });
          })
  }


})

.js 和 .view.py 交互说明

首先,头部带有文件信息的 Ajax 调用会创建 Document 对象,但由于文件从不接触服务器,因此不会在 Document 对象中创建“File”对象。这个“文件”对象包含了我需要的功能,所以我需要做更多的事情。接下来,我的 javascript 文件使用 AWS Javascript SDK 将文件上传到我的 s3 存储桶。 SDK 中的 s3bucket.upload() 函数足够强大,可以上传高达 5GB 的文件,但如果不包括一些其他修改,它可以上传高达 5TB(aws 限制)。文件上传到 s3 存储桶后,将进行最终的 API 调用。最终的 API 调用会触发 Celery 任务,将文件下载到远程服务器上的临时目录。一旦文件存在于我的远程服务器上,就会创建文件对象并将其保存到文档模型中。

task.py 文件处理将文件从 S3 存储桶下载到远程服务器,然后创建 File 对象并将其保存到文档文件。

#task.py
from .models import LargeDocument
from celery import shared_task
import urllib.request
from django.core.mail import send_mail
from django.core.files import File
import threading

@shared_task
def file_creator(pk_num):
    obj = LargeDocument.objects.get(pk=pk_num)
    tmp_loc = 'tmp/'+ obj.title
    def downloadit():
        urllib.request.urlretrieve('https://thearchmedia.s3.amazonaws.com/LargeDocuments/' + obj.title, tmp_loc)

    def after_dwn():
         dwn_thread.join()           #waits till thread1 has completed executing
         #next chunk of code after download, goes here
         send_mail(
             obj.title + ' has finished to downloading to the server',
             obj.title + 'Downloaded to server',
             '[email protected] /cdn-cgi/l/email-protection',
             ['[email protected] /cdn-cgi/l/email-protection'],
             fail_silently=False,
         )
         reopen = open(tmp_loc, 'rb')
         django_file = File(reopen)
         obj.file = django_file
         obj.save()
         send_mail(
             obj.title + ' has finished to downloading to the server',
             'File Model Created for' + obj.title,
             '[email protected] /cdn-cgi/l/email-protection',
             ['[email protected] /cdn-cgi/l/email-protection'],
             fail_silently=False,
         )

    dwn_thread = threading.Thread(target=downloadit)
    dwn_thread.start()

    metadata_thread = threading.Thread(target=after_dwn)
    metadata_thread.start()

这个过程需要在 Celery 中运行,因为下载大文件需要时间,而且我不想在浏览器打开的情况下等待。此task.py 内部还有一个python thread(),它强制进程等待,直到文件成功下载到远程服务器。如果您是 Celery 新手,这里是他们文档的开始(http://docs.celeryproject.org/en/master/getting-started/introduction.html http://docs.celeryproject.org/en/master/getting-started/introduction.html)

此外,我还添加了一些电子邮件通知,以确认流程已完成。

最后一点是,我在项目中创建了一个 /tmp 目录,并设置了每日删除所有文件的操作,以赋予其 tmp 功能。

crontab -e
find ~/thearchmedia/tmp -mtime +1 -delete
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

Wagtail 文档:大文件(>2GB)上传失败 的相关文章

随机推荐

  • 在 null 上调用成员函数 store() - laravel 5.4

    我正在尝试上传图像 但每次提交时都会返回 store 错误 我已将表单设置为 enctype multipart form data 这没有帮助 有人能指出我正确的方向吗 Thanks 控制器内部功能 public function sto
  • 来自 Android 的 Facebook Score API 调用未在时间轴/股票代码上显示高分

    我正在尝试让 Android 应用程序将高分发布到 Facebook 类似于 Facebook 上的 愤怒的小鸟 的做法 它显示在时间轴上 也显示在股票代码中 请记住 该游戏仅在 Android 上运行 并且没有 FB Canvas 应用程
  • GiST 和 GIN 索引之间的区别

    我正在实现一个表 其中有一列的数据类型为tsvector我想了解什么索引更好使用 GIN 还是 GiST 在浏览中postgres 文档在这里 http www postgresql org docs 9 1 static textsear
  • 模拟安全警报的解决方案 - X509TrustManager 的不安全实现

    因此 最近我在开发人员控制台中收到以下警告 为了解决该问题 我已完成了所需的修复 根据谷歌的建议 here https support google com faqs answer 6346016 要确认您已进行正确的更改 请将应用程序的更
  • CouchDB 备份和克隆数据库

    我们正在寻找 CouchdDB 作为类似 CMS 的应用程序 围绕备份我们的生产数据库有哪些常见模式 最佳实践和工作流程建议 我对克隆数据库以用于开发和测试的过程特别感兴趣 仅从实时运行的实例下复制磁盘上的文件就足够了吗 您可以在两个实时运
  • TabLayout 使用自定义视图更新选项卡内容

    我在用着TabLayout新的材料设计 我有一个问题 创建选项卡后我无法更新自定义视图的选项卡内容 我可以用以下方法简化 PagerAdapter 中的方法 public View setTabView int position boole
  • 记录器服务错误:鼠标左键按下:无法找到匹配的元素 - Xcode 错误

    我正在尝试通过 XCTest 自动化我的 mac 应用程序 当尝试从 XCode 记录应用程序时 我收到以下错误消息 当我点击按钮时会发生这种情况 按钮层次结构是 按钮 gt 堆栈视图 gt NSView 这里 button是NSButto
  • 外键和索引问题

    我正在使用 SQL Server 2008 Enterprise 我有一个表 其中一个列引用另一个表 在同一个数据库中 中的另一列作为外键 这是相关的SQL语句 更详细地说 表 Foo 中的列 AnotherID 引用了另一个表表 Goo
  • 如何使用 sass 正确避免在 HTML 上嵌入 twitter bootstrap 类名

    我正在开发一个刚刚开始的 Rails 项目 我们想使用 twitter bootstrap 作为我们样式的基础 一开始我们只是直接在 HTML 代码上使用 bootstrap 的类名 就像 bootstrap 的文档中所示 但在阅读以下文章
  • 如何检查 Python 数组中是否存在某个元素(相当于 PHP in_array)?

    我是 Python 新手 我正在寻找一个标准函数来告诉我数组中是否存在某个元素 我找到了index方法 但如果未找到该元素 则会抛出异常 我只需要一些可以返回的简单函数true如果该元素在数组中或者false if not 基本上相当于 P
  • hook_user():将额外的字段插入数据库而不仅仅是表单

    我可以在注册中添加一个额外的字段 我需要知道的是我需要采取什么步骤来获取该输入并将其插入到 drupal 的用户表中 下面的代码位于我的模块中 它仅向表单添加一个字段 但是当提交时 它不会对数据执行任何操作 function perscri
  • 如何组合两个索引不同的 pandas 系列?

    我尝试将两个不同索引的系列组合在一起 相同的行数 我试过pd concat s1 s2 axis 1 例如 s1 为 index s1 0 1 5 1 2 s2 是 index s2 a 1 b 2 但我得到 index s1 s2 0 1
  • 批量从文本文件中删除重复行

    是否可以从文本文件中删除重复的行 如果是 怎么办 当然可以 但就像大多数批处理文本文件一样 它并不漂亮 而且不是特别快 该解决方案在查找重复项时忽略大小写 并对行进行排序 文件名作为第一个也是唯一一个参数传递给批处理脚本 echo off
  • 运行 jQuery 函数 onclick

    所以我实现了一些 jQuery 它基本上通过由滑块激活的滑块来切换内容 a 标签 现在考虑一下 我宁愿让保存链接的 DIV 本身就是链接 我正在使用的 jQuery 在我的脑海中看起来像这样 a
  • 用 jQuery 收集表单中的所有项目

    如何收集 jQuery 中的所有复选框和下拉列表项进行保存 或者 对于最新版本的 jquery 您可以使用 http docs jquery com Ajax serialize http docs jquery com Ajax seri
  • 如何解决DEP6500和DEP6701错误?

    我有一个项目叫BTLE在它自己的解决方案中 加载项目并使用手机上的调试器运行它可以找到 我有第二个解决方案 可以很好地加载和编译 我添加了BTLE项目 添加 现有项目 到第二个解决方案 编译它并尝试在调试器中运行它 我可以看到应用程序已正确
  • 使用 PIG 从 Hive 表解析嵌套 XML 字符串

    我正在尝试使用 PIG 从 Hive 表中的字段而不是从 XML 文件中提取一些 XML 这是我读过的大多数示例的假设 XML 来自排列如下的表 ID XML string XML 字符串包含 n 行 始终包含最多 10 个属性中的至少一个
  • 如何在 Python 中写入原始二进制数据?

    我有一个 Python 程序 可以存储数据并将数据写入文件 数据是原始二进制数据 内部存储为str 我正在通过 utf 8 编解码器将其写出来 但是 我得到UnicodeDecodeError charmap codec can t dec
  • 使用 ASP.Net 2.0 创建 SOAP 请求

    我正在与服务器网站的技术联系人交谈 他希望我使用 Visual Studio 而我只想手写脚本 请参阅下文了解我需要生成的 SOAP 请求 我已将实际 URL 替换为虚拟 URL 正如您可能猜到的那样 我对 ASP 和 SOAP 还很陌生
  • Wagtail 文档:大文件(>2GB)上传失败

    我正在尝试使用 Wagtail 应用程序中内置的 wagtaildocs 应用程序上传文件 我已经使用 Nginx 的 Digital Ocean 教程方法设置了 Ubuntu 16 04 服务器 鳐鱼 Postgres 一些初步澄清 在我