Does PDFBox allow removing a single field from an AcroForm?

2024-03-31

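I'm using Apache PDFBox 2.0.8 (https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox/2.0.8) and am trying to remove a field, but I can't find a way to do it the way I can with iText: PdfStamper.getAcroFields().removeField("signature3").

What I'm trying to do: initially I have a PDF template with 3 digital signatures. In some cases I need only 2 signatures, so in those cases I need to remove the third signature from the template. It seems I can't do this with PDFBox. The closest thing I found is flattening the field, but the problem is that if I flatten one specific PDField (not the whole form, just a single field), all the other signatures lose their functionality and look as if they had been flattened too. Here is the code that does this: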

PDDocument document = PDDocument.load(file);
PDDocumentCatalog documentCatalog = document.getDocumentCatalog();
PDAcroForm acroForm = documentCatalog.getAcroForm();

List<PDField> flattenList = new ArrayList<>();
for (PDField field : acroForm.getFieldTree()) {
    if (field instanceof PDSignatureField && "signature3".equals(field.getFullyQualifiedName())) {
        flattenList.add(field);
    }
}

acroForm.flatten(flattenList, true);
document.save(dest);        
document.close();

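As Tilman already mentioned in a comment, PDFBox has no method for removing a field from the field tree. It does, however, have methods for manipulating the underlying PDF structure, so you can write such a method yourself, for example like this: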

PDField removeField(PDDocument document, String fullFieldName) throws IOException {
    PDDocumentCatalog documentCatalog = document.getDocumentCatalog();
    PDAcroForm acroForm = documentCatalog.getAcroForm();

    if (acroForm == null) {
        System.out.println("No form defined.");
        return null;
    }

    PDField targetField = null;

    for (PDField field : acroForm.getFieldTree()) {
        if (fullFieldName.equals(field.getFullyQualifiedName())) {
            targetField = field;
            break;
        }
    }
    if (targetField == null) {
        System.out.println("Form does not contain field with given name.");
        return null;
    }

    PDNonTerminalField parentField = targetField.getParent();
    if (parentField != null) {
        List<PDField> childFields = parentField.getChildren();
        boolean removed = false;
        for (PDField field : childFields)
        {
            if (field.getCOSObject().equals(targetField.getCOSObject())) {
                removed = childFields.remove(field);
                parentField.setChildren(childFields);
                break;
            }
        }
        if (!removed)
            System.out.println("Inconsistent form definition: Parent field does not reference the target field.");
    } else {
        List<PDField> rootFields = acroForm.getFields();
        boolean removed = false;
        for (PDField field : rootFields)
        {
            if (field.getCOSObject().equals(targetField.getCOSObject())) {
                removed = rootFields.remove(field);
                break;
            }
        }
        if (!removed)
            System.out.println("Inconsistent form definition: Root fields do not include the target field.");
    }

    removeWidgets(targetField);

    return targetField;
}

void removeWidgets(PDField targetField) throws IOException {
    if (targetField instanceof PDTerminalField) {
        List<PDAnnotationWidget> widgets = ((PDTerminalField)targetField).getWidgets();
        for (PDAnnotationWidget widget : widgets) {
            PDPage page = widget.getPage();
            if (page != null) {
                List<PDAnnotation> annotations = page.getAnnotations();
                boolean removed = false;
                for (PDAnnotation annotation : annotations) {
                    if (annotation.getCOSObject().equals(widget.getCOSObject()))
                    {
                        removed = annotations.remove(annotation);
                        break;
                    }
                }
                if (!removed)
                    System.out.println("Inconsistent annotation definition: Page annotations do not include the target widget.");
            } else {
                System.out.println("Widget annotation does not have an associated page; cannot remove widget.");
                // TODO: In this case iterate all pages and try to find and remove widget in all of them
            }
        }
    } else if (targetField instanceof PDNonTerminalField) {
        List<PDField> childFields = ((PDNonTerminalField)targetField).getChildren();
        for (PDField field : childFields)
            removeWidgets(field);
    } else {
        System.out.println("Target field is neither terminal nor non-terminal; cannot remove widgets.");
    }
}

(RemoveField https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/form/RemoveField.java#L172 helper methods removeField and removeWidgets)

You can apply it to a document and a field like this:

PDDocument document = PDDocument.load(SOURCE_PDF);

PDField field = removeField(document, "Signature1");
Assert.assertNotNull("Field not found", field);

document.save(TARGET_PDF);        
document.close();

(RemoveField https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/form/RemoveField.java#L160 test testRemoveInvisibleSignature)


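PS: I'm not sure how much form-related information PDFBox actually caches somewhere. I would therefore advise against manipulating the form information any further in the same document manipulation session, at least not without testing.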
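One cautious way to act on this advice is to finish the session by saving the document and loading it again before touching the form any further, so that later work starts from a freshly parsed object model. The following is only a sketch: the class name ReloadAfterRemoval and the method saveAndReload are made up for illustration, and it assumes PDFBox 2.0.x, where PDDocument.load(byte[]) is available. Whether this round trip is actually needed for a given document is untested.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;

public class ReloadAfterRemoval {
    /**
     * Hypothetical helper (not part of PDFBox): writes the document to memory,
     * closes it, and returns a newly loaded copy. The caller must close the
     * returned document.
     */
    static PDDocument saveAndReload(PDDocument document) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        document.save(buffer);   // full (non-incremental) save, as in the code above
        document.close();        // the old object model is discarded here
        return PDDocument.load(buffer.toByteArray());
    }
}
```

With it, one would call removeField(document, ...) first and then continue with document = ReloadAfterRemoval.saveAndReload(document).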

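PPS: There is a TODO in the removeWidgets helper method. If the method outputs "Widget annotation does not have an associated page; cannot remove widget.", you will have to add the missing code.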
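A minimal sketch of what that missing code might look like follows, assuming PDFBox 2.0.x. The class name WidgetPageScan and the method removeWidgetFromAllPages are hypothetical and not part of the linked test class. Because removeWidgets as written has no access to the document, the PDDocument would have to be passed in as an extra parameter. The sketch compares the underlying COS objects, just as the helper methods above do.

```java
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;

public class WidgetPageScan {
    /**
     * Hypothetical fallback for widgets without a /P (page) entry: scans the
     * annotations of every page and removes each entry backed by the same
     * COS object as the given widget.
     *
     * @return the number of annotation entries removed
     */
    static int removeWidgetFromAllPages(PDDocument document, PDAnnotationWidget widget)
            throws IOException {
        int removedCount = 0;
        for (PDPage page : document.getPages()) {
            List<PDAnnotation> annotations = page.getAnnotations();
            boolean changed = false;
            // Walk backwards so that removing by index does not shift unvisited entries.
            for (int i = annotations.size() - 1; i >= 0; i--) {
                if (annotations.get(i).getCOSObject().equals(widget.getCOSObject())) {
                    annotations.remove(i);
                    removedCount++;
                    changed = true;
                }
            }
            if (changed) {
                // Defensive: write the list back in case it is not write-through.
                page.setAnnotations(annotations);
            }
        }
        return removedCount;
    }
}
```

In removeWidgets, the else branch holding the TODO could then call WidgetPageScan.removeWidgetFromAllPages(document, widget) and report an inconsistency if it returns 0.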
