Can you patch just a nested function with a closure, or must the whole outer function be repeated?

2023-12-19

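A third-party library we use contains a rather long function that uses a nested function inside it. Our use of that library triggers a bug in that function, and we would very much like to fix that bug.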

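Unfortunately, the library maintainers are somewhat slow with fixes, but we don't want to fork the library. We also cannot hold our release until they have fixed the issue.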

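We would prefer to use monkey-patching to address the issue, as that is easier to track than patching the source itself. However, repeating a very large function when replacing just the inner function would be enough feels like overkill, and makes it harder for others to see what exactly we changed. Are we stuck with a static patch to the library egg?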

The inner function relies on closing over a variable; a contrived example would be:

def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)

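What we want to replace is just the implementation of innerfunction(). The actual outer function is far longer. We would reuse the closed-over variable and keep the function signature, of course.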

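Yes, you can replace an inner function, even if it uses a closure. You will have to jump through a few hoops, though. Please take into account:

1. You need to create the replacement function as a nested function too, to ensure that Python creates the same closure. If the original function has a closure over the names foo and bar, you need to define your replacement as a nested function that closes over the same names. More importantly, you need to use those names in the same order; closures are referenced by index.
2. Monkey patching is always fragile and can break when the implementation changes. This is no exception. Retest your monkey patch whenever you change versions of the patched library.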

Code objects

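To understand how this works, I'll first explain how Python handles nested functions. Python uses code objects to produce function objects as needed. Each code object has an associated constants sequence, and the code objects for nested functions are stored in that sequence: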

>>> def outerfunction(*args):
...     def innerfunction(val):
...         return someformat.format(val)
...     someformat = 'Foo: {}'
...     for arg in args:
...         yield innerfunction(arg)
... 
>>> outerfunction.__code__
<code object outerfunction at 0x105b27ab0, file "<stdin>", line 1>
>>> outerfunction.__code__.co_consts
(None, <code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>, 'outerfunction.<locals>.innerfunction', 'Foo: {}')

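The co_consts sequence is an immutable object, a tuple, so we cannot simply swap out the inner code object. I'll show later how to produce a new function object with just that code object replaced.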
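As a small illustration of the point above, here is a sketch that locates nested code objects by type rather than by a hard-coded index into co_consts. The helper name nested_code_objects is my own and not part of any library; the indexes inside co_consts can differ between Python versions, which is why filtering by type is the safer assumption.

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def nested_code_objects(func):
    """Map co_name to every code object stored in func's constants.

    Hypothetical helper for illustration only.
    """
    return {
        const.co_name: const
        for const in func.__code__.co_consts
        if isinstance(const, types.CodeType)
    }


nested = nested_code_objects(outerfunction)
consts_is_tuple = isinstance(outerfunction.__code__.co_consts, tuple)
print(sorted(nested), consts_is_tuple)
```

Because the constants are a tuple, any replacement has to build a new tuple and a new code object around it, which is what the rest of this answer does.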

How closures are handled

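Next, we need to cover closures. At compile time, Python determines that
a) someformat is not a local name in innerfunction, and that
b) it closes over the same name in outerfunction.
Python then not only generates bytecode that produces the correct name lookups, it also annotates the code objects of both the nested and the outer function to record that someformat is to be closed over: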

>>> outerfunction.__code__.co_cellvars
('someformat',)
>>> outerfunction.__code__.co_consts[1].co_freevars
('someformat',)

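You want to make sure that the replacement inner code object only ever lists those same names as free variables, and does so in the same order.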
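A minimal sketch of that check, assuming the same contrived outerfunction as above. The names freevars_match, create_good_inner and create_bad_inner are hypothetical and exist only for this illustration.

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def create_good_inner():
    someformat = None  # placeholder; only the name matters here
    def innerfunction(val):
        return someformat.format(val + 2)
    return innerfunction


def create_bad_inner():
    otherformat = None  # closes over a different name
    def innerfunction(val):
        return otherformat.format(val + 2)
    return innerfunction


def freevars_match(orig_code, new_code):
    """True if new_code's free variables are a prefix of orig_code's."""
    new_names = new_code.co_freevars
    return orig_code.co_freevars[:len(new_names)] == new_names


# Locate the original nested code object by type and name.
orig_inner = next(
    const for const in outerfunction.__code__.co_consts
    if isinstance(const, types.CodeType) and const.co_name == 'innerfunction')

good = freevars_match(orig_inner, create_good_inner().__code__)
bad = freevars_match(orig_inner, create_bad_inner().__code__)
print(good, bad)
```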

Closures are created at run time; the bytecode that produces them is part of the outer function:

>>> import dis
>>> dis.dis(outerfunction)
  2           0 LOAD_CLOSURE             0 (someformat)
              2 BUILD_TUPLE              1
              4 LOAD_CONST               1 (<code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>)
              6 LOAD_CONST               2 ('outerfunction.<locals>.innerfunction')
              8 MAKE_FUNCTION            8 (closure)
             10 STORE_FAST               1 (innerfunction)

# ... rest of disassembly omitted ...

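The LOAD_CLOSURE bytecode there creates a closure for the someformat variable; Python creates as many closures as the inner function uses, in the order the names are listed in the inner code object's co_freevars tuple. This is an important fact to remember for later. The function itself looks up these closures by position: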

>>> dis.dis(outerfunction.__code__.co_consts[1])
  3           0 LOAD_DEREF               0 (someformat)
              2 LOAD_METHOD              0 (format)
              4 LOAD_FAST                0 (val)
              6 CALL_METHOD              1
              8 RETURN_VALUE

The LOAD_DEREF opcode picked the closure at position 0 here to gain access to the someformat closure.

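In theory this also means you can use entirely different names for the closures in your inner function, but for debugging purposes it makes much more sense to stick to the same names. It also makes it easier to verify that the replacement function will slot in properly, as you can just compare the co_freevars tuples if you use the same names.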
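To make the positional pairing concrete, here is a small sketch that lines up co_freevars with the cells in a function's __closure__. The function make_inner is a hypothetical stand-in that returns the nested function so we can inspect it.

```python
def make_inner():
    someformat = 'Foo: {}'

    def innerfunction(val):
        return someformat.format(val)

    return innerfunction


inner = make_inner()

# co_freevars[i] names the variable held in __closure__[i];
# the two sequences are matched purely by position.
bindings = {
    name: cell.cell_contents
    for name, cell in zip(inner.__code__.co_freevars, inner.__closure__)
}
print(bindings)
```

The names only serve as labels; the lookup at run time goes through the cell at the matching index.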

`replace_inner_function()`

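Now for the swapping trick. Functions are objects like any other in Python, instances of a specific type. That type isn't normally exposed, but the type() call still returns it. The same applies to code objects, and both types even have documentation: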

>>> type(outerfunction)
<class 'function'>
>>> print(type(outerfunction).__doc__)
Create a function object.

  code
    a code object
  globals
    the globals dictionary
  name
    a string that overrides the name from the code object
  argdefs
    a tuple that specifies the default argument values
  closure
    a tuple that supplies the bindings for free variables
>>> type(outerfunction.__code__)
<class 'code'>
>>> print(type(outerfunction.__code__).__doc__)
code(argcount, posonlyargcount, kwonlyargcount, nlocals, stacksize,
      flags, codestring, constants, names, varnames, filename, name,
      firstlineno, lnotab[, freevars[, cellvars]])

Create a code object.  Not for the faint of heart.

(The exact argument count and docstring vary between Python versions; Python 3.0 added the kwonlyargcount argument, and Python 3.8 added posonlyargcount.)

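We'll use these type objects to produce a new code object with updated constants, and then a new function object with the updated code object; the following function is compatible with Python versions 2.7 through 3.8.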

def replace_inner_function(outer, new_inner):
    """Replace a nested function code object used by outer with new_inner

    The replacement new_inner must use the same name and must at most use the
    same closures as the original.

    """
    if hasattr(new_inner, '__code__'):
        # support both functions and code objects
        new_inner = new_inner.__code__

    # find original code object so we can validate the closures match
    ocode = outer.__code__
    function, code = type(outer), type(ocode)
    iname = new_inner.co_name
    orig_inner = next(
        const for const in ocode.co_consts
        if isinstance(const, code) and const.co_name == iname)

    # you can ignore later closures, but since they are matched by position
    # the new sequence must match the start of the old.
    assert (orig_inner.co_freevars[:len(new_inner.co_freevars)] ==
            new_inner.co_freevars), 'New closures must match originals'

    # replace the code object for the inner function
    new_consts = tuple(
        new_inner if const is orig_inner else const
        for const in outer.__code__.co_consts)

    # create a new code object with the new constants
    try:
        # Python 3.8 added code.replace(), so much more convenient!
        ncode = ocode.replace(co_consts=new_consts)
    except AttributeError:
        # older Python versions, argument counts vary so we need to check
        # for specifics.
        args = [
            ocode.co_argcount, ocode.co_nlocals, ocode.co_stacksize,
            ocode.co_flags, ocode.co_code,
            new_consts,  # replacing the constants
            ocode.co_names, ocode.co_varnames, ocode.co_filename,
            ocode.co_name, ocode.co_firstlineno, ocode.co_lnotab,
            ocode.co_freevars, ocode.co_cellvars,
        ]
        if hasattr(ocode, 'co_kwonlyargcount'):
            # Python 3+, insert after co_argcount
            args.insert(1, ocode.co_kwonlyargcount)
        # Python 3.8 adds co_posonlyargcount, but also has code.replace(), used above
        ncode = code(*args)

    # and a new function object using the updated code object
    return function(
        ncode, outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__
    )

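The function above validates that the new inner function (which can be passed in as either a code object or a function) will indeed use the same closures as the original. It then creates new code and function objects to match the old outer function object, but with the nested function (located by name) replaced with your monkey patch.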

Let's try it out

To demonstrate that all of the above works, let's replace innerfunction with one that increments each formatted value by 2:

>>> def create_inner():
...     someformat = None  # the actual value doesn't matter
...     def innerfunction(val):
...         return someformat.format(val + 2)
...     return innerfunction
... 
>>> new_inner = create_inner()

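The new inner function is created as a nested function too; this is important, as it ensures that Python will use the correct bytecode to look up the someformat closure. I used a return statement to extract the function object, but you could also look at create_inner.__code__.co_consts to grab the code object.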

Now we can patch the original outer function, swapping out just the inner function:

>>> new_outer = replace_inner_function(outerfunction, new_inner)
>>> list(outerfunction(6, 7, 8))
['Foo: 6', 'Foo: 7', 'Foo: 8']
>>> list(new_outer(6, 7, 8))
['Foo: 8', 'Foo: 9', 'Foo: 10']

The original function echoes the original values, but the values returned by the new function are incremented by 2.

You can even create a replacement inner function that uses fewer closures:

>>> def demo_outer():
...     closure1 = 'foo'
...     closure2 = 'bar'
...     def demo_inner():
...         print(closure1, closure2)
...     demo_inner()
...
>>> def create_demo_inner():
...     closure1 = None
...     def demo_inner():
...         print(closure1)
...
>>> replace_inner_function(demo_outer, create_demo_inner.__code__.co_consts[1])()
foo

In short

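So, to complete the picture:

1. Create your monkey-patch inner function as a nested function with the same closures, in the same order.
2. Use the replace_inner_function() above to produce a new outer function.
3. Monkey patch the original outer function to use the new outer function produced in step 2.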
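The three steps could be sketched end to end as follows. This is only an illustration under stated assumptions: fakelib is a hypothetical stand-in module built with types.ModuleType in place of the real third-party library, and replace_inner is a simplified variant of replace_inner_function() that relies on code.replace(), so it needs Python 3.8 or newer and skips the closure validation.

```python
import types


def replace_inner(outer, new_inner_code):
    """Simplified, Python 3.8+ only variant of replace_inner_function()."""
    ocode = outer.__code__
    # Swap the nested code object with the same name; keep everything else.
    new_consts = tuple(
        new_inner_code
        if isinstance(const, types.CodeType)
        and const.co_name == new_inner_code.co_name
        else const
        for const in ocode.co_consts)
    return types.FunctionType(
        ocode.replace(co_consts=new_consts), outer.__globals__,
        outer.__name__, outer.__defaults__, outer.__closure__)


# Hypothetical stand-in for the third-party library.
fakelib = types.ModuleType('fakelib')


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


fakelib.outerfunction = outerfunction


# Step 1: build the replacement as a nested function with the same closure.
def create_inner():
    someformat = None  # the actual value doesn't matter
    def innerfunction(val):
        return someformat.format(val + 2)
    return innerfunction


# Steps 2 and 3: produce the new outer function and patch it into the module.
fakelib.outerfunction = replace_inner(
    fakelib.outerfunction, create_inner().__code__)

result = list(fakelib.outerfunction(6, 7, 8))
print(result)
```

Note that code which imported the original function by name before the patch was applied keeps its reference to the old function, so apply the patch as early as possible.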
