QueueHandler for multiprocessing logging

2024-01-05

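I am trying to adapt my program so that different processes log to a single log file. I have been looking for a solution for many days without success. I think I still don't understand how the queue handler works. As I see it, the process goes like this:

  • create q
  • add a qHandler to the main logger
  • all logs are redirected to q, and q then uses the other handlers attached to the logger (via logger.handle(record)).

I created a simplified version of the program to illustrate the logger's behavior:
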
# logger.py

import logging
   
def listener_configurer():
    """This sets the settings for the root logger. The highest in the hierarchy. 
    All the handlers added to this root logger are available for all the subloggers.
    """
    root = logging.getLogger('main')
    file = logging.FileHandler(r'logs\temp.log', 'w')
    fmt = logging.Formatter('%(asctime)s %(processName)-10s %(name)s %(levelname)-8s %(message)s')
    stream = logging.StreamHandler()
    stream.setFormatter(fmt)
    file.setFormatter(fmt)
    root.addHandler(file)
    root.addHandler(stream)
    root.setLevel(logging.DEBUG)


def listener_process(queue):
    listener_configurer()
    while True:
        try:
            record = queue.get()
            if record is not None:
                print("-------------- using q ------------------ " + record.name + " -> " + record.message)
                logger = logging.getLogger(record.name)
                logger.handle(record)
            else:
                break
        except Exception:
            import sys, traceback
            logger.error('Whoops! Problem: %s', "problem", exc_info=1)
            traceback.print_exc(file=sys.stderr)

# saver.py (worker)
import logging
import typing

log = logging.getLogger('main.Saver')

class Saver:
    def __init__(self) -> None:
        log.warning("Instantiating a saver obj")
       
    def doStuff(self, input_line: typing.Tuple,) -> None:
        log.info(f"Exporting: {input_line}") # ASSUMING A TUPLE AS INPUT like: email, email_id, email_url
        (email, email_id, email_url, *other) = input_line
        log.info("Source URL: " + email_url)
        log.info(f"EmailName: {email}")
        log.warning(f"EmailID: {email_id}")
        log.debug("Exporting done!")

# manager.py
import logging
import logging.config
import logging.handlers
import multiprocessing
import logger
from saver import Saver

class Manager:

    def __init__(self) -> None:
        ### LOGGER
        # initializing listener -> this queue is going to be used for the multiprocessing logging
        self.queue = multiprocessing.Queue(-1)
        self.log = self.root_configurer(self.queue)  # getting a reference to the root logger -> used to log from this module
        self.listener = multiprocessing.Process(target=logger.listener_process, args=(self.queue,))
        self.listener.start()
        # utils
        self.log.info(f"Starting program at 10 am")
        # instantiate
        self.save = Saver()

    def root_configurer(self, queue):
        root = logging.getLogger('main')
        h = logging.handlers.QueueHandler(queue)  # Just the one handler needed
        root.setLevel(logging.DEBUG)
        root.addHandler(h)
        return root # this is the main function -> we need to retrieve the root logger here

    def run(self):
        tuples = [("email1","id1","url1",""), ("email2","id2","url2",""), ("email3","id3","url3",""), ("email4","id4","url4",""), ("email4","id4","url4","")]
        procs = []
        for res in tuples:
            proc = multiprocessing.Process(target=self.save.doStuff, args=(res,)) 
            procs.append(proc)
            proc.start()
        # complete the processes
        for proc in procs:
            proc.join()
        
        self.log.debug("We reached this part!")
        # close listener
        self.queue.put_nowait(None)
        self.listener.join()   

if __name__ == "__main__":
    m = Manager()
    m.run()

What I expected was a bunch of lines like this:

-------- using q -------------  main.saver INFO Source URL: ...
-------- using q -------------  main.saver INFO EmailName ...
-------- using q -------------  main.saver WARNING EmailID
-------- using q -------------  main.saver DEBUG ....

plus all of these lines written to the log file. For some reason I get:

EmailID: id4
EmailID: id3
EmailID: id2
-------------- using q ------------------ main -> Starting program at 10 am
2021-07-01 11:42:16,385 MainProcess main INFO     Starting program at 10 am      
-------------- using q ------------------ main.Saver -> Instantiating a saver obj
2021-07-01 11:42:16,386 MainProcess main.Saver WARNING  Instantiating a saver obj
EmailID: id4
EmailID: id1
-------------- using q ------------------ main -> We reached this part!
2021-07-01 11:42:16,852 MainProcess main DEBUG    We reached this part!

and a file like this:

2021-07-01 11:42:16,385 MainProcess main INFO     Starting program at 10 am
2021-07-01 11:42:16,386 MainProcess main.Saver WARNING  Instantiating a saver obj
2021-07-01 11:42:16,852 MainProcess main DEBUG    We reached this part!

Any ideas?

EDIT: The code is taken from a combination of the following:

  • https://docs.python.org/3/howto/logging-cookbook.html#a-more-elaborate-multiprocessing-example

and

  • https://fanchenbao.medium.com/python3-logging-with-multiprocessing-f51f460b8778

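Your workers do not write to the queue.

Your code appears to be based on the Logging Cookbook section "Logging to a single file from multiple processes" (https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes). There you can see that the workers take the queue as an argument in order to configure themselves (via worker_configurer). In your code, you configure only the Manager, not the workers.

Simply adding self.queue to the Process args, and copying (with slight edits) the root_configurer method into saver.py so that it is called when doStuff starts, is enough to make it work as expected.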
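
A minimal sketch of that change, assuming the names from the question (the 'main' logger, Saver.doStuff). worker_configurer and run_workers are hypothetical helpers modeled on the Cookbook's worker_configurer, not code from the question. Each child process attaches its own QueueHandler before it logs anything. Without that, a spawned child's 'main' logger has no handlers, which would be consistent with the bare "EmailID: ..." lines in the output: they look like the work of logging's last-resort handler, which prints only WARNING and above to stderr.

```python
# saver.py (worker), sketch only
import logging
import logging.handlers
import multiprocessing
import typing

log = logging.getLogger('main.Saver')


def worker_configurer(queue):
    """Attach a QueueHandler to the 'main' logger of the current process.

    Hypothetical helper: an edited copy of Manager.root_configurer.
    """
    main_logger = logging.getLogger('main')
    # Skip if a QueueHandler is already present (for example, inherited
    # through fork, or because this function ran earlier in this process).
    already = any(
        isinstance(h, logging.handlers.QueueHandler)
        for h in main_logger.handlers
    )
    if not already:
        main_logger.addHandler(logging.handlers.QueueHandler(queue))
    main_logger.setLevel(logging.DEBUG)
    return main_logger


class Saver:
    def doStuff(self, queue, input_line: typing.Tuple) -> None:
        # Runs inside the child process, so the handler is set up there.
        worker_configurer(queue)
        log.info(f"Exporting: {input_line}")
        (email, email_id, email_url, *other) = input_line
        log.info("Source URL: " + email_url)
        log.info(f"EmailName: {email}")
        log.warning(f"EmailID: {email_id}")
        log.debug("Exporting done!")


def run_workers(queue, rows):
    """Hypothetical stand-in for the loop in Manager.run.

    The only change from the question is that the queue is passed along.
    """
    saver = Saver()
    procs = [
        multiprocessing.Process(target=saver.doStuff, args=(queue, row))
        for row in rows
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

In Manager.run this would correspond to passing args=(self.queue, res) instead of args=(res,).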



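Off-topic nitpicks (you didn't ask for them, but they're free!):

  • The "root logger" is not a logger you name yourself; you get it by calling logging.getLogger() (with no arguments). The logger "main" is therefore not the root. Consider calling it main_logger instead.
  • Leave a comment explaining why you break when record is None; at first I thought it was a bug.
  • If an error occurs the first time you get a record from the queue, you have never set the logger variable, so your exception handler will raise UnboundLocalError before anything is written to stderr.
  • Credit code you did not write: what you submitted is largely based on the Logging Cookbook example.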
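
The second and third nitpicks could be addressed roughly as follows. This is a sketch, not the question's code: the optional configurer parameter is an assumption added to keep the example self-contained (the question calls listener_configurer() directly), and the error path follows the Cookbook's approach of writing straight to stderr instead of using a logger variable.

```python
# logger.py, sketch of the listener loop only
import logging
import sys
import traceback


def listener_process(queue, configurer=None):
    """Handle records from `queue` until the None sentinel arrives."""
    if configurer is not None:
        configurer()  # e.g. listener_configurer from the question
    while True:
        try:
            record = queue.get()
            if record is None:
                # None is the shutdown sentinel the manager puts on the
                # queue once all workers have finished.
                break
            logging.getLogger(record.name).handle(record)
        except Exception:
            # No logger variable is referenced here, so this handler
            # cannot fail with UnboundLocalError.
            print('Whoops! Problem:', file=sys.stderr)
            traceback.print_exc(file=sys.stderr)
```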