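I think I've found a solution for you:
The logging module is built to be thread-safe https://docs.python.org/2/library/logging.html#thread-safety:
"The logging module is intended to be thread-safe without any special work needing to be done by its clients. It achieves this through using threading locks; there is one lock to serialize access to the module's shared data, and each handler also creates a lock to serialize access to its underlying I/O."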
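Fortunately, it exposes the second of those locks through a public API: Handler.acquire() https://docs.python.org/2/library/logging.html#logging.Handler.acquire lets you acquire the lock of a particular log handler (and Handler.release() https://docs.python.org/2/library/logging.html#logging.Handler.release releases it again). Acquiring that lock blocks every other thread that tries to log a record that would be handled by this handler, until the lock is released.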
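As a rough illustration of that blocking behaviour, here is a minimal sketch that needs no network access. It assumes Python 3 (for io.StringIO with text), and the names demo_logger, demo_handler and worker are made up for this example:

```python
import io
import logging
import threading

# Hypothetical names for this sketch: demo_logger, demo_handler, worker.
stream = io.StringIO()
demo_handler = logging.StreamHandler(stream)
demo_logger = logging.getLogger("handler_lock_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(demo_handler)


def worker():
    # Blocks until the main thread releases the handler lock.
    demo_logger.info("from worker")


demo_handler.acquire()
try:
    t = threading.Thread(target=worker)
    t.start()
    t.join(0.5)                       # give the worker time to reach the lock
    blocked = t.is_alive()            # the worker is still waiting
    written_while_locked = stream.getvalue()
finally:
    demo_handler.release()

t.join()                              # the worker can now log and finish
written_after_release = stream.getvalue()
```

While the lock is held, the worker thread stays stuck inside its logging call and nothing reaches the stream; once the lock is released, the record is written.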
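This allows you to manipulate the handler's state in a thread-safe way. One caveat: because the lock is intended to guard the handler's I/O operations, it is only acquired around emit(). So the lock is taken only once a record has made it through the filters and log levels and is about to be emitted by that particular handler. That's why I had to subclass a handler and create SilencableHandler.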
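To make the caveat concrete, the sketch below (again assuming Python 3, with the made-up names demo_logger, demo_handler and worker) holds the handler lock while another thread logs a record that is rejected by the logger's level. That call returns immediately, because the record never gets as far as the handler's lock:

```python
import io
import logging
import threading

# Hypothetical names for this sketch: demo_logger, demo_handler, worker.
stream = io.StringIO()
demo_handler = logging.StreamHandler(stream)
demo_logger = logging.getLogger("lock_scope_demo")
demo_logger.setLevel(logging.INFO)    # DEBUG records are dropped by the logger
demo_logger.propagate = False
demo_logger.addHandler(demo_handler)

finished = threading.Event()


def worker():
    # Rejected by the level check, so the handler lock is never requested.
    demo_logger.debug("dropped before reaching the handler")
    finished.set()


demo_handler.acquire()
try:
    t = threading.Thread(target=worker)
    t.start()
    # The worker completes even though the main thread holds the lock.
    finished_while_locked = finished.wait(2.0)
finally:
    demo_handler.release()
t.join()
```

This is why a flag checked inside emit() is a convenient place to do the silencing: emit() is the part that is guaranteed to run with the lock held.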
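So the idea is this:
- Get the topmost logger for the requests module and stop it from propagating
- Create your custom SilencableHandler and add it to the requests logger
- Use the Silenced context manager to selectively silence the SilencableHandler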
main.py
from Queue import Queue  # Python 2; on Python 3 use: from queue import Queue
from threading import Thread
from usercode import fetch_url
import logging
import requests
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)


class SilencableHandler(logging.StreamHandler):
    """StreamHandler that drops records while `silenced` is set."""

    def __init__(self, *args, **kwargs):
        super(SilencableHandler, self).__init__(*args, **kwargs)
        self.silenced = False

    def emit(self, record):
        # emit() is called with the handler lock held.
        if not self.silenced:
            super(SilencableHandler, self).emit(record)


# Route everything logged under 'requests' through our own handler only.
requests_logger = logging.getLogger('requests')
requests_logger.propagate = False
requests_handler = SilencableHandler()
requests_logger.addHandler(requests_handler)


class Silenced(object):
    """Context manager that holds the handler lock and silences the handler."""

    def __init__(self, handler):
        self.handler = handler

    def __enter__(self):
        log.info("Silencing requests logger...")
        self.handler.acquire()
        self.handler.silenced = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.handler.silenced = False
        self.handler.release()
        log.info("Requests logger unsilenced.")


NUM_THREADS = 2
queue = Queue()
URLS = [
    'http://www.stackoverflow.com',
    'http://www.stackexchange.com',
    'http://www.serverfault.com',
    'http://www.superuser.com',
    'http://travel.stackexchange.com',
]

# Start the worker threads that fetch URLs from the queue.
for i in range(NUM_THREADS):
    worker = Thread(target=fetch_url, args=(i, queue))
    worker.daemon = True
    worker.start()

for url in URLS:
    queue.put(url)

log.info('Starting long API request...')
with Silenced(requests_handler):
    # Other threads block as soon as they try to log through requests_handler.
    time.sleep(5)
    requests.get('http://www.example.org/api')
    time.sleep(5)
log.info('Done with long API request.')

queue.join()
usercode.py
import logging
import requests
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)


def fetch_url(i, q):
    while True:
        url = q.get()
        response = requests.get(url)
        logging.info("{}: {}".format(response.status_code, url))
        time.sleep(i + 2)
        q.task_done()
Example output:
(Notice how the call to http://www.example.org/api isn't logged, and all threads that try to log requests are blocked for the first 10 seconds.)
INFO:__main__:Starting long API request...
INFO:__main__:Silencing requests logger...
INFO:__main__:Requests logger unsilenced.
INFO:__main__:Done with long API request.
Starting new HTTP connection (1): www.stackoverflow.com
Starting new HTTP connection (1): www.stackexchange.com
Starting new HTTP connection (1): stackexchange.com
Starting new HTTP connection (1): stackoverflow.com
INFO:root:200: http://www.stackexchange.com
INFO:root:200: http://www.stackoverflow.com
Starting new HTTP connection (1): www.serverfault.com
Starting new HTTP connection (1): serverfault.com
INFO:root:200: http://www.serverfault.com
Starting new HTTP connection (1): www.superuser.com
Starting new HTTP connection (1): superuser.com
INFO:root:200: http://www.superuser.com
Starting new HTTP connection (1): travel.stackexchange.com
INFO:root:200: http://travel.stackexchange.com
The threading code is based on Doug Hellmann's articles on threading http://pymotw.com/2/threading/ and queues http://pymotw.com/2/Queue/index.html#module-Queue.
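If you want to check the silencing behaviour without making any HTTP requests, a single-threaded sketch like the following may help. It assumes Python 3, repeats the two classes from main.py without the log.info calls, and uses a made-up logger name (silenced_demo) that writes to an in-memory stream:

```python
import io
import logging


class SilencableHandler(logging.StreamHandler):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.silenced = False

    def emit(self, record):
        if not self.silenced:
            super().emit(record)


class Silenced(object):
    def __init__(self, handler):
        self.handler = handler

    def __enter__(self):
        self.handler.acquire()
        self.handler.silenced = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.handler.silenced = False
        self.handler.release()


# Hypothetical logger name and in-memory stream, used only for this check.
stream = io.StringIO()
handler = SilencableHandler(stream)
demo_logger = logging.getLogger("silenced_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(handler)

demo_logger.info("before")
with Silenced(handler):
    # Same thread: the handler lock is re-entrant, so this does not
    # deadlock, and emit() discards the record.
    demo_logger.info("hidden")
demo_logger.info("after")

captured = stream.getvalue()
```

Only the records logged outside the with block end up in the stream. Logging from the thread that holds the lock does not deadlock because the handler lock is a threading.RLock.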