I have some documents that I need to fetch from MongoDB and set into memcache. Here is the code:
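import memcache
from pymongo import MongoClient

# mongo_client (a MongoClient) and memcache_obj (a memcache.Client) are
# created elsewhere; their connection details are not shown in this post.
db = mongo_client.job_db.JobParsedData
jobs = db.find().sort("JobId", 1)

def set_to_memcache_raw(jobs):
    print("Setting raw message to memcache")
    count = 0
    for item in jobs:
        job_id = item.get('JobId')
        job_details = item.get('JobDetails')
        # Skip documents whose JobId is missing or blank.
        if job_id and job_id.strip():
            count += 1
            memcache_obj.set(job_id, job_details, time=72000)
            # Report progress every 1000 inserted keys.
            if count % 1000 == 0:
                print("Inserted {} keys in memcache".format(count))
            # Stop after one million keys.
            if count >= 1000000:
                break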
However, after some seemingly arbitrary number of iterations, the code throws this error:
Traceback (most recent call last):
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 450, in receive_message
self.sock, operation, request_id, self.max_message_size)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/network.py", line 137, in receive_message
header = _receive_data_on_socket(sock, 16)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/network.py", line 164, in _receive_data_on_socket
chunk = sock.recv(length)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "memcache-poc.py", line 56, in <module>
elapsed = time.time() - t0
File "memcache-poc.py", line 52, in main
jobs = db.find(query)
File "memcache-poc.py", line 17, in set_to_memcache_raw
print("Setting raw message to memcache")
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 1114, in next
if len(self.__data) or self._refresh():
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 1056, in _refresh
self.__max_await_time_ms))
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 873, in __send_message
**kwargs)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/mongo_client.py", line 905, in _send_message_with_response
exhaust)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/mongo_client.py", line 916, in _reset_on_error
return func(*args, **kwargs)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/server.py", line 136, in send_message_with_response
response_data = sock_info.receive_message(1, request_id)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 452, in receive_message
self._raise_connection_failure(error)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 550, in _raise_connection_failure
_raise_connection_failure(self.address, error)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 211, in _raise_connection_failure
raise AutoReconnect(msg)
pymongo.errors.AutoReconnect: xxx.xxx.xxx.xxx:27017: [Errno 104] Connection reset by peer
I have already gone through links such as:
pymongo errors http://api.mongodb.com/python/current/api/pymongo/errors.html
MongoDB TCP keepalive https://docs.mongodb.com/manual/faq/diagnostics/#does-tcp-keepalive-time-affect-mongodb-deployments
Why does pymongo throw AutoReconnect? https://stackoverflow.com/questions/28809168/why-does-pymongo-throw-autoreconnect
As far as I can tell, there is no socket-inactivity problem in the code above, because my jobs object is an iterator, and each call to next() on it fetches the next document (from mongo itself).
My MongoDB is installed on the Azure cloud, and the instance's TCP keepalive time is 7200 seconds. I got this number by running this command:
sysctl net.ipv4.tcp_keepalive_time
7200
In this case, would wrapping the for loop in a try/except block help?
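
To make the try/except idea concrete, here is a minimal sketch of what I have in mind. Catching the error around the loop alone would probably not be enough, because the cursor that failed cannot simply be continued, so the sketch reopens the query after the last JobId it saw. The helper names (iter_with_resume, make_opener), the retry count, and the delay are hypothetical and not part of my current code, and the approach assumes JobId values are unique and sortable.

```python
import time

try:
    from pymongo.errors import AutoReconnect
except ImportError:
    # Stand-in so the sketch can run without pymongo installed.
    class AutoReconnect(Exception):
        pass


def iter_with_resume(open_cursor, key="JobId", max_retries=5, delay=1.0):
    """Yield documents in ascending `key` order, reopening after AutoReconnect.

    open_cursor(last_key) is a hypothetical callable that returns an iterable
    of documents sorted by `key`, restricted to keys greater than last_key
    (or all documents when last_key is None).
    """
    last_key = None
    retries = 0
    while True:
        try:
            for doc in open_cursor(last_key):
                last_key = doc[key]
                retries = 0  # progress was made, so reset the retry budget
                yield doc
            return  # cursor exhausted normally
        except AutoReconnect:
            retries += 1
            if retries > max_retries:
                raise  # give up after too many consecutive failures
            time.sleep(delay * retries)  # simple linear backoff


def make_opener(collection, key="JobId"):
    """Build an open_cursor callable for a pymongo collection (assumed usage)."""
    def open_cursor(last_key):
        query = {} if last_key is None else {key: {"$gt": last_key}}
        return collection.find(query).sort(key, 1)
    return open_cursor
```

With this in place, the call would become set_to_memcache_raw(iter_with_resume(make_opener(db))), with the body of set_to_memcache_raw unchanged. Documents with a missing JobId would need separate handling, since the sketch reads doc[key] directly.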