How can code inside a Docker container load data into MongoDB running on the host of the same machine?

2024-04-23

I am running a PyTorch docker container on an Ubuntu 18.02 machine with the following command:

# Run Pytorch container image
docker run -it -v /home/ubuntu/Downloads/docker_work/test_py_app/app:/workspace/app -p 8881:8888 -p 5002:5002 --gpus all --rm nvcr.io/nvidia/pytorch:20.08-py3

On the same machine I have MongoDB running with the following details:

database_name = 'data_analytics'
collection_name = 'TestDB'
server = 'localhost'
mongodb_port = 27017

To test the code, I ran the following outside Docker on the local machine. It creates or updates the existing collection and works perfectly:

import glob
import json
import os

import pandas as pd
import pymongo
from pymongo import MongoClient


def dataframe_cleaner(csv_path):
    df = pd.read_csv(csv_path)
    # The pattern is a character class, so it must be treated as a regex
    df.columns = df.columns.str.replace('[#,@,&,.]', '', regex=True)
    df.columns = df.columns.str.replace(' ', '_')
    df.columns = [x.lower() for x in df.columns]
    return df


def mongo_loader(dataframe, db_name, collection_name, server, mongodb_port):
    client = MongoClient(server, int(mongodb_port))
    db = client[db_name]
    # print(db)

    records = json.loads(dataframe.T.to_json()).values()
    db.TestDB.insert(records)
    return True

csv_path = '/home/ubuntu/Downloads/test.csv'


database_name = 'data_analytics'
collection_name = 'TestDB'
server = 'localhost'
mongodb_port = 27017

df = dataframe_cleaner(csv_path)
criteria = mongo_loader(df, database_name, collection_name, server, mongodb_port)

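Following the suggestion here https://stackoverflow.com/questions/52314900/how-to-connect-local-mongo-database-to-docker/52318134, I changed server = 'localhost' to server = 'host.docker.internal' and ran the same code inside the container to read the CSV file and push the data to MongoDB running outside Docker on the same host. It did not help, and I still get the same error: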

/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:30: DeprecationWarning: insert is deprecated. Use insert_one or insert_many instead.
---------------------------------------------------------------------------
ServerSelectionTimeoutError               Traceback (most recent call last)
<ipython-input-4-458f690221ff> in <module>
     41 
     42 df = dataframe_cleaner(csv_path)
---> 43 criteria = mongo_loader(df, database_name, collection_name, server, mongodb_port)
     44 
     45 #if criteria is True:

<ipython-input-4-458f690221ff> in mongo_loader(dataframe, db_name, collection_name, server, mongodb_port)
     28 
     29     records = json.loads(dataframe.T.to_json()).values()
---> 30     db.TestDB.insert(records)
     31     return True
     32 

/opt/conda/lib/python3.6/site-packages/pymongo/collection.py in insert(self, doc_or_docs, manipulate, check_keys, continue_on_error, **kwargs)
   3292             write_concern = WriteConcern(**kwargs)
   3293         return self._insert(doc_or_docs, not continue_on_error,
-> 3294                             check_keys, manipulate, write_concern)
   3295 
   3296     def update(self, spec, document, upsert=False, manipulate=False,

/opt/conda/lib/python3.6/site-packages/pymongo/collection.py in _insert(self, docs, ordered, check_keys, manipulate, write_concern, op_id, bypass_doc_val, session)
    647         blk.ops = [(message._INSERT, doc) for doc in gen()]
    648         try:
--> 649             blk.execute(write_concern, session=session)
    650         except BulkWriteError as bwe:
    651             _raise_last_error(bwe.details)

/opt/conda/lib/python3.6/site-packages/pymongo/bulk.py in execute(self, write_concern, session)
    526                 self.execute_no_results(sock_info, generator)
    527         else:
--> 528             return self.execute_command(generator, write_concern, session)
    529 
    530 

/opt/conda/lib/python3.6/site-packages/pymongo/bulk.py in execute_command(self, generator, write_concern, session)
    356 
    357         client = self.collection.database.client
--> 358         with client._tmp_session(session) as s:
    359             client._retry_with_session(
    360                 self.is_retryable, retryable_bulk, s, self)

/opt/conda/lib/python3.6/contextlib.py in __enter__(self)
     79     def __enter__(self):
     80         try:
---> 81             return next(self.gen)
     82         except StopIteration:
     83             raise RuntimeError("generator didn't yield") from None

/opt/conda/lib/python3.6/site-packages/pymongo/mongo_client.py in _tmp_session(self, session, close)
   1827             return
   1828 
-> 1829         s = self._ensure_session(session)
   1830         if s:
   1831             try:

/opt/conda/lib/python3.6/site-packages/pymongo/mongo_client.py in _ensure_session(self, session)
   1814             # Don't make implicit sessions causally consistent. Applications
   1815             # should always opt-in.
-> 1816             return self.__start_session(True, causal_consistency=False)
   1817         except (ConfigurationError, InvalidOperation):
   1818             # Sessions not supported, or multiple users authenticated.

/opt/conda/lib/python3.6/site-packages/pymongo/mongo_client.py in __start_session(self, implicit, **kwargs)
   1764 
   1765         # Raises ConfigurationError if sessions are not supported.
-> 1766         server_session = self._get_server_session()
   1767         opts = client_session.SessionOptions(**kwargs)
   1768         return client_session.ClientSession(

/opt/conda/lib/python3.6/site-packages/pymongo/mongo_client.py in _get_server_session(self)
   1800     def _get_server_session(self):
   1801         """Internal: start or resume a _ServerSession."""
-> 1802         return self._topology.get_server_session()
   1803 
   1804     def _return_server_session(self, server_session, lock):

/opt/conda/lib/python3.6/site-packages/pymongo/topology.py in get_server_session(self)
    486                             any_server_selector,
    487                             self._settings.server_selection_timeout,
--> 488                             None)
    489                 elif not self._description.readable_servers:
    490                     self._select_servers_loop(

/opt/conda/lib/python3.6/site-packages/pymongo/topology.py in _select_servers_loop(self, selector, timeout, address)
    215                 raise ServerSelectionTimeoutError(
    216                     "%s, Timeout: %ss, Topology Description: %r" %
--> 217                     (self._error_message(selector), timeout, self.description))
    218 
    219             self._ensure_opened()

ServerSelectionTimeoutError: host.docker.internal:27017: [Errno -2] Name or service not known, Timeout: 30s, Topology Description: <TopologyDescription id: 601a3b8e6563d1163530d9c1, topology_type: Single, servers: [<ServerDescription ('host.docker.internal', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('host.docker.internal:27017: [Errno -2] Name or service not known',)>]>

Please help!


Aakash, it is not clear to me whether the MongoDB server is running as a docker container or as a standard application on the docker host.

Docker runs multiple networks, possibly with different drivers, so you have to attach the pytorch container to a network that can reach the MongoDB instance.
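Before changing the network settings, it can help to confirm from inside the container which part of the connection is failing. The sketch below uses only the Python standard library. The function name check_endpoint and its return labels are made up for illustration and are not part of any library. A result of "dns_failure" corresponds to the "Name or service not known" message in the traceback from the question.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Classify why a TCP endpoint is or is not reachable.

    Returns one of "ok", "dns_failure", or "unreachable".
    (Hypothetical helper for illustration only.)
    """
    try:
        # Resolve the name first so DNS problems are reported separately.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"
    try:
        # Attempt a plain TCP connection; nothing MongoDB-specific is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return "unreachable"
```

Calling check_endpoint('host.docker.internal', 27017) and check_endpoint('localhost', 27017) inside the container should show whether name resolution or the port itself is the problem.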

If MongoDB is running as an application on the host, add a --network="host" flag to your pytorch command.

docker run -it -v /home/ubuntu/Downloads/docker_work/test_py_app/app:/workspace/app -p 8881:8888 -p 5002:5002 --gpus all --network="host" --rm nvcr.io/nvidia/pytorch:20.08-py3

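This tells docker to bind the pytorch container to the host's real network interfaces, so MongoDB can be reached at localhost:27017. Note that in host network mode Docker ignores the -p port mappings, so the container's ports are opened directly on the host.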
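With host networking, the loader from the question can keep server = 'localhost'. The following is a minimal sketch, not the asker's exact code: it replaces the deprecated insert call with insert_many, uses the collection_name argument instead of the hard-coded TestDB attribute, and sets a shorter server selection timeout so a misconfigured network fails quickly. The helper names connect and insert_records and the 5-second timeout are assumptions made for illustration.

```python
def connect(host="localhost", port=27017, timeout_ms=5000):
    """Open a MongoClient and verify that the server answers a ping."""
    # Imported lazily so the rest of the module works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(host, int(port), serverSelectionTimeoutMS=timeout_ms)
    # MongoClient connects lazily; "ping" forces a round trip and raises
    # ServerSelectionTimeoutError if no server is reachable in time.
    client.admin.command("ping")
    return client


def insert_records(client, db_name, collection_name, records):
    """Insert an iterable of dict documents; return how many were inserted."""
    docs = list(records)
    if not docs:
        # insert_many rejects an empty list, so skip the call entirely.
        return 0
    result = client[db_name][collection_name].insert_many(docs)
    return len(result.inserted_ids)
```

In the question's code this could be called as insert_records(connect(server, mongodb_port), database_name, collection_name, records), with records built the same way as before.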

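If MongoDB is running as a docker container, either make sure its port was published to the outside world when you started it, or run the pytorch container on the same virtual network as MongoDB.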

To simply expose the port, make sure the -p 27017:27017 flag is present in the docker run command for MongoDB.

To use the same virtual network, check the Networks key in the output of the docker inspect MONGO_CONTAINER_ID command and pass the same name as --network="name" when you start the pytorch container.
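The network names can also be pulled out of the docker inspect output programmatically. The sketch below assumes the usual JSON layout: a list with one object per container, whose NetworkSettings.Networks key maps network names to their settings. The function name network_names and the sample data are made up for illustration.

```python
import json


def network_names(inspect_output):
    """Map each container name in `docker inspect` JSON to its sorted network names."""
    containers = json.loads(inspect_output)
    names = {}
    for container in containers:
        # "Networks" may be missing or null, so fall back to an empty dict.
        networks = container.get("NetworkSettings", {}).get("Networks") or {}
        names[container.get("Name", "")] = sorted(networks)
    return names


# Hypothetical, heavily trimmed example of `docker inspect` output.
sample = '[{"Name": "/mongo", "NetworkSettings": {"Networks": {"bridge": {}}}}]'
print(network_names(sample))
```

Any name this reports for the MongoDB container is a candidate value for the --network flag of the pytorch container.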

For more information, see the Docker networking manual https://docs.docker.com/network/.
