For my own records; reposted from https://www.coder.work/article/4985246
import tensorrt as trt
import pycuda.autoinit  # Creates a CUDA context on the main thread
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Main thread
with open("sample.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
…

# Worker thread
with engine.create_execution_context() as context:
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)
    common.do_inference(context, inputs, outputs, bindings, stream)
The code above produces the following error:
pycuda._driver.LogicError: explicit_context_dependent failed: invalid device context - no currently active context?
This sounds as if no CUDA context is active in the worker thread, so I tried creating one manually there:
# Worker thread
import pycuda.driver as cuda
from pycuda.tools import make_default_context

cuda.init()                   # Initialize the CUDA driver
ctx = make_default_context()  # Create a CUDA context on this thread
with engine.create_execution_context() as context:
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)
    common.do_inference(context, inputs, outputs, bindings, stream)
ctx.pop()  # Clean up
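The error arises because the CUDA driver keeps a separate context stack per OS thread: the context that pycuda.autoinit pushes in the main thread is simply not visible from the worker thread. A minimal stand-in sketch of that behavior (pure Python, no GPU needed; the ContextStack class below is a hypothetical model of the driver's bookkeeping, not part of PyCUDA):

```python
import threading

class ContextStack:
    """Models the CUDA driver's per-thread context stack."""
    def __init__(self):
        self._local = threading.local()

    def _stack(self):
        if not hasattr(self._local, "stack"):
            self._local.stack = []
        return self._local.stack

    def push(self, ctx):
        self._stack().append(ctx)

    def current(self):
        stack = self._stack()
        return stack[-1] if stack else None

driver = ContextStack()
driver.push("main-thread-context")  # roughly what pycuda.autoinit does

seen_in_worker = []
def worker():
    # The worker thread has its own (empty) stack: no context is active here.
    seen_in_worker.append(driver.current())

t = threading.Thread(target=worker)
t.start(); t.join()

print(driver.current())   # → main-thread-context (still active in main thread)
print(seen_in_worker[0])  # → None, i.e. "no currently active context"
```

This is why creating (or pushing) a context inside the worker thread itself, as above, makes the error go away.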
To run inference from multiple threads, modify common.py as follows. Create a context before launching any GPU work:
dev = cuda.Device(0)  # 0 is your GPU number
ctx = dev.make_context()
and clean up after the GPU work is done:
ctx.pop()
del ctx
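If do_inference raises, a bare ctx.pop() at the end is skipped and the context leaks, so the create/clean-up pair is best wrapped in try/finally. A sketch of that shape, where FakeContext is a hypothetical stand-in for the object dev.make_context() returns, so the pattern stays runnable without a GPU:

```python
import threading

class FakeContext:
    """Stand-in for PyCUDA's context object: records push/pop order."""
    def __init__(self, log):
        self.log = log
        self.log.append("make_context")  # created *and* made current on this thread
    def pop(self):
        self.log.append("pop")

def worker(log):
    ctx = FakeContext(log)  # dev.make_context() in the real code
    try:
        log.append("inference")  # common.do_inference(...) would run here
        raise RuntimeError("inference failed")  # even if inference errors out...
    except RuntimeError:
        pass
    finally:
        ctx.pop()   # ...the context is still popped
        del ctx

log = []
t = threading.Thread(target=worker, args=(log,))
t.start(); t.join()
print(log)  # → ['make_context', 'inference', 'pop']
```

Each worker thread gets its own context this way, and cleanup is guaranteed regardless of how the GPU task exits.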