I am writing self-driving code using RL. I am using Stable Baselines3 and an OpenAI Gym environment. I ran the following code in a Jupyter notebook and got the error below:
# Testing our model
episodes = 5  # test the environment 5 times
for episode in range(1, episodes + 1):  # loop over each episode
    obs = env.reset()  # initial observation
    # The observation is passed to the model, which tells us
    # which action is best
    done = False
    score = 0
    while not done:
        env.render()
        # Use the model here; predict() returns the action and the next state
        action, _ = model.predict(obs)
        # Rather than taking a random action, we call model.predict(obs) on the
        # observation from the current env (a Box space) to get the action
        # expected to give the best possible reward
        obs, reward, done, info = env.step(action)  # gives state, reward whose value is 1
        # reward is 1 for every step, including the termination step
        score += reward
    print('Episode:{},Score:{}'.format(episode, score))
env.close()
Error
The code I wrote is available here: https://drive.google.com/file/d/1JBVmPLn-N1GCl_Rgb6-qGMpJyWvBaR1N/view?usp=sharing
I am using Python 3.8.13 in an Anaconda environment.
I am using the CPU version of PyTorch, and the operating system is Windows 10.
Please help me solve this problem.
Using .copy() on the numpy array should help (because PyTorch tensors cannot handle negative strides, see https://discuss.pytorch.org/t/negative-strides-in-tensor-error/134287/2):
action, _ = model.predict(obs.copy())
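As a rough illustration of what a negative stride is and why .copy() removes it, here is a minimal numpy-only sketch. The array below is a made-up stand-in for an image observation, not the actual output of your environment, and flipping with [::-1] is just one common way such a view can arise; I have not checked where it comes from in your notebook.

```python
import numpy as np

# Hypothetical stand-in for an image observation: (height, width, channels).
obs = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Reversing an axis returns a view on the same memory that walks it
# backwards, so the stride for that axis is negative.
flipped = obs[::-1]
print(flipped.strides)  # (-9, 3, 1)

# .copy() allocates a fresh C-contiguous array holding the same values,
# so every stride is positive again.
fixed = flipped.copy()
print(fixed.strides)  # (9, 3, 1)
```

The values are unchanged by the copy; only the memory layout differs, which is why passing obs.copy() to model.predict does not alter what the model sees.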
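Because of dependency issues I could not quickly run your notebook, but I hit the same error with the AI2THOR simulator, and adding .copy() helped.
Perhaps someone with more technical knowledge of numpy, torch, or AI2THOR can explain in more detail why the error occurs.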