How do I create a new Gym environment in OpenAI Gym?

2023-12-23

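My assignment is to build an AI agent that learns to play a video game using ML. I want to create a new environment with OpenAI Gym because I don't want to use an existing one. How can I create a new, custom environment?

Also, is there another way to start developing an AI agent that plays a specific video game without the help of OpenAI Gym?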

See my banana-gym (https://github.com/MartinThoma/banana-gym) for an extremely small environment.

Create new environments

See the repository's documentation page:

https://github.com/openai/gym/blob/master/docs/creating_environments.md

The steps are:

Create a new repository with a pip package structure

It should look like this

gym-foo/
  README.md
  setup.py
  gym_foo/
    __init__.py
    envs/
      __init__.py
      foo_env.py
      foo_extrahard_env.py
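The layout above can be scaffolded with a short script. This is a minimal sketch that uses only the standard library; the file contents follow the pattern shown in the linked creating_environments.md, while the ids `foo-v0` and `foo-extrahard-v0`, the version `0.0.1`, and the helper name `scaffold` are illustrative assumptions rather than anything this answer prescribes. The two environment modules are written as placeholders, so the package will not import until `FooEnv` and `FooExtraHardEnv` are actually defined in them.

```python
from pathlib import Path
import tempfile

# File contents modelled on gym's creating_environments.md.
# The ids and the version number are illustrative placeholders.
FILES = {
    "README.md": "# gym-foo\n",
    "setup.py": (
        "from setuptools import setup\n\n"
        "setup(name='gym_foo', version='0.0.1', install_requires=['gym'])\n"
    ),
    "gym_foo/__init__.py": (
        "from gym.envs.registration import register\n\n"
        "register(id='foo-v0', entry_point='gym_foo.envs:FooEnv')\n"
        "register(id='foo-extrahard-v0', "
        "entry_point='gym_foo.envs:FooExtraHardEnv')\n"
    ),
    "gym_foo/envs/__init__.py": (
        "from gym_foo.envs.foo_env import FooEnv\n"
        "from gym_foo.envs.foo_extrahard_env import FooExtraHardEnv\n"
    ),
    # Placeholders: the environment classes still have to be written.
    "gym_foo/envs/foo_env.py": "# define FooEnv(gym.Env) here\n",
    "gym_foo/envs/foo_extrahard_env.py": "# define FooExtraHardEnv(gym.Env) here\n",
}


def scaffold(root):
    """Create the gym-foo skeleton under `root` and return the relative paths."""
    base = Path(root) / "gym-foo"
    created = []
    for rel, text in FILES.items():
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
        created.append(path.relative_to(base).as_posix())
    return sorted(created)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in scaffold(tmp):
            print(name)
```

Once the environment classes are filled in, running `pip install -e .` inside `gym-foo/` should make the package importable.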

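For the contents of these files, follow the link above. What that page does not spell out is how some of the functions in foo_env.py should look. Looking at the examples and at https://gym.openai.com/docs/ helps. Here is an example: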

import gym


class FooEnv(gym.Env):
    metadata = {'render.modes': ['human']}

    def __init__(self):
        pass

    # Newer gym releases expect step/reset/render; older releases used the
    # underscore-prefixed names (_step, _reset, _render).
    def step(self, action):
        """

        Parameters
        ----------
        action :

        Returns
        -------
        ob, reward, episode_over, info : tuple
            ob (object) :
                an environment-specific object representing your observation of
                the environment.
            reward (float) :
                amount of reward achieved by the previous action. The scale
                varies between environments, but the goal is always to increase
                your total reward.
            episode_over (bool) :
                whether it's time to reset the environment again. Most (but not
                all) tasks are divided up into well-defined episodes, and
                episode_over being True indicates the episode has terminated.
                (For example, perhaps the pole tipped too far, or you lost your
                last life.)
            info (dict) :
                diagnostic information useful for debugging. It can sometimes
                be useful for learning (for example, it might contain the raw
                probabilities behind the environment's last state change).
                However, official evaluations of your agent are not allowed to
                use this for learning.
        """
        self._take_action(action)
        # self.env stands for the wrapped game backend, and hfo_py.IN_GAME for
        # its "still running" status; replace both with your game's own API.
        self.status = self.env.step()
        reward = self._get_reward()
        ob = self.env.getState()
        episode_over = self.status != hfo_py.IN_GAME
        return ob, reward, episode_over, {}

    def reset(self):
        pass

    def render(self, mode='human', close=False):
        pass

    def _take_action(self, action):
        pass

    def _get_reward(self):
        """ Reward is given for XY. """
        # FOOBAR and ABC are placeholder status constants.
        if self.status == FOOBAR:
            return 1
        elif self.status == ABC:
            return self.somestate ** 2
        else:
            return 0
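The skeleton above leaves most methods empty, so here is a filled-in toy version of the same step/reset contract. It is a minimal sketch, not part of the original answer: `CorridorEnv`, its parameters, and its reward rule are hypothetical, and the class deliberately does not import gym so that it runs on its own. In a real package it would subclass `gym.Env` and also declare `action_space` and `observation_space`.

```python
class CorridorEnv:
    """Toy 1-D corridor: start at cell 0, reach the last cell to get a reward.

    Hypothetical example. It mirrors the (ob, reward, episode_over, info)
    return value of step() described above, without depending on gym.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps_taken = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps_taken = 0
        return self.position

    def step(self, action):
        """Apply action 0 (left) or 1 (right) and return the 4-tuple."""
        if action not in (0, 1):
            raise ValueError("action must be 0 (left) or 1 (right)")
        self._take_action(action)
        self.steps_taken += 1
        reward = self._get_reward()
        at_goal = self.position == self.length - 1
        episode_over = at_goal or self.steps_taken >= self.max_steps
        return self.position, reward, episode_over, {}

    def _take_action(self, action):
        # Move one cell and clamp the position to the corridor bounds.
        delta = 1 if action == 1 else -1
        self.position = min(max(self.position + delta, 0), self.length - 1)

    def _get_reward(self):
        """Reward is given only for standing on the last cell."""
        return 1.0 if self.position == self.length - 1 else 0.0


if __name__ == "__main__":
    env = CorridorEnv(length=5)
    ob = env.reset()
    episode_over = False
    while not episode_over:
        ob, reward, episode_over, info = env.step(1)  # always walk right
        print(ob, reward, episode_over)
```

Keeping `_take_action` and `_get_reward` as separate helpers, as in the skeleton, makes it easy to change the reward rule without touching the bookkeeping in `step`.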

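Using your environment

import gym
import gym_foo  # importing the package runs its register() calls

# The id must match one registered in gym_foo/__init__.py
env = gym.make('MyEnv-v0')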
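Once `gym.make` has returned an environment, an agent talks to it only through `reset` and `step`. The loop below is a minimal sketch of that interaction. `CountdownEnv` and `run_episode` are hypothetical names, and the stand-in environment exists only so the example runs without gym or gym_foo installed; with the real package you would pass the object returned by `gym.make` instead. The loop assumes the four-value `step` return used in this answer. Gym 0.26 and later return five values from `step` and an `(observation, info)` pair from `reset`, so the unpacking would need adjusting there.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in for the object returned by gym.make(...)."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.remaining = horizon

    def reset(self):
        self.remaining = self.horizon
        return self.remaining

    def step(self, action):
        # Reward 1.0 for action 1, nothing for action 0; the episode
        # ends after `horizon` steps.
        self.remaining -= 1
        reward = 1.0 if action == 1 else 0.0
        return self.remaining, reward, self.remaining == 0, {}


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total_reward, steps_taken)."""
    ob = env.reset()
    total = 0.0
    for t in range(max_steps):
        action = policy(ob)
        ob, reward, episode_over, info = env.step(action)
        total += reward
        if episode_over:
            return total, t + 1
    return total, max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    total, steps = run_episode(CountdownEnv(), lambda ob: rng.choice([0, 1]))
    print("steps:", steps, "total reward:", total)
```

The `max_steps` cap is a safety net for environments that never report `episode_over`.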

Examples

https://github.com/openai/gym-soccer
https://github.com/openai/gym-wikinav
https://github.com/alibaba/gym-starcraft
https://github.com/endgameinc/gym-malware
https://github.com/hackthemarket/gym-trading
https://github.com/tambetm/gym-minecraft
https://github.com/ppaquette/gym-doom
https://github.com/ppaquette/gym-super-mario
https://github.com/tuzzer/gym-maze

