
How to use AI to play Sonic the Hedgehog. It's NEAT!

by Vedant Gupta

Generation after generation, humans have adapted to fit their surroundings better. We started off as primates living in a world of eat or be eaten. Eventually we evolved into who we are today, reflecting modern society. Through the process of evolution we became smarter, better able to work with our environment and accomplish what we need to.


The concept of learning through evolution can also be applied to Artificial Intelligence. We can train AIs to perform certain tasks using NEAT, NeuroEvolution of Augmenting Topologies. Simply put, NEAT is an algorithm that takes a batch of AIs (genomes) attempting to accomplish a given task. The top-performing AIs “breed” to create the next generation. This process continues until we have a generation capable of completing what it needs to.


NEAT is amazing because it eliminates the need for pre-existing training data. Using the power of NEAT and OpenAI's Gym Retro, I trained an AI to play Sonic the Hedgehog for the SEGA Genesis. Let's learn how!


A NEAT Neural Network (Python Implementation)

GitHub Repository

Vedant-Gupta523/sonicNEAT


Understanding OpenAI Gym

If you are not already familiar with OpenAI Gym, look through the terms below; they are used frequently throughout the article, and the short sketch after the list shows where each one appears in the Gym Retro API.


agent — The AI player. In this case it will be Sonic.

environment — The complete surroundings of the agent; the game environment.

action — Something the agent has the option of doing (e.g. move left, move right, jump, do nothing).

step — Performing one action.

state — A frame of the environment; the current situation the AI is in.

observation — What the AI observes from the environment.

fitness — How well our AI is performing.

done — When the AI has completed its task or can't continue any further.
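To ground these terms in code, here is a minimal sketch (my addition, not from the original article) of how they map onto the Gym Retro API, assuming the Sonic ROM is already set up as described later on:

import retro

# The environment wraps the game; the agent (Sonic) acts inside it.
env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")

ob = env.reset()                          # observation of the initial state
action = env.action_space.sample()        # one random action
ob, rew, done, info = env.step(action)    # one step: new observation, reward, done flag, extra info
env.close()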

Installing Dependencies

Below are GitHub links for OpenAI and NEAT with installation instructions.


OpenAI: https://github.com/openai/retro

NEAT: https://github.com/CodeReclaimers/neat-python

Pip install libraries such as cv2, numpy, etc. (pickle already ships with Python's standard library).

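Assuming the usual PyPI package names (opencv-python provides cv2), the installs would look something like:

pip install opencv-python numpy neat-python gym-retro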

Import libraries and set environment

To start, we need to import all of the modules we will use:


import retro
import numpy as np
import cv2
import neat
import pickle

We will also define our environment, consisting of the game and the state:


env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")

In order to train an AI to play Sonic the Hedgehog, you will need the game's ROM (game file). The simplest way to get it is by purchasing the game off of Steam for $5. You can also find free downloads of the ROM online; however, that is illegal, so don't do it.


In the OpenAI repository at retro/retro/data/stable/ you will find a folder for Sonic the Hedgehog Genesis. Place the game's ROM here and make sure it is called rom.md. This folder also contains .state files. You can choose one and set the state parameter equal to it. I chose GreenHillZone Act 1 since it is the very first level of the game.

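As a side note of my own (not from the original article), gym-retro also ships a ROM importer (python -m retro.import, pointed at a folder of ROMs). Since retro.make raises an error when the ROM is missing, simply constructing the environment confirms everything is in place:

import retro

# This fails if rom.md is missing from the integration folder.
env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
print(env.observation_space.shape)  # e.g. (224, 320, 3) for Genesis titles
env.close()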

Understanding data.json and scenario.json

In the Sonic the Hedgehog folder you will have these two files:


data.json


{ "info": { "act": {"address": 16776721,"type": "|u1" }, "level_end_bonus": {"address": 16775126,"type": "|u1" }, "lives": {"address": 16776722,"type": "|u1" }, "rings": {"address": 16776736,"type": ">u2" }, "score": {"address": 16776742,"type": ">u4" }, "screen_x": {"address": 16774912,"type": ">u2" }, "screen_x_end": {"address": 16774954,"type": ">u2" }, "screen_y": {"address": 16774916,"type": ">u2" }, "x": {"address": 16764936,"type": ">i2" }, "y": {"address": 16764940,"type": ">u2" }, "zone": {"address": 16776720,"type": "|u1" } }}

scenario.json


{ "done": { "variables": {"lives": { "op": "zero"} } }, "reward": { "variables": {"x": { "reward": 10.0} } }}

Both these files contain important information pertaining to the game and its training.


As the name suggests, the data.json file contains information/data on different game-specific variables (e.g. Sonic's x-position, the number of lives he has, etc.).


The scenario.json file allows us to perform actions in sync with the values of the data variables. For example, we can reward Sonic 10.0 every time his x-position increases. We could also set our done condition to true when Sonic's lives hit 0.

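As an illustration (a hypothetical variant of my own, not a file from the repository), scenario.json could be extended to also reward ring collection; gym-retro scales the change in each listed variable by its reward coefficient:

{
  "done": {
    "variables": {
      "lives": { "op": "zero" }
    }
  },
  "reward": {
    "variables": {
      "x": { "reward": 10.0 },
      "rings": { "reward": 1.0 }
    }
  }
}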

Understanding NEAT feedforward configuration

The config-feedforward file can be found in my GitHub repository linked above. It acts like a settings menu to set up our training. To point out a few simple settings:


fitness_threshold = 10000  # How fit we want Sonic to become
pop_size          = 20     # How many Sonics per generation
num_inputs        = 1120   # Number of inputs into our model
num_outputs       = 12     # 12 buttons on the Genesis controller

There are tons of settings you can experiment with to see how they affect your AI's training! To learn more about NEAT and the different settings in the feedforward configuration, I would highly recommend reading the documentation here.

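For orientation, these keys live in named sections of the neat-python config file. A heavily trimmed illustration follows (the real config-feedforward in the repository defines many more required keys):

[NEAT]
fitness_criterion    = max
fitness_threshold    = 10000
pop_size             = 20

[DefaultGenome]
num_inputs           = 1120
num_hidden           = 0
num_outputs          = 12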

Putting it all together: Creating the Training File

Setting up configuration


Our feedforward configuration is defined and stored in the variable config.


config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')

Creating a function to evaluate each genome


We start by creating the function eval_genomes, which will evaluate our genomes (a genome can be thought of as one Sonic in a population of Sonics). For each genome we reset the environment and sample a random action:


for genome_id, genome in genomes:
    ob = env.reset()
    ac = env.action_space.sample()

We will also record the game environment's length, width, and color. We divide the length and width by 8 to shrink the observation down to a manageable number of inputs for the network.


inx, iny, inc = env.observation_space.shape
inx = int(inx/8)
iny = int(iny/8)

We create a recurrent neural network (RNN) using the NEAT library and input the genome and our chosen configuration.


net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)

Finally, we define a few variables: current_max_fitness (the highest fitness in the current population), fitness_current (the current fitness of the genome), frame (the frame count), counter (to count the number of steps our agent takes), xpos (the x-position of Sonic), and done (whether or not we have reached our fitness goal).


current_max_fitness = 0
fitness_current = 0
frame = 0
counter = 0
xpos = 0
done = False

While we have not met our done requirement, we need to run the environment, increment our frame counter, and resize and reshape our observation into a form the network can use (still for each genome).


env.render()
frame += 1
ob = cv2.resize(ob, (inx, iny))
ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
ob = np.reshape(ob, (inx, iny))

We will take our observation and put it in a one-dimensional array, so that our RNN can understand it. We receive our output by feeding this array to our RNN.


imgarray = np.ndarray.flatten(ob)
nnOutput = net.activate(imgarray)

Using the output from the RNN our AI takes a step. From this step we can extract fresh information: a new observation, a reward, whether or not we have reached our done requirement, and information on variables in our data.json (info).


ob, rew, done, info = env.step(nnOutput)

At this point we need to evaluate our genome’s fitness and whether or not it has met the done requirement.


We look at our “x” variable from data.json and check if it has surpassed the length of the level. If it has, we will increase our fitness by our fitness threshold, signifying we are done.


xpos = info['x']
if xpos >= 10000:
    fitness_current += 10000
    done = True

Otherwise, we will increase our current fitness by the reward we earned from performing the step. We also check if we have a new highest fitness and adjust the value of our current_max_fitness accordingly.


fitness_current += rew

if fitness_current > current_max_fitness:
    current_max_fitness = fitness_current
    counter = 0
else:
    counter += 1

Lastly, we check whether we are done or whether our genome has gone 250 steps without improving its fitness. If so, we print information on the genome that was simulated. Otherwise we keep looping until one of the two requirements has been satisfied.


if done or counter == 250:
    done = True
    print(genome_id, fitness_current)

genome.fitness = fitness_current
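Assembled from the fragments above, the complete evaluation function looks roughly like this (a sketch for readability; the authoritative version is the Training.py file in my GitHub repository):

def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        ob = env.reset()
        ac = env.action_space.sample()  # a random action, as in the fragment above

        # Downscaled frame dimensions for the network input.
        inx, iny, inc = env.observation_space.shape
        inx = int(inx/8)
        iny = int(iny/8)

        net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)

        current_max_fitness = 0
        fitness_current = 0
        frame = 0
        counter = 0
        xpos = 0
        done = False

        while not done:
            env.render()
            frame += 1

            # Shrink, grayscale, and flatten the frame for the network.
            ob = cv2.resize(ob, (inx, iny))
            ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
            ob = np.reshape(ob, (inx, iny))
            imgarray = np.ndarray.flatten(ob)

            nnOutput = net.activate(imgarray)
            ob, rew, done, info = env.step(nnOutput)

            # Large bonus (and done) for reaching the end of the level.
            xpos = info['x']
            if xpos >= 10000:
                fitness_current += 10000
                done = True

            fitness_current += rew
            if fitness_current > current_max_fitness:
                current_max_fitness = fitness_current
                counter = 0
            else:
                counter += 1

            # Stop after 250 steps without a fitness improvement.
            if done or counter == 250:
                done = True
                print(genome_id, fitness_current)

        genome.fitness = fitness_current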

Defining the population, printing training stats, and more


The very last thing we need to do is define our population, print out statistics from our training, save checkpoints (in case you want to pause and resume training), and pickle our winning genome.


p = neat.Population(config)

p.add_reporter(neat.StdOutReporter(True))
stats = neat.StatisticsReporter()
p.add_reporter(stats)
p.add_reporter(neat.Checkpointer(1))

winner = p.run(eval_genomes)

with open('winner.pkl', 'wb') as output:
    pickle.dump(winner, output, 1)

All that’s left is the matter of running the program and watching Sonic slowly learn how to beat the level!


To see all of the code put together check out the Training.py file in my GitHub repository.

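One detail the article leaves implicit: to watch the trained genome afterwards, the pickle can be loaded back in. A minimal sketch of my own, assuming the same env and config objects as above:

with open('winner.pkl', 'rb') as f:
    winner = pickle.load(f)

net = neat.nn.recurrent.RecurrentNetwork.create(winner, config)

inx, iny, inc = env.observation_space.shape
inx, iny = int(inx/8), int(iny/8)

ob = env.reset()
done = False
while not done:
    env.render()
    ob = cv2.resize(ob, (inx, iny))            # same preprocessing as training
    ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
    ob, rew, done, info = env.step(net.activate(np.ndarray.flatten(ob)))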

Bonus: Parallel Training

If you have a multi-core CPU you can run multiple training simulations at once, dramatically increasing the rate at which you can train your AI! Although I will not go through the specifics in this article, I highly suggest you check the sonicTraning.py implementation in my GitHub repository.

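For reference, neat-python ships a ParallelEvaluator that farms genome evaluation out to worker processes. A minimal sketch of the idea follows (my own, not a reproduction of sonicTraning.py; eval_genome is a hypothetical per-genome version of the loop above that returns a fitness):

import multiprocessing

import neat

def eval_genome(genome, config):
    # A real implementation would create its own retro environment here
    # (environments cannot be shared across processes) and run the same
    # game loop as eval_genomes above, returning the final fitness.
    net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)
    return 0.0  # placeholder fitness

if __name__ == '__main__':
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         'config-feedforward')
    p = neat.Population(config)
    pe = neat.ParallelEvaluator(multiprocessing.cpu_count(), eval_genome)
    winner = p.run(pe.evaluate)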

Conclusion

That's all there is to it! With a few adjustments, this framework is applicable to any game for the NES, SNES, SEGA Genesis, and more. If you have any questions or you just want to say hello, feel free to email me at vedantgupta523[at]gmail[dot]com.


Key Takeaways

NeuroEvolution of Augmenting Topologies (NEAT) is an algorithm used to train AI to perform certain tasks. It is modeled after genetic evolution.


NEAT eliminates the need for pre-existing data when training AI.


The process of implementing OpenAI and NEAT using Python can be used to train an AI to play any game.


Translated from: /news/how-to-use-ai-to-play-sonic-the-hedgehog-its-neat-9d862a2aef98/
