Building a Genetic Algorithm for Neural Networks and Neural Networks for Graphical Games Using Python and NumPy

Hello, Habr!

Today I will show how to build a genetic algorithm (GA) for a neural network so that it can play different games. I tested it on Pong and Flappy Bird, and it performed very well. If you haven’t read it yet, I recommend the first article, “Creating a Simple and Workable Genetic Algorithm for a Neural Network with Python and NumPy”, because this article builds on a refined version of that code.

I split the code into two scripts: in one, the neural network plays the game; in the other, it learns and makes decisions (the genetic algorithm itself). The game code is a function that returns a fitness value, which is used to rank the neural networks (for example, by how long one lasted or how many points it scored). The code for the two games will therefore come at the end of the article. The genetic algorithms for Pong and Flappy Bird differ only in their parameters.
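The interface between the two scripts can be sketched like this. The names here are illustrative stand-ins, not the real modules (the real game code is in the links at the end):

```python
# Illustrative sketch of the two-script split (hypothetical names):
# the game script exposes one function that runs a playthrough for a
# given network and hands back a fitness value, which the GA script
# then uses to rank the population.

def play_game_sketch(network):
    """Stand-in for the real game loop: survive while the network
    keeps returning the ball, then report how long it lasted."""
    time_survived = 0
    while time_survived < 50 and network(time_survived):
        time_survived += 1
    return time_survived  # the fitness value

# The GA script's side: collect fitness for a toy "population".
population = [lambda t: True, lambda t: t < 20]
fitness = {i: play_game_sketch(net) for i, net in enumerate(population)}
```

The GA never needs to know anything about the game's internals; it only sees the fitness value each playthrough returns.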

Starting from the script I wrote and described in the previous article, I created a heavily modified genetic algorithm for Pong. I will describe it in the most detail, because it is what I built on when creating the GA for Flappy Bird.

First we import the modules and set up the lists and variables:

import numpy as np
import random
import ANNPong as anp
import pygame as pg
import sys
from pygame.locals import *
pg.init()
listNet = {}
NewNet = []
goodNet = []
timeNN = 0
moveRight = False
moveLeft = False
epoch = 0
mainClock = pg.time.Clock()
WINDOWWIDTH = 800
WINDOWHEIGHT = 500
windowSurface = pg.display.set_mode((WINDOWWIDTH, WINDOWHEIGHT), 0, 32)
pg.display.set_caption('ANN Pong')

ANNPong is the script containing the game

listNet, NewNet, goodNet – containers for the neural networks (note that listNet is actually a dictionary mapping a network to its fitness; more on these below)

timeNN – the fitness value returned by the game

moveRight, moveLeft – the neural network’s choice of where to move

epoch – epoch counter

def sigmoid(x):
    return 1/(1 + np.exp(-x))

class Network():
    def __init__(self):
        self.H1 = np.random.randn(6, 12)
        self.H2 = np.random.randn(12, 6)
        self.O1 = np.random.randn(6, 3)
        self.BH1 = np.random.randn(12)
        self.BH2 = np.random.randn(6)
        self.BO1 = np.random.randn(3)
        self.epoch = 0

    def predict(self, x, first, second):
        nas = x @ self.H1 + self.BH1
        nas = sigmoid(nas)
        nas = nas @ self.H2 + self.BH2
        nas = sigmoid(nas)
        nas = nas @ self.O1  + self.BO1
        nas = sigmoid(nas)
        if nas[0] > nas[1] and nas[0] > nas[2]:
            first = True
            second = False
            return first, second
        elif nas[1] > nas[0] and nas[1] > nas[2]:
            first = False
            second = True
            return first, second
        elif nas[2] > nas[0] and nas[2] > nas[1]:
            first = False
            second = False
            return first, second
        else:
            first = False
            second = False
            return first, second

class Network1():
    def __init__(self, H1, H2, O1, BH1, BH2, BO1, ep):
        self.H1 = H1
        self.H2 = H2
        self.O1 = O1
        self.BH1 = BH1
        self.BH2 = BH2
        self.BO1 = BO1
        self.epoch = ep

    def predict(self, x, first, second):
        nas = x @ self.H1 + self.BH1
        nas = sigmoid(nas)
        nas = nas @ self.H2 + self.BH2
        nas = sigmoid(nas)
        nas = nas @ self.O1 + self.BO1
        nas = sigmoid(nas)
        if nas[0] > nas[1] and nas[0] > nas[2]:
            first = True
            second = False
            return first, second
        elif nas[1] > nas[0] and nas[1] > nas[2]:
            first = False
            second = True
            return first, second
        elif nas[2] > nas[0] and nas[2] > nas[1]:
            first = False
            second = False
            return first, second
        else:
            first = False
            second = False
            return first, second

The sigmoid is used as the activation function.

In the Network class we define the parameters (weights) of the neural network; the predict method tells the game where to move (nas is short for “network answer”). The epoch attribute records the generation in which a network appeared: it is 0 for the random generation zero, while the Network1 class takes it as a constructor argument.
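A side note: the if-chain in predict is effectively an argmax over the three outputs (on an exact tie the original returns False, False, which argmax does not reproduce, but float ties are vanishingly rare). A compact equivalent sketch, assuming the same layer shapes as the article:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Same layer shapes as the article's Network class; the generator and
# seed exist only to make this sketch reproducible.
rng = np.random.default_rng(0)
H1, BH1 = rng.standard_normal((6, 12)), rng.standard_normal(12)
H2, BH2 = rng.standard_normal((12, 6)), rng.standard_normal(6)
O1, BO1 = rng.standard_normal((6, 3)), rng.standard_normal(3)

def predict(x):
    nas = sigmoid(x @ H1 + BH1)
    nas = sigmoid(nas @ H2 + BH2)
    nas = sigmoid(nas @ O1 + BO1)
    best = int(np.argmax(nas))     # index of the strongest output
    return best == 0, best == 1    # (first, second); (False, False) = stay

first, second = predict(rng.standard_normal(6))
```

At most one of the two flags can be True, which matches the original if-chain.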

# Generation zero: 1000 networks with random weights.
for _ in range(1000):
    net = Network()
    timeNN = anp.NNPong(net)       # fitness: how long it survived
    listNet[net] = timeNN

# Sort by fitness (ascending) and keep ten networks.
listNet = dict(sorted(listNet.items(), key=lambda item: item[1]))
NewNet = list(listNet.keys())[:10]
listNet = {}
goodNet = NewNet

# Show the survivors playing, one after another.
print(str(epoch) + " epoch")
for i, net in enumerate(NewNet):
    anp.NPong(net)
    print(net.epoch)
    print('next' if i < 9 else 'that is all')

Here we run the neural networks with randomly generated weights and select ten of them. Note that this first sort is ascending, so we actually keep the ten worst ones, so that the entire work of educating them falls to the genetic algorithm ))). Then we show them playing.

More details:

The fitness value returned by the game code is stored in timeNN; each network and its timeNN are then added to listNet. After the loop we sort the dictionary, copy the networks from listNet into NewNet, and truncate the list to ten.
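On toy data, the sort-and-truncate step looks like this (a standalone sketch, with strings standing in for networks):

```python
# Standalone sketch of the selection step: each "network" (here just a
# string) maps to a fitness value; sort by fitness and keep ten.
listNet = {f"net{i}": fit for i, fit in
           enumerate([5, 42, 17, 99, 3, 61, 28, 70, 11, 84, 50, 9])}
# reverse=True keeps the best ten; the article's very first pass sorts
# ascending and therefore keeps the worst ten on purpose.
ranked = dict(sorted(listNet.items(), key=lambda item: item[1], reverse=True))
NewNet = list(ranked.keys())[:10]
```

Only the two lowest scorers (fitness 5 and 3) are dropped; everything else survives into the breeding pool.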

for g in range(990):
    parent1 = random.choice(NewNet)
    parent2 = random.choice(NewNet)
    ch1H = np.vstack((parent1.H1[:3], parent2.H1[3:])) * random.uniform(-2, 2)
    ch2H = np.vstack((parent1.H2[:6], parent2.H2[6:])) * random.uniform(-2, 2)
    ch1O = np.vstack((parent1.O1[:3], parent2.O1[3:])) * random.uniform(-2, 2)
    chB1 = parent1.BH1 * random.uniform(-2, 2)
    chB2 = parent2.BH2 * random.uniform(-2, 2)
    chB3 = parent2.BO1 * random.uniform(-2, 2)
    child = Network1(ch1H, ch2H, ch1O, chB1, chB2, chB3, 1)
    goodNet.append(child)
NewNet = []

This is where crossover and mutation take place (both were described in more detail in the first article).
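On toy matrices the crossover is easy to see: the child takes the top rows from one parent and the bottom rows from the other, and the result is then scaled by one shared random factor (the article’s mutation):

```python
import numpy as np
import random

random.seed(1)  # only so this sketch is reproducible
# Two toy 6x12 parent weight matrices with recognizable values.
parent1_H1 = np.zeros((6, 12))   # parent 1: all zeros
parent2_H1 = np.ones((6, 12))    # parent 2: all ones

# Crossover: first three rows from parent 1, last three from parent 2.
child_H1 = np.vstack((parent1_H1[:3], parent2_H1[3:]))
# Mutation as in the article: scale the whole matrix by one random
# factor drawn from [-2, 2].
factor = random.uniform(-2, 2)
child_H1 = child_H1 * factor
```

The zero rows stay zero after scaling, so you can see exactly which rows came from which parent.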

while True:
    epoch += 1
    print(str(epoch) + " epoch")
    for s in goodNet:
        timeNN = anp.NNPong(s)
        listNet[s] = timeNN
    # Sort best-first and keep every network that ties with the leader.
    listNet = dict(sorted(listNet.items(), key=lambda item: item[1], reverse=True))
    best = list(listNet.values())[0]
    NewNet = [net for net, fit in listNet.items() if fit == best]
    goodNet = list(NewNet)
    listNet = {}
    # Show up to eight of the leaders; there may be fewer than that.
    try:
        for i in range(8):
            print(NewNet[i].epoch)
            anp.NPong(NewNet[i])
            print('next')
    except IndexError:
        print('that is all')

    # Breed children from the leaders to refill the population.
    for _ in range(1000 - len(NewNet)):
        parent1 = random.choice(NewNet)
        parent2 = random.choice(NewNet)
        ch1H = np.vstack((parent1.H1[:3], parent2.H1[3:])) * random.uniform(-2, 2)
        ch2H = np.vstack((parent1.H2[:6], parent2.H2[6:])) * random.uniform(-2, 2)
        ch1O = np.vstack((parent1.O1[:3], parent2.O1[3:])) * random.uniform(-2, 2)
        chB1 = parent1.BH1 * random.uniform(-2, 2)
        chB2 = parent2.BH2 * random.uniform(-2, 2)
        chB3 = parent2.BO1 * random.uniform(-2, 2)
        child = Network1(ch1H, ch2H, ch1O, chB1, chB2, chB3, epoch)
        goodNet.append(child)
    print(len(NewNet))
    print(len(goodNet))
    NewNet = []

Much of this repeats the earlier code, so I will explain only what is new:

Here we take the first network on the list, i.e. one of the best of the epoch, and compare its result with the rest, because very often several AIs achieve exactly the same score. All of these tied leaders take part in breeding. We use a try/except block because there may be fewer than ten leaders in an epoch. We also carry these networks unchanged into the next epoch, because the descendants may turn out worse than their ancestors; this keeps the population from degrading.

That’s all for the first code!

Now let’s move on to the game code. Here I will explain only what relates to AI training (I will post everything with a link to the drive).

In the Pong game, each neural network played twice: the first time the ball bounces to the left, the second time to the right.

*whGo is a variable in the game code (short for “where to go”).

We return the elapsed time as the fitness value. The game has two nearly identical functions; the second one additionally draws everything on screen, which lets us watch the progress after each epoch. The game counts as passed when the network survives for more than 8000 updates in the first function.
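The game-side function itself is in the linked code, but its shape is roughly this (a hedged sketch with made-up names; only the 8000-update cap comes from the article, the physics is stubbed out as a callback):

```python
# Hedged sketch of the game-side fitness function. The real NNPong
# runs actual ball/paddle physics; here a callback stands in for
# "did the paddle return the ball on this update?".
def NNPong_sketch(returns_ball):
    timeNN = 0
    while timeNN <= 8000:        # passed the game once we exceed 8000
        timeNN += 1
        if not returns_ball(timeNN):
            break                # the paddle missed: game over
    return timeNN                # survival time is the fitness

perfect = NNPong_sketch(lambda t: True)   # never misses
clumsy = NNPong_sketch(lambda t: t < 5)   # misses on update 5
```

A network that never misses maxes out the counter, while one that misses early gets a fitness equal to how long it lasted.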

After months of work and tweaks, I managed to create a learning algorithm for Pong, but to be sure, I decided to test the AI not on my own game but on someone else’s (an “omnivorousness” test)))). I chose a pygame Flappy Bird from this video: https://youtu.be/7IqrZb0Sotw?feature=shared

I slightly adapted the game for the neural network. For example, I added variables for the distances from the bird to the pipes. There are three values for each of up to three pipe pairs on screen (the height of each pipe in y plus the distance in x), nine in total. After a collision the game function is restarted, and a third parameter, which I called rep, tells the function which restart it is: when rep equals three, the game returns the fitness value to the genetic algorithm, and when it is zero, the time variable is reset to 0. Also, instead of writing two nearly identical functions, I simply check whether the checkNN variable is True to decide if the screen needs to be refreshed.
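The restart logic described above can be sketched as follows (hypothetical names; the real function is in the linked code). Each network gets three runs, rep counts the restarts, and only after the third crash is the accumulated fitness handed back to the GA:

```python
# Sketch of the rep-counting restarts: rep == 0 resets the clock,
# rep == 3 returns the accumulated fitness to the genetic algorithm.
def flappy_fitness_sketch(run_once, rep=0, total=0):
    if rep == 0:
        total = 0                 # first run: reset the timer
    if rep == 3:
        return total              # third restart: report fitness
    total += run_once(rep)        # stub for one run ending in a crash
    return flappy_fitness_sketch(run_once, rep + 1, total)
```

Averaging over three runs like this makes the fitness less noisy than a single lucky or unlucky playthrough.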

I also refined the training code:

while True:
    for event in pg.event.get():
        if event.type == KEYDOWN:
            if event.key == K_1:
                showNN = True
    epoch += 1
    print(str(epoch) + " epoch")
    # The last argument switches what the game returns as fitness:
    # 0 before the tenth epoch, 1 from the tenth epoch on.
    varRe = 0 if epoch < 10 else 1
    for s in goodNet:
        timeNN = anp.NPong(s, False, 0, varRe)
        listNet[s] = timeNN

After the tenth epoch, thanks to the last parameter, which we switch to one (in the game code I called this parameter varRe, short for “variant of return”), the game returns not the time but the number of pipes passed before the collision; the neural network learns better this way.
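The switch itself boils down to this (a sketch; the varRe name comes from the article, the function around it is mine):

```python
# Sketch of the varRe switch: before epoch 10 the fitness is survival
# time, afterwards it is the number of pipes passed.
def fitness_by_varRe(time_survived, pipes_passed, varRe):
    if varRe == 0:
        return time_survived     # early epochs: reward staying alive
    return pipes_passed          # later epochs: reward actual progress
```

Rewarding raw survival first gets the population off the ground; rewarding pipes later pushes it toward actually playing the game.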

howALot = 1000 - len(NewNet)
if howALot < 40:
    howALot = 40

These three lines are needed for the case when, in the previous epoch, very many AIs tied for the same result; without a floor of 40 new children, the algorithm could stop learning because it would have nothing left to learn from :-).
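In other words, the number of children to breed is floored at 40; wrapped in a hypothetical helper for a quick check:

```python
# The floor keeps breeding going even when nearly the whole population
# of 1000 tied for the best score and survived into the next epoch.
def children_to_breed(survivors, population=1000, floor=40):
    howALot = population - survivors
    return max(howALot, floor)
```

With 990 tied survivors we still breed 40 children instead of 10, so the search keeps exploring.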

That’s all! If you have any questions, write them in the comments. Bye for now!

https://drive.google.com/drive/folders/1OlUYoUV3oBaTEvfPkeAEG_Exmv8rj_yA?usp=sharing – Pong Network

https://drive.google.com/drive/folders/13Ca8u0fxOlZbQaz2Nj606gYvLpnT316i?usp=share_link – Flappy bird Network
