A simple neural network in C++
Preface
Hello everyone!
This article was written as a reflection on a piece of laboratory work. Since the emphasis was on building a working neural network, the formulas are given without derivation. If you are interested in the mathematical apparatus, I studied it from this article.
Any fair criticism of the code in the comments is welcome.
The task
The problem sounds something like this: given a 7×7 picture as input, determine which shape is drawn in it: a circle, a square, or a triangle.
We will leave aside the question of the shape's position within the picture: in our input data the shapes are always centered. However, the images may contain noise, which makes the problem harder.
Theoretical component
First, let’s define the main notations.
$o_i^k$ – the output of neuron $i$ of layer $k$
$n_{k-1}$ – the number of neurons in layer $k-1$
$e_i$ – the expected value of output $i$ of the network
$L$ – the number of the last layer of the network
$w_{i,j}^k$ – the weight of the corresponding connection, where $k$ is the layer number, $i$ is the neuron number, and $j$ is the number of the input from the previous layer
To make the notation for the neuron outputs and weights easier to grasp, here is a simple picture.
Each value of a neuron in the output layer corresponds to the probability that the corresponding shape is drawn in the image. Since we have 3 possible shapes, there are 3 output neurons. The input is a 7×7 picture, so there are 49 inputs, one per pixel of the image.
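To make the input format concrete, here is one way a noise-free square could be encoded. The exact pixel values are illustrative, since the article does not fix the encoding:

#include <vector>
using std::vector;

// An illustrative noise-free 7x7 "square" sample:
// 1 = filled pixel, 0 = background
vector<vector<double>> square = {
    {0, 0, 0, 0, 0, 0, 0},
    {0, 1, 1, 1, 1, 1, 0},
    {0, 1, 0, 0, 0, 1, 0},
    {0, 1, 0, 0, 0, 1, 0},
    {0, 1, 0, 0, 0, 1, 0},
    {0, 1, 1, 1, 1, 1, 0},
    {0, 0, 0, 0, 0, 0, 0},
};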
The logic of the network is very simple – we need to calculate the value of all neurons.
The values of the neurons are computed by the following formula:

$$o_i^k = f\left(\sum_{j=1}^{n_{k-1}} w_{i,j}^k \cdot o_j^{k-1}\right)$$

Let me explain this formula: we take all the neurons of the previous layer, multiply each of them by the weight of the connection leading from that neuron to ours, and sum the results. We then pass the resulting sum through the activation function, which is needed so that the neuron values do not exceed 1. In our case, we will use the sigmoid as the activation function:

$$f(x) = \frac{1}{1 + e^{-x}}$$
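Since the NeuralNet class shown later declares an activation member for exactly this purpose, a plausible one-line implementation in C++ would be:

#include <cmath>

// Sigmoid activation: squashes any real input into the interval (0, 1)
double NeuralNet::activation(double x)
{
    return 1.0 / (1.0 + std::exp(-x));
}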
So, by going through all the layers, we have computed the output values of the neurons, but naturally they did not match the probabilities we expected. What do we do? Adjust the weights so that the next time the same image is run, the network gives a slightly more accurate result.
To adjust the weights, we use the following formula:

$$w_{i,j}^k \leftarrow w_{i,j}^k + \alpha \cdot \delta_i^k \cdot o_j^{k-1}$$

Here $\alpha$ is the learning rate: the closer the actual outputs are to the expected ones, the smaller $\alpha$ should be made.
As you may have noticed, the formula introduces a new quantity, $\delta$, which is computed by the following rule.
For the output layer:

$$\delta_i^L = \left(e_i - o_i^L\right) \cdot o_i^L \left(1 - o_i^L\right)$$

For all the other layers, the formula relies on the deltas of the layer that follows (already computed, since we move backwards):

$$\delta_i^k = o_i^k \left(1 - o_i^k\right) \sum_{j=1}^{n_{k+1}} \delta_j^{k+1} \, w_{j,i}^{k+1}$$

In general, the algorithm for computing the deltas is about the same as for computing the neurons, only with different formulas and in the opposite direction, from the output layer back to the input.
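The factor $o\left(1 - o\right)$ appearing in both delta formulas is just the derivative of the sigmoid, which conveniently expresses itself through its own value, so it can be computed from the already known neuron outputs:

$$f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = f(x)\left(1 - f(x)\right)$$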
Practical component
Finally, let’s get down to writing the code!
We will store all the functionality of the neural network in the NeuralNet class. To begin with, we will write its interface, and then we will delve into the implementation.
#include <algorithm> // std::max_element
#include <cmath>     // std::exp, std::abs
#include <iterator>  // std::distance
#include <vector>

using namespace std;

class NeuralNet
{
public:
    void set_input(vector<vector<double>> input); // Feed a picture into the network
    void set_expected(vector<double> expected);   // Feed the expected output values into the network
    void train(void);                             // Run one training iteration
    size_t apply(void);                           // Run the network without training (to validate the results)
private:
    double activation(double x);  // Activation function
    double err(void);             // Network error on the current sample
    void count_neural_net(void);  // Compute the neuron values
    void clear_neural_net(void);  // Reset the neuron values to zero
    void recal_alpha(void);       // Update the learning rate depending on the error
    void adj_weight(void);        // Adjust the weights

    size_t output_size; // Number of output neurons = 3
    size_t input_size;  // Side length of the input picture = 7
    size_t neuron_size; // Number of neurons in the hidden layers
    size_t layer_count; // Number of hidden layers
    double alpha;       // Learning rate

    vector<double> expected;               // Expected output values
    vector<vector<double>> layers;         // Layers of neurons
    vector<vector<double>> delta;          // Delta values for the corresponding layers
    vector<vector<vector<double>>> weight; // Network weights
};
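The bodies of the setters are not shown in the article. A plausible set_input, which simply flattens the 7×7 picture into the 49 input neurons, could look like this (an assumption, since the original omits it):

// Hypothetical set_input body -- the original listing omits it.
// Flattens the 7x7 picture row by row into the 49-neuron input layer
void NeuralNet::set_input(vector<vector<double>> input)
{
    for (size_t row = 0; row < input_size; row++)
        for (size_t col = 0; col < input_size; col++)
            layers[0][row * input_size + col] = input[row][col];
}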
Let’s start with the simple and move on to the more complex: first, the functions for training and validation:
void NeuralNet::train(void)
{
    clear_neural_net();  // Reset the neuron values
    count_neural_net();  // Compute the neuron values
    recal_alpha();       // Adjust the learning rate based on the result
    if (err() > MAX_ERR) // If the error is still large enough, adjust the weights
        adj_weight();
}
size_t NeuralNet::apply(void)
{
    clear_neural_net();
    count_neural_net();
    // Same as in training, but without adjusting the weights;
    // return the index of the most probable shape
    return distance(layers[layer_count + 1].begin(),
                    max_element(
                        layers[layer_count + 1].begin(),
                        layers[layer_count + 1].end()));
}
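Putting the interface together, a training loop might look like the sketch below. Sample, make_training_set, and EPOCH_COUNT are hypothetical names here, since the article leaves example generation to the reader:

// Hypothetical driver code: Sample, make_training_set() and EPOCH_COUNT
// are illustrative names, not part of the article's source
struct Sample { vector<vector<double>> pixels; vector<double> expected; };

const size_t EPOCH_COUNT = 100; // illustrative value

int main()
{
    vector<Sample> training_set = make_training_set(); // assumed helper
    NeuralNet net;
    for (size_t epoch = 0; epoch < EPOCH_COUNT; epoch++)
        for (const Sample& s : training_set)
        {
            net.set_input(s.pixels);      // the 7x7 picture
            net.set_expected(s.expected); // the 3 expected outputs
            net.train();                  // one training iteration
        }
}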
Let’s analyze the recalculation of the learning rate:
void NeuralNet::recal_alpha(void)
{
    double e = err();                             // get the error
    double rel_e = 2 * std::abs(e) / output_size; // get the average error value
    // Map the error onto the allowed range of alpha values
    alpha = rel_e * (MAX_ALPHA - MIN_ALPHA) + MIN_ALPHA;
}
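Note that both train and recal_alpha call an err helper whose body never appears in the article. A minimal sketch, assuming the error is the summed signed difference between the expected and actual outputs (the exact scaling the author used is unknown):

// Hypothetical err() -- the original listing omits it. Here it is the
// summed signed difference between expected and actual outputs
double NeuralNet::err(void)
{
    double e = 0;
    for (size_t i = 0; i < output_size; i++)
        e += expected[i] - layers[layer_count + 1][i];
    return e;
}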
Next, let’s look at the calculation of neuron values:
void NeuralNet::count_neural_net(void)
{
    // Walk through all the layers in order
    for (size_t layer = 0; layer < layer_count + 1; layer++)
        // For each neuron of the next layer, accumulate the weighted sum
        // of the current layer's outputs and apply the activation function
        for (size_t neuron = 0; neuron < layers[layer + 1].size(); neuron++)
        {
            for (size_t input = 0; input < layers[layer].size(); input++)
                layers[layer + 1][neuron] += layers[layer][input] * weight[layer][neuron][input];
            layers[layer + 1][neuron] = activation(layers[layer + 1][neuron]);
        }
}
Well, the most difficult part is adjusting the weights; the function follows the delta formulas from the theoretical section:
void NeuralNet::adj_weight(void)
{
    // First compute all the deltas of the output layer,
    // so that we have something to start from
    for (size_t exp = 0; exp < output_size; exp++)
        delta[layer_count][exp] = (expected[exp] - layers[layer_count + 1][exp])
            * layers[layer_count + 1][exp] * (1 - layers[layer_count + 1][exp]);
    // Then walk backwards through the remaining layers
    for (int layer = (int)layer_count - 1; layer >= 0; layer--)
        // Iterate over all the neurons in the layer
        for (size_t neuron = 0; neuron < delta[layer].size(); neuron++)
        {
            double sum = 0; // weighted sum of the deltas of the next layer
            for (size_t next = 0; next < delta[layer + 1].size(); next++)
                sum += delta[layer + 1][next] * weight[layer + 1][next][neuron];
            delta[layer][neuron] = sum
                * layers[layer + 1][neuron] * (1 - layers[layer + 1][neuron]);
        }
    // Finally nudge every weight by alpha * delta * input value
    for (size_t layer = 0; layer < layer_count + 1; layer++)
        for (size_t neuron = 0; neuron < delta[layer].size(); neuron++)
            for (size_t input = 0; input < layers[layer].size(); input++)
                weight[layer][neuron][input] += alpha * delta[layer][neuron] * layers[layer][input];
}
Conclusion
It is not such a terrible task once you break it down into smaller ones.
The tasks of generating the examples and initializing the data were left aside, but they are fairly trivial, and I think everyone will be able to find a convenient way to solve them.
Below is a link to the source code, in which the number of neurons in the network, the number of layers, and the number of epochs are parameterized, so you can experiment and pick the most effective values. In my case, a network with a single hidden layer of 14 neurons turned out to work best.
If this article helped someone understand the topic, likes and subscriptions are welcome!
Source code