Neural Network and Multi-Layer Perceptron (MLP)


A neural network is a computing system (or processor) born from the idea of artificially simulating the behavior of the human brain in order to exploit its capacity for learning and generalization. More precisely, we can define a neural network as a parallel processor composed of individual computing units, called neurons, which has a natural predisposition to store experimentally acquired knowledge and make it available for use. Its similarity to the human brain lies in two respects: knowledge is acquired by the network through a learning process, and it is stored in the strengths of the connections between neurons, the so-called synaptic weights.

Neural networks therefore make it possible to generalize the knowledge acquired during the learning phase.

The idea underlying neural networks is the neuron: an information-processing unit modeled on the functioning of the human brain.

The neuron is a unit that receives as input a numerical value, given by a weighted sum of different signals, and processes it by activating or remaining inactive depending on whether its activation threshold is exceeded.

More specifically, the artificial neuron uses the input value as the argument of a function, called the activation function \(s()\), which returns as output the value 0, the value 1, or a value in the interval [0, 1].

Therefore, the neuron is characterized by its activation function and its activation threshold. The latter is usually introduced through a constant input equal to 1, modulated by a coefficient called the bias, whose effect is to shift the activation threshold with respect to the origin of the signals. Formally, the bias plays the same role as the weights, which act as regulators of the intensity of the emitted or received signals.
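
Putting the pieces together, and using illustrative symbols of our own (\(x_i\) for the inputs, \(w_i\) for the weights, \(b\) for the bias), the output of a single neuron can be sketched as

\[
y = s\left(\sum_{i=1}^{n} w_i x_i + b\right),
\]

where \(s()\) is the activation function described above.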

It is common practice to adopt different activation functions, depending on the role that the neuron and the neural network are intended to play. The most common activation functions are the step (Heaviside) function, the logistic sigmoid, the hyperbolic tangent, and the ReLU.

Now assume a simple neural network with two inputs and a single output.



To train this simple network for a binary classification problem, assume that the two input variables are \(X_1\) and \(X_2\) and that the output is a dichotomous variable \(Y\). Two processes, called forward propagation and back-propagation, must be adopted.

Therefore, in neural networks, we propagate forward to obtain the output and compare it with the real value to get the error; then, to minimize the error, we propagate backward, computing the derivative of the error with respect to each weight and subtracting this derivative, scaled by the learning rate, from the weight's value.
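
In symbols, with \(\eta\) denoting the learning rate and \(E\) the error (the notation is ours), each weight is updated as

\[
w \leftarrow w - \eta \, \frac{\partial E}{\partial w}.
\]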

The algorithm stops either when the minimum acceptable error is reached or when the maximum number of epochs established during the design phase is reached (to prevent the training process from running indefinitely). An epoch is a complete update cycle of all weights, feeding the entire training set as input.
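
The following is a minimal sketch of this training loop in Python with NumPy. The architecture (two inputs, three hidden sigmoid neurons, one sigmoid output), the XOR-style toy data, the learning rate, and the stopping thresholds are all illustrative assumptions, not prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training set: two input variables X1, X2 and a dichotomous output Y.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR, as an example task

# Illustrative architecture: 2 inputs -> 3 hidden neurons -> 1 output.
W1 = rng.normal(scale=1.0, size=(2, 3))  # input-to-hidden weights
b1 = np.zeros((1, 3))                    # hidden biases
W2 = rng.normal(scale=1.0, size=(3, 1))  # hidden-to-output weights
b2 = np.zeros((1, 1))                    # output bias

eta = 1.0           # learning rate (illustrative)
max_epochs = 20000  # maximum number of epochs, fixed at design time
min_error = 1e-3    # minimum acceptable error

for epoch in range(max_epochs):
    # Forward propagation: compute the output layer by layer.
    H = sigmoid(X @ W1 + b1)      # hidden activations
    Y_hat = sigmoid(H @ W2 + b2)  # network output

    # Compare with the real values to obtain the error (mean squared error).
    error = np.mean((Y - Y_hat) ** 2)
    if error < min_error:         # first stopping criterion
        break

    # Back-propagation: derivative of the error w.r.t. each weight,
    # using the fact that the sigmoid's derivative is s(z) * (1 - s(z)).
    d_out = (Y_hat - Y) * Y_hat * (1 - Y_hat)  # output-layer delta
    d_hid = (d_out @ W2.T) * H * (1 - H)       # hidden-layer delta

    # Subtract the gradient, scaled by the learning rate, from each weight.
    n = len(X)
    W2 -= eta * (H.T @ d_out) / n
    b2 -= eta * d_out.sum(axis=0, keepdims=True) / n
    W1 -= eta * (X.T @ d_hid) / n
    b1 -= eta * d_hid.sum(axis=0, keepdims=True) / n

print(f"stopped at epoch {epoch} with error {error:.5f}")
```

Note that both stopping criteria from the text appear in the loop: a minimum acceptable error and a maximum number of epochs, where each pass over the full training set counts as one epoch.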

Deep learning is the field of machine learning research based on multiple levels of representation, corresponding to hierarchies of features, factors, or concepts, where high-level concepts are defined in terms of lower-level ones. In other words, it refers to a set of techniques based on artificial neural networks organized in multiple layers, where each layer computes values for the next, so that information is processed in an increasingly complete manner.

A Multi-Layer Perceptron (MLP) network combines multiple processing levels using artificial neurons. It is composed of an input layer, one or more hidden layers, and an output layer. The layers are interconnected through nodes, or neurons, with each layer using the output of the previous layer as its input.
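
As a concrete sketch of such a layered network, scikit-learn's MLPClassifier can be used; the dataset, the hidden-layer sizes, and the other hyperparameters below are arbitrary illustrative choices, not values taken from the text:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative two-class dataset and train/test split.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The input layer is implicit in the data; two hidden layers of 16 and 8
# neurons feed the output layer, each layer consuming the previous
# layer's output.
clf = MLPClassifier(hidden_layer_sizes=(16, 8), activation="relu",
                    max_iter=1000, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```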



The idea behind this method is exactly the one explained above for the construction of a simple network, but perceptrons are used instead of plain neurons. The perceptron is a simple neuron equipped with a "teacher" circuit for learning: the teacher compares the output produced by the neuron with the desired output and corrects the weights in proportion to the error.
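
A common formal statement of this correction (the notation is ours) is the perceptron learning rule: given the desired output \(y\), the produced output \(\hat{y}\), and a learning rate \(\eta\), each weight is adjusted as

\[
w_i \leftarrow w_i + \eta \,(y - \hat{y})\, x_i .
\]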