Artificial neural network

A neural network (also called an ANN or an artificial neural network) is a kind of computer program inspired by biological neurons.^[1] Biological brains can solve difficult problems, but each neuron is only responsible for a very small part of the problem. In the same way, a neural network is made up of cells that work together to produce a desired result, and each cell only handles a small part of the work. This is one method for creating artificially intelligent programs.

Neural networks are an example of machine learning, where a program can change as it learns to solve a problem. A neural network can be trained and improved with each example, but the larger the neural network, the more examples it needs to perform well. In the case of deep learning, it often needs millions or billions of examples.

Overview

A neural network models a network of neurons, like those in the human brain. Each neuron does simple mathematical operations: it gets data from other neurons, modifies it and sends it to other neurons. Neurons are placed in "layers": a neuron from a layer receives data from the neurons of other layers, modifies it and sends data to the neurons of other layers. A neural network is made up of one or more layers.

The first layer is called the "input layer". It receives data from the outside world (for example, an image or text). The last layer is called the "output layer". The data from the neurons in the output layer is read and used as the output of the network. The layers in between are called the "hidden layers".

In a simple "feed-forward" network, the data handled by the neurons are numbers. Each neuron computes a weighted sum of the values of the neurons of the previous layer ( $X_{i}$ in the equation below). It then adds a constant value (called the "bias") to this sum. Finally, it applies a mathematical function, called the "activation function", to the result. The activation function usually squeezes its input into a fixed range: for example, the sigmoid function returns a value between 0 and 1, and tanh returns a value between -1 and 1. The result of the activation function ( $Y_{j}$ in the equation below) is then sent to the neurons of the next layer.

$Y_{j}=\mathrm{Activation}\left(\sum_{i\,\in\,\mathrm{Inputs}}\mathrm{Weight}(i,j)\cdot X_{i}+\mathrm{Bias}(j)\right)$
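The equation can be sketched in a few lines of code. This is only an illustration: the function names `sigmoid`, `neuronOutput` and `layerOutput`, and the example numbers, are made up for this sketch and do not come from any library.

```javascript
// Sigmoid activation: squeezes any number into the range between 0 and 1.
function sigmoid(x) {
    return 1 / (1 + Math.exp(-x));
}

// Output of one neuron j: Activation(sum of Weight(i,j) * X_i, plus Bias(j)).
// `inputs` and `weights` are arrays of the same length.
function neuronOutput(inputs, weights, bias, activation = sigmoid) {
    let sum = bias;
    for (let i = 0; i < inputs.length; i++) {
        sum += weights[i] * inputs[i];
    }
    return activation(sum);
}

// Output of a whole layer: one value per neuron in the layer.
// `layerWeights[j]` holds the weights of neuron j (assumed layout).
function layerOutput(inputs, layerWeights, layerBiases, activation = sigmoid) {
    return layerWeights.map((weights, j) =>
        neuronOutput(inputs, weights, layerBiases[j], activation));
}

// Illustrative values only.
console.log(neuronOutput([1, 0], [0.5, -0.5], 0));
console.log(layerOutput([1, 0], [[0.5, -0.5], [0, 0]], [0, 0]));
```

Passing `Math.tanh` as the last argument would give outputs between -1 and 1 instead of between 0 and 1.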

A loss function (also called a cost function) is defined for the network.^[2] The loss function measures how badly the neural network is doing at its assigned task: the smaller its value, the better the network is doing. An optimization technique is then applied to minimize the output of the loss function by changing the weights and biases of the network. This process is called training. Training is done one small step at a time. After thousands of steps, the network is typically able to do its assigned task pretty well.
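The training process can be sketched with a very small example. The sketch below makes several assumptions that are not in the text above: the task (learning y = 2x + 1), the loss function (mean squared error), the optimization technique (plain gradient descent), the starting values of 0 for the weight and bias, and all the names are chosen only for illustration.

```javascript
// Hypothetical toy task: learn y = 2x + 1 from four examples.
const data = [
    { x: 0, y: 1 },
    { x: 1, y: 3 },
    { x: 2, y: 5 },
    { x: 3, y: 7 },
];

// The "network" here is a single neuron with no activation function.
function predict(x, weight, bias) {
    return weight * x + bias;
}

// Loss function: mean squared error between predictions and targets.
function loss(weight, bias) {
    let total = 0;
    for (const { x, y } of data) {
        total += (predict(x, weight, bias) - y) ** 2;
    }
    return total / data.length;
}

// Training: many small steps, each one moving the weight and bias
// in the direction that makes the loss smaller (gradient descent).
function train(steps = 5000, learningRate = 0.05) {
    let weight = 0;
    let bias = 0;
    for (let s = 0; s < steps; s++) {
        let gradWeight = 0;
        let gradBias = 0;
        for (const { x, y } of data) {
            const error = predict(x, weight, bias) - y;
            gradWeight += (2 * error * x) / data.length;
            gradBias += (2 * error) / data.length;
        }
        weight -= learningRate * gradWeight;
        bias -= learningRate * gradBias;
    }
    return { weight, bias };
}

const trained = train();
console.log(trained, loss(trained.weight, trained.bias));
```

With these settings the weight should end up close to 2 and the bias close to 1. Real networks have many more weights and usually find the gradients automatically with a method called backpropagation.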

Example

Consider a program that checks whether a person is alive. It checks two things: pulse and breathing. If a person either has a pulse or is breathing, the program will output 'alive'. Otherwise, it will output 'dead'. In a program that does not learn over time, this would be written as:

function isAlive(pulse, breathing) {
    if(pulse || breathing) {
        return true;
    } else {
        return false;
    }
}

A very simple neural network, made of just one neuron, that solves the same problem will look like this:

Single neuron which takes the values of pulse (true/false) and breathing (true/false), and outputs value of alive (true/false).

The values of pulse, breathing, and alive will be either 0 or 1, representing false and true. Thus, if this neuron is given the values (0,1), (1,0) or (1,1), it should output 1, and if it is given (0,0), it should output 0. The neuron does this by applying a simple mathematical operation to the input: it adds together whatever values it has been given, and then adds its own hidden value, which is called a 'bias'. To start with, this hidden value is random, and we adjust it over time if the neuron is not giving us the desired output.

If we add values such as (1,1) together, we might end up with numbers greater than 1, but we want our output to be between 0 and 1. To solve this, we can apply a function which limits our actual output to 0 or 1, even if the result of the neuron's math was not within that range. In more complicated neural networks, we apply a function (such as sigmoid) to the neuron, so that its value will be between 0 and 1 (such as 0.66). We then pass this value on to the next neuron, and so on, until we reach the output.
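The single neuron described above could be sketched as follows. The names `step`, `aliveNeuron` and `trainBias` are made up for this example, and the rule used to adjust the bias (move it a little toward the desired output after every mistake) is an assumed, perceptron-style rule, since the text does not say which rule is used.

```javascript
// Step function: limits the output to exactly 0 or 1.
function step(x) {
    return x > 0 ? 1 : 0;
}

// One neuron: add the inputs, add the bias, then apply the step function.
function aliveNeuron(pulse, breathing, bias) {
    return step(pulse + breathing + bias);
}

// All four possible inputs, each with its desired output.
const examples = [
    { pulse: 0, breathing: 0, alive: 0 },
    { pulse: 0, breathing: 1, alive: 1 },
    { pulse: 1, breathing: 0, alive: 1 },
    { pulse: 1, breathing: 1, alive: 1 },
];

// Assumed training rule: whenever the neuron is wrong, nudge the bias
// a little toward the desired output.
function trainBias(initialBias, learningRate = 0.1, maxRounds = 1000) {
    let bias = initialBias;
    for (let round = 0; round < maxRounds; round++) {
        let mistakes = 0;
        for (const ex of examples) {
            const error = ex.alive - aliveNeuron(ex.pulse, ex.breathing, bias);
            if (error !== 0) {
                bias += learningRate * error;
                mistakes++;
            }
        }
        if (mistakes === 0) break; // every example is answered correctly
    }
    return bias;
}

// Start from a random bias between -2 and 2, then train.
const learnedBias = trainBias(Math.random() * 4 - 2);
console.log(examples.map(ex => aliveNeuron(ex.pulse, ex.breathing, learnedBias)));
```

Any bias that is greater than -1 and not greater than 0 (for example, -0.5) gives the correct answer for all four inputs, so training stops once the bias lands in that range.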

Learning methods

There are three ways a neural network can learn: supervised learning, unsupervised learning and reinforcement learning. These methods all work by either minimizing or maximizing a cost function, but each one is better at certain tasks.

A research team from the University of Hertfordshire, UK, used reinforcement learning to make an iCub humanoid robot learn to say simple words by babbling.^[3]

References

[1] "Artificial Neural Networks - Basics | Data Basecamp". 2021-11-25. Retrieved 2022-06-15.
[2] "What is a Loss Function? | Data Basecamp". 2024-06-01. Retrieved 2024-09-17.
[3] Biever, Celeste. "Baby robot learns first words from human teacher". New Scientist.
