Artificial Neural Networks: Learning by Doing

Article reviewed by Grace Lindsay, PhD from New York University.

Neural networks simulate connected neurons in the brain to complete complicated prediction tasks.

What Is an Artificial Neural Network?

An artificial neural network (ANN) is a type of machine learning model inspired by neurons in the brain.^1,2 The individual components of the ANN receive information in numerical form, process it, and send it onward in the network, analogous to neuronal signaling in the central nervous system. This process repeats until the information reaches the end of the network, and the network produces a numerical output that typically corresponds to a probability prediction about the information it received. Researchers pair this network structure with a specialized algorithm that allows the ANN to evaluate its accuracy and improve upon itself. Scientists use ANNs to identify patterns in datasets and learn how the brain itself works.^3-5

Artificial Neural Network Architecture

Scientists design ANNs to function like neurons.⁶ They write lines of code in an algorithm such that there are nodes that each contain a mathematical function, similar to neurons that each have specific biological functions in the brain. Multiple nodes form a layer in the network and a single network can contain multiple layers. Often, each node in a layer is connected to every node in the subsequent layer to send information forward in the network. “When you write code to build an artificial neural network, you're basically defining this architecture,” explained Grace Lindsay, a computational neuroscientist at New York University. She uses ANNs to study vision and climate change.

Artificial neural network layers

The first ANN layer is called the input layer because this is where the starting data enters the network. The last layer in the neural network is called the output layer, which produces values that correspond to the probability of an answer or another numerical outcome. For example, in one project, Lindsay built a neural network to detect beaver dams from aerial images. The images are the input, and the output of the model provides the probability that an image does or does not contain a beaver dam.

All of the layers between the input and output layers are called hidden layers. The number of hidden layers gives rise to the concept of deep learning, where the depth is in reference to the stacked layers in the network.⁷

How Do Artificial Neural Networks Work?

“Through the code, you give the data to the model, and then it runs and it trains,” Lindsay explained. In the case of supervised ANNs, researchers train the neural network by feeding in data with known values or features. Scientists tell the model about the input and true output values through computer code. In Lindsay’s neural network, her team gave their model thousands of images of beaver dams and the network learned to identify patterns that represent what these structures look like, with the goal of being able to pick one out in an unspecified image later. Once the trained neural network reliably predicts beaver dams from known image datasets, the scientists will use it to investigate how these installations affect climate by automatically identifying changes in new images.

Weighting and thresholding the data

Weighting and thresholding indicate how valuable a given piece of data is in an ANN. When nodes receive input values, they enter the data into their mathematical functions.⁸ These functions can include numerical weights that are multiplied by the incoming data values and threshold or bias values that are added to or subtracted from the weighted sum at each node. The threshold determines whether the calculated value is sent forward in the neural network; if the output value from a node is above the threshold, then it is sent through to the next layer of nodes.

Sometimes, these mathematical functions are summations of all the incoming data points, whereas other times, the equation includes a sigmoid function applied to the entire sum so that the resulting number lies between 0 and 1. “Because a lot of them are so simple, usually the power comes from having many, many, many artificial neurons and having deep networks,” Lindsay said.
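
As a rough illustration of the calculation described above, the following Python sketch computes the output of one artificial neuron: it multiplies each input by a weight, adds a bias, and applies a sigmoid so that the result lies between 0 and 1. The function names and the numbers are hypothetical and chosen only for illustration; they do not come from Lindsay's models.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


# Hypothetical values, chosen only for illustration.
inputs = [0.5, 0.2, 0.9]
weights = [0.4, -0.6, 0.3]
bias = -0.1

activation = node_output(inputs, weights, bias)
print(f"node output: {activation:.3f}")  # node output: 0.562
```

A larger weight makes the corresponding input count for more in the sum, and a more negative bias makes the node harder to push toward 1, which is one way to think about the threshold.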

Artificial Neural Network Architecture Example

For image analysis purposes, an image’s pixels are converted into grayscale values and each pixel becomes a numerical input that enters the neural network. The ANN sends these inputs forward to nodes in hidden layers, which consist of mathematical equations, such as sigmoid functions that produce values between 0 and 1 based on the input, weight, and bias values. If the answer to a node’s equation surpasses a set threshold, it moves forward in the network. At the end, the network contains an output layer, where numerical values indicate the probability of a particular answer related to the input dataset, such as whether an image contains a cell.

In this way, numerical data passes from node to node, with each node performing the same type of calculation. The answers to these calculations become the input data for the next layer of nodes, which process them in the same way. At the end of the network, the final values are the output and may be interpreted as probabilities for a particular answer, such as whether or not there is a beaver dam in an image.
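
The layer-by-layer flow described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: every node uses a sigmoid, every node is connected to every node in the next layer, and the weights, biases, and layer sizes are hypothetical values picked for illustration rather than learned from data.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one fully connected layer; each row of `weights` belongs to one node."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the inputs through each layer in turn; one layer's outputs feed the next."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[1.2, -0.8]], [0.05]),                             # output layer
]
pixels = [0.0, 0.5, 1.0]  # e.g. grayscale values scaled to the range 0 to 1

output = network_forward(pixels, layers)
print(output)  # a single value between 0 and 1
```

With untrained weights like these, the output is not a meaningful probability; training is what makes the final value informative.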

Learning the data

In the training phase, after the network has processed a large amount of training data, it compares its calculated outputs to the true answers for each example it ran. The algorithm calculates how far off each output was from the target answer and averages the errors across the training set, which informs corrections to the weights or thresholds in the network. The error in a network is called the loss or cost, and researchers aim to minimize this value so that the outputs are as close as possible to the true answers. After the algorithm adjusts the weights and thresholds, the network runs its training data again and compares its final outputs to the true values. This repeats until the predictions reach a predetermined level of accuracy.
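
The training cycle described above can be sketched for the simplest possible case: a single sigmoid node trained by gradient descent. The example is a simplification under stated assumptions, not the method used in any study mentioned here. The loss is the average cross-entropy, the toy dataset is a logical OR, and `learning_rate`, `target_loss`, and `max_epochs` are hypothetical settings. Networks with hidden layers typically obtain the corresponding weight corrections with backpropagation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_loss(data, weights, bias):
    """Average cross-entropy between the predictions and the true labels."""
    total = 0.0
    for x, y in data:
        p = predict(x, weights, bias)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)


def train(data, learning_rate=0.5, target_loss=0.1, max_epochs=10_000):
    """Repeat: run the data, measure the error, adjust the weights and bias."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        if mean_loss(data, weights, bias) <= target_loss:
            break  # predictions are close enough to the true answers
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in data:
            error = predict(x, weights, bias) - y  # how far off this example was
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / len(data)
            grad_b += error / len(data)
        # Move each parameter a small step in the direction that lowers the loss.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Toy training set with known answers (logical OR), used only for illustration.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(data)
print("final loss:", mean_loss(data, weights, bias))
print([round(predict(x, weights, bias)) for x, _ in data])
```

Stopping on a loss target mirrors the "predetermined level" in the text; in practice, researchers often stop on other criteria as well, such as a fixed number of passes through the data.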

In this way, the network learns how to improve its ability to interpret the given data. Once the network has optimized itself, researchers give it a new set of similar but novel data and assess how well the network predicts or identifies the desired characteristics in an unseen dataset.
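
The idea of checking a trained model on data it has never seen can be illustrated with a short Python sketch. To keep the focus on the split and the evaluation, the "model" here is a deliberately trivial stand-in (the hypothetical `fit_threshold`, which learns a single cut-off value) rather than a neural network, and the data are synthetic.

```python
import random


def split_data(examples, test_fraction=0.25, seed=0):
    """Shuffle the labeled examples and hold some back as an unseen test set."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(model, examples):
    """Fraction of examples for which the model's answer matches the true label."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


def fit_threshold(train):
    """Stand-in for training: place a cut-off midway between the two class means."""
    ones = [x for x, y in train if y == 1]
    zeros = [x for x, y in train if y == 0]
    cutoff = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda x: 1 if x > cutoff else 0


# Synthetic data: the true label is 1 when the value is above 0.5.
rng = random.Random(42)
examples = [(v, int(v > 0.5)) for v in (rng.random() for _ in range(200))]

train_set, test_set = split_data(examples)
model = fit_threshold(train_set)
print("training accuracy:", accuracy(model, train_set))
print("test accuracy:", accuracy(model, test_set))
```

A large gap between the two accuracies would suggest that the model learned quirks of its training examples rather than a pattern that generalizes.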

Example of Training the Network

Simplified visual of how a neural network calculates the cost function in a data set and then applies this to its algorithm during training.

After an entire dataset runs through the network, the algorithm compares the network’s calculated output values with the provided values for what the outcomes should have been. It does this for every input, sums the errors, and then averages them. The resulting value is the cost for the network. The algorithm uses the cost to adjust the weights and thresholds in the network, which then runs again on its training dataset and reassesses its accuracy. This repeats until a predetermined threshold of accuracy is met.

What Are the Limitations of Artificial Neural Networks?

Because these networks learn from the data they are trained on, they learn only the patterns represented in that dataset. If the training dataset overrepresents or excludes possible examples, such as including only beaver dams built on rivers, the network will be less likely to recognize other members of the group, such as a beaver dam built on a lake. Additionally, using data that a researcher groups and labels introduces bias based upon that individual’s subjective interpretation of an image.⁹ In all of these cases, the network may not be applicable or as accurate in a more general setting because of these learned biases.

This is related to a second limitation of ANNs and other machine learning models: researchers do not know what patterns the algorithm identifies in the data to make its decisions. “You can’t ask it about its own process, and you didn’t tell it how to do its process, and so you’re in a situation where it does a process, but you don’t know how,” Lindsay said. Researchers address these problems by forming large collaborations to assemble representative training pools and by developing new analyses and learning methods to improve the reliability of these models, which may also help them understand ANN decision-making.^10,11

How Are Artificial Neural Networks Used?

ANNs are behind language models such as ChatGPT. People also use them to identify patterns that humans may not recognize on their own and to model how the brain itself works by studying how ANNs learn.^12-14 Additionally, scientists use ANNs in a variety of image-related tasks. “That has applications in everyday life, like on the internet you can auto-summarize an image, or a computer can recognize people in images,” Lindsay explained. “It also has applications in science; these things get applied to medical imaging either in a healthcare sense or [for labeling] microscope images of cells and helping scientists make their work go faster by doing some of those somewhat menial tasks.”

References

1. Han S-H, et al. Artificial neural network: Understanding the basic concepts without mathematics. Dement Neurocogn Disord. 2018;17(3):83-89.
2. Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;349(6245):255-260.
3. Tarca AL, et al. Machine learning and its applications to biology. PLoS Comput Biol. 2007;3(6):e116.
4. Kimeswenger S, et al. Artificial neural networks and pathologists recognize basal cell carcinomas based on different histological patterns. Mod Pathol. 2021;34(5):895-903.
5. Hannagan T, et al. Emergence of a compositional neural code for written words: Recycling of a convolutional neural network for reading. Proc Natl Acad Sci. 2021;118(46):e2104779118.
6. Cohen Y, et al. Recent advances at the interface of neuroscience and artificial neural networks. J Neurosci. 2022;42(45):8514-8523.
7. LeCun Y, et al. Deep learning. Nature. 2015;521:436-444.
8. Yang GR, Wang X-J. Artificial neural networks for neuroscientists: A primer. Neuron. 2020;107(6):1048-1070.
9. Cronin NJ. Using deep neural networks for kinematic analysis: Challenges and opportunities. J Biomech. 2021;123(23):110460.
10. Nensa F, et al. Artificial intelligence in nuclear medicine. J Nucl Med. 2019;60(2):29S-37S.
11. Lee H, et al. Fully automated deep learning system for bone age assessment. J Digit Imaging. 2017;30:427-441.
12. Marabini R, Carazo JM. Pattern recognition and classification of images of biological macromolecules using artificial neural networks. Biophys J. 1994;66:1804-1814.
13. Angermueller C, et al. Deep learning for computational biology. Mol Syst Biol. 2016;12(7):878.
14. Beniaguev D, et al. Single cortical neurons as deep artificial neural networks. Neuron. 2021;109:2727-2739.
