What Are Neural Networks? | Dissecting The Artificial Brain

    How can a machine that lacks a brain differentiate an apple from a cherry? Or tell what number a handwritten digit is? Or detect fraud and predict stock prices? It is the fascinating world of artificial intelligence that gives machines human abilities. There are many ways to do so, one of which is deep learning. It’s the process of building artificial neural networks inspired by the structure of the human brain to train computer systems. It’s like giving a brain to a brainless machine. While it’s more calculus than a sci-fi excerpt, it’s still very interesting to dive into. So, what are neural networks?

    What Are Neural Networks?

    Neural networks, also known as artificial neural networks, are computational models that form the core of deep learning algorithms. Their structure is inspired by the network of neurons in the human brain.

    The brain is made up of billions of neurons. Let’s assume there are only two. When one neuron receives a message, the message passes through it and reaches its nerve endings. The nerve endings then pass the message to the following neuron through a synapse. The synapse is the space between two neurons that lets them communicate.

    Artificial neural networks are similar. They’re made up of neurons structured into multiple layers. One neuron, also known as a node, receives a message (an input) and then passes it to another neuron in the following layer through the channel that connects the two.

    The way messages move through a neural network is a bit more complex than that. We’ll get to how neural networks work, but first let’s look at how they became a thing.

    Neural Network History

    Although the concept of smart machines has existed for centuries, the focus on neural networks has intensified over the past 100 years. Let’s dive into the history of neural networks.

    • 1943: Warren S. McCulloch and Walter Pitts published a paper entitled “A Logical Calculus of the Ideas Immanent in Nervous Activity”. The research explained how the brain produces complex patterns that can be simplified down to a binary logic structure with only true/false connections. 
    • 1958: Frank Rosenblatt developed the first neural network, the perceptron, documented in his research. He enabled a computer to learn how to distinguish between cards marked on the left and cards marked on the right. 
    • 1959: Bernard Widrow and Marcian Hoff developed models called “Adaline” and “Madaline”. Madaline was the first neural network applied to real-life problems. It’s an adaptive filter that eliminated echoes on phone lines and it’s still in use today. 
    • 1974: Paul Werbos applied backpropagation within neural networks.
    • 1986: Rumelhart et al. proposed the Multilayer Perceptron, a multilayer neural network.
    • 1989: Yann LeCun published a paper demonstrating how the use of constraints in backpropagation and its integration into the neural network architecture can be used to train algorithms. 
    • 1997: Hochreiter and Schmidhuber proposed a recurrent neural network framework (long short-term memory).
    • 2018: Jacob Devlin and his colleagues from Google published BERT, a transformer-based model for Natural Language Processing.
    • 2020: OpenAI published GPT-3, a deep learning model that produces human-like text.
    • 2022: OpenAI released ChatGPT, an advanced chatbot.

    How Does A Neural Network Work?

    We briefly explained what neural networks are and the history that led to where we are today. Now, let’s see how a neural network works.

    Neural Network Structure 

    A neural network is made up of neurons structured into multiple layers. A neuron is a mathematical function that receives an input, processes it, and generates an output.


    A neural network has an input layer, an output layer, and hidden layers in between. The input layer is the first layer, the one you feed data into as inputs (x1, x2, and so on). The output layer is the last layer of the network and gives us the output. The hidden layers are where all the magic happens. The number of hidden layers, and the number of neurons in each, varies from one network to another.

    Channels, Weights, and Biases

    The channels are the connections that link the neurons of one layer to the neurons of the next. Each channel has a “weight” (w1), which reflects how important that input is to the output we want.

    A bias (b1) reflects how easy it is to get a neuron to fire (give an output). A large bias makes it very easy for the node to give us an output, and a low bias makes it difficult.
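
    To make the role of the bias concrete, here is a minimal sketch. It assumes the simplest possible rule, where a neuron fires when its weighted input plus its bias is above zero. The function name `fires` and all the numbers are made up for illustration.

```python
def fires(x, w, b):
    """Return True if the weighted input plus the bias is above zero."""
    return w * x + b > 0


x, w = 0.5, 1.0  # one input and its weight (illustrative values)

print(fires(x, w, b=2.0))   # large bias: 0.5 + 2.0 > 0, so the neuron fires
print(fires(x, w, b=-2.0))  # low bias: 0.5 - 2.0 < 0, so the neuron stays quiet
```

    The input and the weight are identical in both calls. Only the bias decides whether the neuron fires.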

    Steps of Training a Neural Network

    Now that you know what each element of a neural network is, it’s time to understand how it works. Let’s suppose we want our model to recognize the number 9. How would it do that?


    The neural network has to understand what makes the number 9 a 9 and not an 8. So, it has to pick up the “features” of the number 9. A 9 has a loop at the top, and a vertical stroke in the bottom right.

    With traditional machine learning algorithms, you’d have to feed in both the images and the features. With deep learning, however, neural networks extract these features automatically. What you do have to do is provide enough data to increase the accuracy of your model.

    So, if you want your model to recognize the number 9, you would have to gather many pictures of digits and feed them to the model. Feeding a network data so that it learns to give you the desired output is called “training a neural network”.

    Data Input

    We agreed that you’d have to feed the model multiple pictures of the number 9. Let’s narrow it down to feeding it one image of the digit 9. The image is 28 x 28 pixels, which amounts to a total of 784 pixels.

    Each pixel is fed to one neuron in the first layer (the input layer) of the neural network. This means that we have 784 neurons in the first layer. The inputs are referred to as x1, x2, x3, etc. The data then moves through the network by forward propagation, from the first layer to the last.
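
    As a small sketch of this step, the snippet below turns a 28 x 28 grid into a flat list of 784 inputs. The image here is a blank stand-in with one pixel lit up, not a real digit, and the variable names are only for illustration.

```python
ROWS, COLS = 28, 28

# A stand-in image: a 28 x 28 grid of pixel brightness values (all 0.0 here).
image = [[0.0 for _ in range(COLS)] for _ in range(ROWS)]
image[5][10] = 1.0  # light up one pixel so we can track where it ends up

# Flatten row by row: pixel (row, col) becomes input number row * 28 + col.
inputs = [pixel for row in image for pixel in row]

print(len(inputs))            # 784
print(inputs[5 * COLS + 10])  # 1.0
```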

    Data Processing

    The first layer now has 784 neurons each having a pixel as an input. How is that data passed to the next neurons in the following layer? 

    Each input is assigned a “weight”, which represents the influence of that input on the desired output. In other words, it tells us how important the input x1 is for getting the digit 9 as the final output. Each neuron is also assigned a “bias”, which represents how easy it is for the neuron to give an output.

    But how does a neuron give an output? The weighted sum of the inputs, plus the bias, is passed to a mathematical function called the “activation function”, which determines whether or not the node fires. This is repeated in every layer up to the final one.
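
    Here is a minimal sketch of a single neuron doing exactly that. It assumes the sigmoid as the activation function, which is one common choice among several, and it uses three made-up inputs instead of 784 to stay readable.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


x = [0.0, 0.5, 1.0]    # inputs (illustrative pixel values)
w = [0.4, -0.2, 0.6]   # one weight per input (illustrative)
b = 0.1                # the neuron's bias (illustrative)

print(neuron_output(x, w, b))  # a value between 0 and 1
```

    An output close to 1 means the neuron fires strongly, and an output close to 0 means it barely fires at all.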

    The final layer has 10 neurons, each representing one digit from 0 to 9. The neuron that lights up the most gives us our output.
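
    Reading the answer off the final layer can be sketched like this. The ten activation values are invented for the example. In a real network they would come out of the forward pass.

```python
# Hypothetical activations of the 10 output neurons, one per digit 0 to 9.
output_layer = [0.02, 0.01, 0.05, 0.03, 0.10, 0.02, 0.01, 0.04, 0.30, 0.90]

# The position of the strongest neuron is the predicted digit.
predicted_digit = max(range(len(output_layer)), key=lambda i: output_layer[i])

print(predicted_digit)  # 9
```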


    Enhancing Neural Network’s Results

    Let’s say we feed the model an image of the digit 9 and it gives the number 8 as an output. Does this mean our neural network isn’t working and we should build another one? Nope. It means that our model needs enhancement. 

    After getting the result, we calculate the “cost function”, which is the fancy name for “error”. It is the squared difference between the output we expected and the output we got. Then, we have to tweak the weights and biases to minimize the error.
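
    A minimal sketch of that calculation for our digit example follows. It assumes the expected output is written as ten values with a 1 in the position of the correct digit, and the “actual” values are invented to mimic a network that wrongly leans toward 8.

```python
def squared_error(expected, actual):
    """Sum of the squared differences between expected and actual outputs."""
    return sum((e - a) ** 2 for e, a in zip(expected, actual))


# Expected output for an image of a 9: only the last of the 10 neurons is on.
expected = [0.0] * 9 + [1.0]

# Hypothetical network output that wrongly favors the digit 8.
actual = [0.0] * 8 + [0.9, 0.2]

print(squared_error(expected, actual))    # a large error: the guess was wrong
print(squared_error(expected, expected))  # 0.0: a perfect guess has no error
```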


    How Do We Optimize the Cost Function?

    How do we change the weights and biases to minimize the error? We use what is known as “gradient descent”, a standard optimization algorithm. It changes the parameters (weights and biases) step by step, always moving down the slope of the cost function, until the cost reaches its lowest point, which indicates minimal error.
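
    To see gradient descent in action, here is a minimal sketch with a single parameter. The cost curve is a toy one, chosen so that we know its lowest point is at w = 3, and the learning rate is an assumed value. A real network does the same thing for thousands of weights and biases at once.

```python
def cost(w):
    """A toy cost curve whose lowest point is at w = 3."""
    return (w - 3.0) ** 2


def slope(w):
    """Derivative of the toy cost: how steep the curve is at w."""
    return 2.0 * (w - 3.0)


w = 0.0              # starting guess for the parameter
learning_rate = 0.1  # size of each step (an assumed value)

for step in range(100):
    w -= learning_rate * slope(w)  # step downhill, against the slope

print(round(w, 4))  # very close to 3, where the cost is lowest
```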

    To work out how much each weight and bias should change, the network uses backward propagation. Backward propagation goes from the last layer to the first layer, passing the error back to find out how much each weight and bias contributed to it. You keep repeating forward propagation, error calculation, and backward propagation until you reach maximum or close to maximum accuracy.
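
    The whole loop can be sketched on the smallest network imaginable: one input, one hidden neuron, and one output neuron. This is an illustration under assumptions, not a full digit recognizer. It uses the sigmoid activation and the squared error, and every starting value is arbitrary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# A tiny network: one input -> one hidden neuron -> one output neuron.
w1, b1 = 0.5, 0.0    # hidden neuron parameters (arbitrary starting values)
w2, b2 = -0.5, 0.0   # output neuron parameters (arbitrary starting values)
x, target = 1.0, 1.0
learning_rate = 0.5

errors = []
for epoch in range(2000):
    # Forward propagation: from the first layer to the last.
    h = sigmoid(w1 * x + b1)
    y = sigmoid(w2 * h + b2)
    errors.append((target - y) ** 2)

    # Backward propagation: from the last layer to the first (chain rule).
    d_y = -2.0 * (target - y) * y * (1.0 - y)  # error vs. output neuron's sum
    d_h = d_y * w2 * h * (1.0 - h)             # error vs. hidden neuron's sum

    # Gradient descent update for every weight and bias.
    w2 -= learning_rate * d_y * h
    b2 -= learning_rate * d_y
    w1 -= learning_rate * d_h * x
    b1 -= learning_rate * d_h

print(errors[0], errors[-1])  # the error shrinks as the cycle repeats
```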

    Types of Neural Networks

    There are different types of neural networks that are used for different data and applications. 


    Perceptron

    The perceptron, invented in 1958, is the simplest and oldest neural network model. It consists of a single neuron that classifies data into two categories. It works the same way as we described above:

    • Neuron receives inputs
    • Sums them up
    • Applies activation function
    • Gives an output
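
    The four steps above can be sketched in a few lines. This example assumes a step activation and the classic perceptron learning rule, and it trains on the logical AND, a problem chosen here only because it is small and linearly separable.

```python
def predict(inputs, weights, bias):
    """Sum the weighted inputs, add the bias, apply a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Training data for the logical AND: the output is 1 only when both inputs are 1.
samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = [0.0, 0.0], 0.0
learning_rate = 0.1  # an assumed value

for epoch in range(20):
    for inputs, label in samples:
        error = label - predict(inputs, weights, bias)
        # Perceptron learning rule: nudge the parameters toward the right answer.
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
        bias += learning_rate * error

print([predict(inputs, weights, bias) for inputs, _ in samples])  # [0, 0, 0, 1]
```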

    Application of Perceptrons 

    • Data classification

    Advantages of Perceptrons

    • Simple
    • Easy to understand, implement, and train
    • Performs well on problems that are linearly separable (for example, the AND and OR logical operations and simple binary classification)

    Disadvantages of Perceptrons

    • Limited expressive power and generalization ability 
    • Cannot solve problems that are not linearly separable (the XOR operation is the classic example)
    • Prone to overfitting and noise

    Feedforward Neural Networks

    A feedforward neural network is the simplest form of neural network after the perceptron. At minimum, it’s made of two layers: an input layer and an output layer. Hidden layers are often present in between, depending on the use case.

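    The data is fed in one direction only, from input to output, and never loops back, hence the name. Note that the name describes how data flows through the network, not how the network is trained: the weights are still updated during training, typically with backpropagation.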

    Applications of Feedforward Neural Networks

    • Simple classification
    • Face Recognition
    • Speech Recognition

    Advantages of Feedforward Neural Networks

    • Simple to maintain
    • Fast
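    • Equipped to deal with data that contains a lot of noise

    Disadvantages of Feedforward Neural Networks

    • Has no memory of previous inputs, so it is a poor fit for sequential data
    • Without hidden layers, it cannot learn non-linear relationships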

    Convolutional Neural Networks

    Convolutional neural networks contain a three-dimensional arrangement of neurons (width, height, and depth) instead of the usual two-dimensional arrangement. They’re formed of multiple layers:

    • Input layer: Responsible for taking in data as inputs
    • Convolution layers: Responsible for feature extraction (they produce feature maps)
    • Pooling layer: Responsible for aggregating (downsampling) the feature maps produced by the convolution layers
    • Fully connected layer and output layer: Responsible for giving out outputs
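
    To show what the convolution and pooling layers do, here is a minimal sketch in plain Python. It assumes a square image, a square kernel, no padding, and max pooling (one common way of aggregating). The kernel values are picked by hand for the example. In a real network they are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the feature map (no padding)."""
    k = len(kernel)
    size = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(size)
        ]
        for r in range(size)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# A 5 x 5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# A hand-picked 2 x 2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, kernel)  # 4 x 4 map, strongest at the edge
pooled = max_pool(feature_map)           # 2 x 2 summary of the map

print(feature_map[0])  # [0, 2, 0, 0]
print(pooled)          # [[2, 0], [2, 0]]
```

    The feature map lights up only where the edge is, and pooling keeps that information while shrinking the map from 4 x 4 to 2 x 2.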

    Applications of Convolutional Neural Networks

    • Image processing
    • Computer vision
    • Speech recognition
    • Machine translation

    Advantages of Convolutional Neural Networks

    • Efficient for deep learning
    • Fewer parameters to learn than in fully connected layers

    Disadvantages of Convolutional Neural Networks

    • Complex 
    • Hard to maintain
    • Slower than other networks depending on the number of layers

    Recurrent Neural Networks

    Recurrent neural networks are characterized by their ability to use information from previous inputs to influence the current output. The information cycles through a loop, unlike feedforward networks, where the information goes in a forward direction only.
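
    The loop can be sketched with a single recurrent neuron. This is a simplified illustration that assumes a tanh activation and scalar values, and the parameters are invented rather than trained. The point is only that the hidden state `h` carries information from earlier inputs forward.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


w_x, w_h, b = 0.8, 0.5, 0.0  # illustrative parameters, not trained values


def run(sequence):
    h = 0.0  # the network starts with an empty memory
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)  # the loop: h feeds back into itself
    return h


# Both sequences end with the same input, but their history differs.
print(run([1.0, 0.0, 0.5]))
print(run([0.0, 0.0, 0.5]))
```

    The two results differ even though the last input is the same, because the first sequence left a trace in the hidden state.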

    Applications of Recurrent Neural Networks

    • Text-to-speech processing
    • Text processing like auto-suggest and grammar checks
    • Sentiment analysis
    • Translation

    Advantages of Recurrent Neural Networks

    • Processes sequential data where each sample can be assumed dependent on previous ones
    • Can map several inputs to several outputs

    Disadvantages of Recurrent Neural Networks

    • Difficult to train
    • Difficult to process long sequential data 

    LSTM (Long Short-Term Memory)

    Long short-term memory networks are recurrent neural networks with the addition of memory cells. These cells can store information for long periods of time. LSTM networks use three gates:

    • Input gate: Controls what new data should be kept in memory
    • Output gate: Controls the data given to the next layer
    • Forget gate: Controls what data to dump and forget
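
    The three gates can be sketched as a single LSTM step. To keep it short, this version assumes scalar values instead of vectors, follows the standard LSTM equations, and uses hand-picked parameters rather than trained ones. The dictionary keys and the function name `lstm_step` are made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar state; p holds (w_x, w_h, b) for each part."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))               # forget gate: how much old memory to keep
    i = sigmoid(pre("input"))                # input gate: how much new data to store
    o = sigmoid(pre("output"))               # output gate: how much memory to reveal
    candidate = math.tanh(pre("candidate"))  # new data proposed for the memory cell

    c = f * c_prev + i * candidate           # updated memory cell
    h = o * math.tanh(c)                     # output passed to the next step or layer
    return h, c


# Illustrative, untrained parameters. The large forget bias keeps f close to 1.
params = {
    "forget": (0.0, 0.0, 5.0),
    "input": (1.0, 0.0, -1.0),
    "output": (0.0, 0.0, 1.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 0.0
for x in [2.0, 0.0, 0.0, 0.0]:
    h, c = lstm_step(x, h, c, params)
    print(round(c, 3))  # the memory written at the first step fades only slowly
```

    With the forget gate held near 1, the value stored at the first step is still mostly there three steps later, which is the long-term memory the name refers to.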

    Applications of LSTM

    • Gesture recognition
    • Speech recognition
    • Text prediction

    Advantages of LSTM

    • Good at handling long-term dependencies
    • Less susceptible to the vanishing gradient problem
    • Very efficient at modeling complex sequential data

    Disadvantages of LSTM

    • More complicated than plain RNNs
    • Requires more training data
    • Doesn’t work well with highly non-linear data and data with a lot of noise

    Applications of Neural Networks

    We briefly touched upon how each type of neural network can be used. Now, let’s see how common the use of these networks actually is.

    • Security: Neural networks are at the core of the facial recognition systems used in surveillance. They match human faces against the digital images in their database. Offices commonly use these systems to control who can enter.
    • Finance: Neural networks are used to predict stock market prices.
    • Marketing: Neural networks are at the core of the social media algorithms that make sure you get ads that fit your needs and your taste.
    • Defense: The USA, Britain, and Japan, among other countries, use neural networks for developing solid defense strategies. They also use them for air patrols, maritime patrols, and controlling automated drones.
    • Healthcare: Health professionals are using neural networks in image processing to detect cancer and other anomalies. They also use them to keep track of patients’ data.
    • Weather forecasting: Neural networks are used to help predict the weather as well as the likelihood of natural disasters.

    Advantages and Disadvantages 

    Artificial neural networks are definitely revolutionary and have contributed to the massive expansion of many fields, but they also have their own set of limitations.


    Advantages of Neural Networks

    • High learning ability: NNs are capable of learning, recognizing patterns, and adapting to new situations.
    • Capability of handling non-linear relationships: NNs can learn non-linear relationships which is especially useful in image and speech recognition 
    • Ability to tolerate faults: Neural networks continue to function even if some neurons are no longer working. 
    • Parallel processing: NNs can handle many calculations at the same time which is why they’re able to process large datasets. 
    • Ability to generalize: NNs learn from their inputs and apply what they learn to new data. So, they can make accurate predictions on data they have not seen before.

    Disadvantages of Neural Networks

    • Overfitting: Some networks memorize their training data and fail to generalize to new data.
    • High computational power needed: NNs require high computational power and time, especially for large data sets. This can be a disadvantage for those who have limited resources. 
    • Large training data required: NNs require large data sets to perform with high accuracy. If the dataset is small or biased, it would affect the model’s performance. 
    • Limited interpretability: Developers often face the black box problem with neural networks, as they can’t see how a network arrives at its conclusions. This can be problematic when they need to know how the network reached a certain decision in order to fix it.
    • Noise sensitivity: Noise in data can lead to inaccurate predictions and classifications. 

    Finally, there is still plenty of room for research and development of neural networks. But, so far we seem to be on the right track.

