Artificial Intelligence (AI) is an incredibly exciting area of research and development which spans mathematics, statistics, computer science, engineering, philosophy, linguistics, information theory, biology, psychology, neuroscience and more. It is also a fairly nascent area of science: what has been achieved so far are just the first steps on the journey towards general artificial intelligence.

The success of deep learning in image recognition, natural language processing and games, however, has inspired the world to take note of AI. It has fuelled a wave of media hype and attention that has perhaps led to misinterpretations of what AI is designed for and what it is not.

AI as we know it today is not capable of thought, has no consciousness and certainly does not have any sort of intelligence that can surpass our own. The “AI” that we experience on our mobile phones and the Internet, or read about in the news, is rather a collection of computational and statistical techniques known as deep learning (or machine learning, depending on the scope).

So what is deep learning? For the best explanation of deep learning and its limits, I recommend this excellent two-part post by Francois Chollet, the creator of Keras. In short, deep learning is a sequence of geometric transformations (linear and non-linear) that, when applied to data, may be able to statistically model the relationships contained in that data. These geometric transformations are organised in a layered network, known as a neural network.
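To make the idea of stacked geometric transformations a little more concrete, here is a toy NumPy sketch of my own (not taken from Chollet's post): two layers, each applying a linear transformation, with a non-linear transformation in between. The layer sizes, the random weights and the choice of ReLU are arbitrary assumptions made purely for illustration.

```python
import numpy as np

# Toy illustration (assumed sizes and random, untrained weights):
# a "network" is just a chain of geometric transformations applied to the input.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)   # first linear transformation
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # second linear transformation

def relu(z):
    return np.maximum(0, z)                      # simple non-linear transformation

def network(x):
    h = relu(x @ W1 + b1)                        # transform the input into a new space
    return h @ W2 + b2                           # transform again to produce the output

print(network(np.array([1.0, 0.5, -0.2, 0.3])))
```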

This series is about these so-called artificial neural networks. I will attempt to uncover what they are, how they work, where they are applied and why they are called “neural networks”.

The MCP Neuron

Before there were any artificial neural networks, or even the perceptron (more on both in upcoming posts), there was the MCP neuron. First proposed in 1943 by the neurophysiologist Warren S. McCulloch and the logician Walter Pitts, the McCulloch-Pitts (MCP) neuron is a simple mathematical model of a biological neuron.

To understand how this model works, let’s begin with a very simplified (and certainly non-expert) explanation of a biological neuron. Neurons are electrically excitable nerve cells in the brain which process and transmit information through electrical and chemical signals. They are interconnected, forming a neural network in the brain, and the connections between neurons are known as synapses. A single neuron in this simplified explanation consists of three parts:

  • Soma: the main part of the neuron which processes signals
  • Dendrites: branch-like shapes which receive signals from other neurons
  • Axon: a single nerve which sends signals to other neurons

The idea behind the MCP neuron is to abstract the biological neuron into a simple mathematical model. The neuron receives incoming signals as 1s and 0s, takes a weighted sum of these signals, and outputs a 1 if the weighted sum is at least some threshold value and a 0 otherwise.

Formally this mathematical model can be specified as follows:

  • Let [x1, … , xm] be a vector of input signals where each xi has a value of 0 or 1
  • Let [w1, … , wm] be a vector of corresponding weights where each wi has a value of 1, -1 or 0.
    • Input signals with a weight of 1 are called excitatory since they contribute towards a positive output signal in the sum
    • Input signals with a weight of -1 are called inhibitory since they repress a positive output signal in the sum
    • Input signals with a weight of 0 do not contribute at all to the neuron
  • Then for some threshold value t, we can define a function which outputs 1 if the weighted sum of the input signals is at least t, and 0 otherwise (see the Python sketch just below this list)
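Putting the definition above into code, a minimal Python sketch of the MCP neuron might look as follows (the function name mcp_neuron and the use of NumPy are my own choices, not part of the original formulation):

```python
import numpy as np

def mcp_neuron(x, w, t):
    """Return 1 if the weighted sum of the binary inputs x is at least the threshold t, else 0."""
    return int(np.dot(x, w) >= t)

# Two excitatory inputs (weights of 1) with a threshold of 2:
print(mcp_neuron([1, 1], [1, 1], t=2))  # 1, since 1 + 1 >= 2
print(mcp_neuron([1, 0], [1, 1], t=2))  # 0, since 1 + 0 < 2
```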

The MCP Neuron is illustrated below. Note that the xi input signals are analogous to the dendrites, the activation function is analogous to the soma and the output is analogous to the axon.

McCulloch and Pitts’s original experiment was to see if they could use this model to construct different logic gates simply by specifying what the weights and threshold should be. It turns out the MCP Neuron can be used to model the AND, OR and NOT logic gates, as well as compositions of these three gates.

The Jupyter notebook accompanying this post contains a Python implementation that shows how the MCP Neuron can model these logic gates, as well as a more detailed explanation of the mathematical model.
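As a rough sketch of what such an implementation might look like, the three gates can be wired up by hand. The weight and threshold assignments below are one valid choice, not necessarily the ones used in the notebook.

```python
import numpy as np

# Repeated from the sketch above so this snippet runs on its own.
def mcp_neuron(x, w, t):
    return int(np.dot(x, w) >= t)

def AND(a, b):
    return mcp_neuron([a, b], [1, 1], t=2)   # fires only when both inputs are 1

def OR(a, b):
    return mcp_neuron([a, b], [1, 1], t=1)   # fires when at least one input is 1

def NOT(a):
    return mcp_neuron([a], [-1], t=0)        # inhibitory weight: fires only when the input is 0

for a in (0, 1):
    for b in (0, 1):
        print(f"a={a} b={b}  AND={AND(a, b)}  OR={OR(a, b)}")
print(f"NOT(0)={NOT(0)}  NOT(1)={NOT(1)}")
```

Note that the weights and thresholds here are fixed by hand; nothing is learned from data, which is exactly the limitation discussed in the conclusion below.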

Conclusion

The MCP Neuron seems almost too simple to represent AI of any kind, yet in a sense it both is and isn’t. Formal logic is a fundamental component of intelligence: for any machine to have AI, it surely should be able to comprehend logic gates. The idea is that logic gates can be strung together to form logic circuits, capable of executing any kind of instruction. This is indeed what underpins modern computational processors. However, we know that CPUs aren’t really “intelligent” – they are just able to process any instruction given to them at lightning speed.

What makes the MCP Neuron different is that it can reproduce logic gates using a biologically inspired algorithm. In the field of AI, this is a promising achievement, since it almost surely makes sense that any kind of AI should resemble the brain – which is, after all, the seat of human intelligence.

The problem with the MCP Neuron is that every logic gate it can model (and hence every logic circuit which a collection of neurons could model) has to be pre-programmed. This is evident in the Jupyter notebook accompanying this post, and it stands in stark contrast to the brain, which learns from experience. It would take another 14 years before Frank Rosenblatt’s landmark debut of the perceptron – the first learning algorithm of its kind. The perceptron (and hence artificial neural networks) is a direct extension of the MCP Neuron, making the MCP Neuron a cornerstone of AI and the very beginning of deep learning.

Guest Blogger: Jonathan Sinai


