An introduction to neural networks – Kevin Gurney & University of Sheffield

This book grew out of a set of course notes for a neural networks module given as part of a Masters degree in “Intelligent Systems”. The people on this course came from a wide variety of intellectual backgrounds (from philosophy, through psychology to computer science and engineering) and I knew that I could not count on their being able to come to grips with the largely technical and mathematical approach which is often used (and in some ways easier to do). As a result I was forced to look carefully at the basic conceptual principles at work in the subject and try to recast these using ordinary language, drawing on the use of physical metaphors or analogies, and pictorial or graphical representations. I was pleasantly surprised to find that, as a result of this process, my own understanding was considerably deepened; I had now to unravel, as it were, condensed formal descriptions and say exactly how these were related to the “physical” world of artificial neurons, signals, computational processes, etc. However, I was acutely aware that, while a litany of equations does not constitute a full description of fundamental principles, without some mathematics, a purely descriptive account runs the risk of dealing only with approximations and cannot be sharpened up to give any formulaic prescriptions. Therefore, I introduced what I believed was just sufficient mathematics to bring the basic ideas into sharp focus.

To allay any residual fears that the reader might have about this, it is useful to distinguish two contexts in which the word “maths” might be used. The first refers to the use of symbols to stand for quantities and is, in this sense, merely a shorthand. For example, suppose we were to calculate the difference between a target neural output and its actual output and then multiply this difference by a constant learning rate (it is not important that the reader knows what these terms mean just now). If t stands for the target, y the actual output, and the learning rate is denoted by α (Greek “alpha”), then the output difference is just (t−y) and the verbose description of the calculation may be reduced to α(t−y). In this example the symbols refer to numbers but it is quite possible they may refer to other mathematical quantities or objects. The two instances of this used here are vectors and function gradients. However, both these ideas are described at some length in the main body of the text and assume no prior knowledge in this respect. In each case, only enough is given for the purpose in hand; other related, technical material may have been useful but is not considered essential and it is not one of the aims of this book to double as a mathematics primer.
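The shorthand described above can be made concrete with a few lines of code. This is only an illustrative sketch; the particular numbers for t, y and alpha are invented here, not taken from the book:

```python
# Computing alpha * (t - y): the difference between a target output t and
# an actual output y, scaled by a learning rate alpha.
# The values below are illustrative assumptions, not from the text.
t = 1.0        # target output
y = 0.4        # actual output
alpha = 0.25   # learning rate (the Greek letter alpha in the text)

delta = alpha * (t - y)   # the quantity written symbolically as alpha(t - y)
print(delta)              # 0.15
```

The symbolic form α(t−y) and this three-line calculation say exactly the same thing; the symbols are just a compressed description of the arithmetic.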

The other way in which we commonly understand the word “maths” goes one step further and deals with the rules by which the symbols are manipulated. The only rules used in this book are those of simple arithmetic (in the above example we have a subtraction and a multiplication). Further, any manipulations (and there aren’t many of them) will be performed step by step. Much of the traditional “fear of maths” stems, I believe, from the apparent difficulty in inventing the right manipulations to go from one stage to another; the reader will not, in this book, be called on to do this for him- or herself. One of the spin-offs from having become familiar with a certain amount of mathematical formalism is that it enables contact to be made with the rest of the neural network literature. Thus, in the above example, the use of the Greek letter α may seem gratuitous (why not use a, the reader asks) but it turns out that learning rates are often denoted by lower case Greek letters and α is not an uncommon choice. To help in this respect, Greek symbols will always be accompanied by their name on first use.

In deciding how to present the material I have started from the bottom up by describing the properties of artificial neurons (Ch. 2) which are motivated by looking at the nature of their real counterparts. This emphasis on the biology is intrinsically useful from a computational neuroscience perspective and helps people from all disciplines appreciate exactly how “neural” (or not) are the networks they intend to use. Chapter 3 moves to networks and introduces the geometric perspective on network function offered by the notion of linear separability in pattern space. There are other viewpoints that might have been deemed primary (function approximation is a favourite contender) but linear separability relates directly to the function of single threshold logic units (TLUs) and enables a discussion of one of the simplest learning rules (the perceptron rule) in Chapter 4. The geometric approach also provides a natural vehicle for the introduction of vectors. The inadequacies of the perceptron rule lead to a discussion of gradient descent and the delta rule (Ch. 5) culminating in a description of backpropagation (Ch. 6). This introduces multilayer nets in full and is the natural point at which to discuss networks as function approximators, feature detection and generalization.
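To give a flavour of the threshold logic units mentioned above, here is a minimal sketch (the code and the particular weights are illustrative assumptions, not the book’s own material). A TLU forms a weighted sum of its inputs and outputs 1 when that sum reaches a threshold; with suitable weights it computes linearly separable functions such as logical AND:

```python
# A minimal sketch of a threshold logic unit (TLU), illustrative only.
# The unit fires (outputs 1) when the weighted sum of its inputs
# reaches the threshold, and outputs 0 otherwise.
def tlu(inputs, weights, threshold):
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# With weights (1, 1) and threshold 1.5 the unit implements logical AND,
# a linearly separable function of the kind discussed in Chapter 3:
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, tlu((x1, x2), weights=(1.0, 1.0), threshold=1.5))
```

Geometrically, the weights and threshold define a line in the two-dimensional pattern space; the unit outputs 1 for inputs on one side of that line and 0 for the other, which is exactly what “linear separability” refers to.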

This completes a large section on feedforward nets. Chapter 7 looks at Hopfield nets and introduces the idea of state-space attractors for associative memory and its accompanying energy metaphor. Chapter 8 is the first of two on self-organization and deals with simple competitive nets, Kohonen self-organizing feature maps, linear vector quantization and principal component analysis. Chapter 9 continues the theme of self-organization with a discussion of adaptive resonance theory (ART). This is a somewhat neglected topic (especially in more introductory texts) because it is often thought to contain rather difficult material. However, a novel perspective on ART which makes use of a hierarchy of analysis is aimed at helping the reader in understanding this worthwhile area. Chapter 10 comes full circle and looks again at alternatives to the artificial neurons introduced in Chapter 2. It also briefly reviews some other feedforward network types and training algorithms so that the reader does not come away with the impression that backpropagation has a monopoly here. The final chapter tries to make sense of the seemingly disparate collection of objects that populate the neural network universe by introducing a series of taxonomies for network architectures, neuron types and algorithms. It also places the study of nets in the general context of that of artificial intelligence and closes with a brief history of its research. The usual provisos about the range of material covered and introductory texts apply; it is neither possible nor desirable to be exhaustive in a work of this nature. However, most of the major network types have been dealt with and, while there are a plethora of training algorithms that might have been included (but weren’t) I believe that an understanding of those presented here should give the reader a firm foundation for understanding others they may encounter elsewhere.
