An introduction, for the neural net beginner, to all the concepts and abstractions you need in order to gain an intuitive understanding of how these crafty little neural nets learn to do anything at all. Along the way we’ll cover practical tips on preparing training data and strategies for solving various kinds of problems. We’ll discuss:

- Review of how a backpropagation neural net works
- Type and range of the input and output data
- Preparing training data
- Training strategies
- Adjustable parameters for learning rate and momentum
- Visualizing how a NN solves a problem – a geometric interpretation

Video link –> http://vimeo.com/technotes/neural-net-care-and-training

The comment form is disabled because of the amount of spam we’re receiving. To post a comment, send email to dave at millermattson.com.

Jitesh V. wrote:

I went through your tutorial. It was just awesome…

Brilliantly explained…

But instead of using tanh as the transfer function, if you use the sigmoid function for both the transfer function and the transfer function derivative, it will give stronger results.

That’s it… thanks for the video…

Link | January 28th, 2013 at 6:07 pm

Umbo wrote:

Greetings.

Your excellent explanations left me with only 2 questions.

—–

I understand that there’s virtually no limit to the input’s complexity, but you haven’t made examples of outputs with more than 2 states.

[Question 1]

Are neural nets limited to a 2-state answer? As in a Yes/No, True/False, Do/Don’t?

Is this because outputs are best taken as “ranges” of values (i.e., anything > 0 is true; anything < 0 is false)?

—–

Each neuron provides its own contribution to the final result. Given a number of neurons and the number of their connections with adjacent neurons, the resulting net can only form so many possible patterns.

Am I correct if I say that to efficiently learn from a *big* dataset I must provide an adequately *big* number of neurons?

As your example with pictures of cookies explained, an elaborate input should be reduced in complexity (where applicable) before being fed to a neural net.

But still, a simplified input may present itself in many variants, thus requiring the mapping of a great deal of patterns to the same solution.

[Question 2]

Is there a formula that can estimate the number of neurons to employ?

An example might clarify.

Consider the [already simplified] picture of a cookie. Downsampled to 10×10 pixels and made grayscale. So we have 100 pixels, each able to hold any of 256 different states. Thus the number of possible combos is 256 to the power of 100, or roughly 6.7e+240 unique inputs for a perfect training.

Of course that's unthinkable.

Although only a tiny fraction of that will fall in the category of "it's a good cookie" (let's say, 100 million pictures?), we still have that many possible inputs.

How many neurons would you employ in such a case?

Thank you very much.

Link | April 2nd, 2013 at 3:22 pm

Dave wrote:

Hi Umbo, those are excellent questions. You can use as many output neurons as you need, and each one can be treated as binary or analog. For example, if the input is a satellite image and the output is a location on the earth’s surface, you could use two analog output neurons representing latitude and longitude. Or if the output is supposed to be a color, you could use three analog output neurons representing hue, saturation, and intensity (or red, green, and blue). (When treating an output neuron as analog, just keep the output values in the range of the linear portion of the activation function.)
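As a rough sketch of the analog case, here is one way latitude and longitude could be scaled into a band around zero before being used as training targets, and scaled back when reading the net's outputs. It assumes a tanh-style activation centered on zero. The function names and the 0.8 band limit are illustrative assumptions chosen only to stay away from the saturated ends near ±1; they are not taken from the video.

```python
def scale_to_band(value, lo, hi, band=0.8):
    """Map value linearly from [lo, hi] onto [-band, +band].

    band=0.8 is an assumed limit, not a recommended constant.
    """
    return (2.0 * (value - lo) / (hi - lo) - 1.0) * band


def unscale_from_band(output, lo, hi, band=0.8):
    """Inverse of scale_to_band: map a net output back to real-world units."""
    return (output / band + 1.0) / 2.0 * (hi - lo) + lo


# Hypothetical example location, in degrees
lat, lon = 47.6, -122.3

# Target values for two analog output neurons
targets = [
    scale_to_band(lat, -90.0, 90.0),
    scale_to_band(lon, -180.0, 180.0),
]

# After training, the net's two outputs would be converted back the same way
decoded = [
    unscale_from_band(targets[0], -90.0, 90.0),
    unscale_from_band(targets[1], -180.0, 180.0),
]
```

The same pair of helpers would work for color channels or any other analog quantity with a known minimum and maximum.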

For a pattern classification problem where you’re trying to identify one of N possible input patterns, you could have N output neurons, one for each possible pattern. During training, you treat the outputs as binary: you try to get the net to output a high value only on the output that represents the best match. After training, you can treat the outputs as binary or as analog. In the analog case, the output with the greatest value identifies the nearest match, and its magnitude indicates the confidence of the match.
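The one-output-per-pattern idea might be sketched like this. The helper names are hypothetical, and the default high/low target values of +1 and -1 assume a tanh-style output range; a net with a 0-to-1 sigmoid output would use different values.

```python
def one_hot_targets(class_index, num_classes, high=1.0, low=-1.0):
    """Training targets: 'high' on the matching output, 'low' on all others."""
    return [high if i == class_index else low for i in range(num_classes)]


def best_match(outputs):
    """Read the outputs as analog: return (winning index, its output value)."""
    index = max(range(len(outputs)), key=lambda i: outputs[i])
    return index, outputs[index]


# Training: pattern number 2 out of 4 possible patterns
targets = one_hot_targets(2, 4)

# After training: made-up output values from a hypothetical trained net
winner, confidence = best_match([-0.9, 0.2, 0.7, -0.3])
```

Here `winner` is the index of the nearest match and `confidence` is the raw value of that output, which can be compared against a threshold if weak matches should be rejected.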

A neural net is a curve-fitting machine. You’re trying to train the net to map all the possible input states to all the possible output states, with some degree of precision that you have determined you need. The number of internal neurons determines how precise and how complex the curve fit can be. For an analogy, think of a bunch of points plotted on an XY grid, and you’re trying to devise a mathematical curve that passes through the points. A linear fit (y = mx + b) makes a pretty coarse fit, a quadratic curve (y = ax^2 + bx + c) can make a closer fit, and a higher order spline can fit the data even more closely. It’s not about the size of the data set you’re fitting, but more about how complex the shape of the curve is and how precisely you’re trying to fit it. In the same way that you need higher order equations to fit finer details, you need more neurons in the net to make a more precise mapping of input states to outputs. The exact number depends on how complex and precise the curve fit needs to be, and how non-linear the mapping is. Most programmers approach this by trial and error, because any more rigorous analysis requires some heavy-duty statistics and information theory.
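To make the curve-fitting analogy concrete, here is a small sketch that fits polynomials of increasing order to the same set of points and compares the leftover error. The sample data (a sine curve sampled at 21 points) and the function names are assumptions picked for illustration, and a plain polynomial stands in for the higher order spline mentioned above.

```python
import math


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c so that y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the normal equations
    m = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rms_error(xs, ys, coeffs):
    """Root-mean-square distance between the fitted curve and the points."""
    sq = [
        (sum(c * x ** k for k, c in enumerate(coeffs)) - y) ** 2
        for x, y in zip(xs, ys)
    ]
    return math.sqrt(sum(sq) / len(sq))


# Assumed sample data: 21 points on a sine curve over [0, 2]
xs = [i / 10.0 for i in range(21)]
ys = [math.sin(2.0 * x) for x in xs]

# Same data set each time; only the order of the curve changes
errors = {d: rms_error(xs, ys, polyfit(xs, ys, d)) for d in (1, 2, 5)}
for degree, err in errors.items():
    print(f"order {degree}: RMS error {err:.4f}")
```

The number of points never changes between runs; what shrinks the error is the extra flexibility of the higher order curve, which is the role the internal neurons play in a net.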

One approach is to start with as many input and output neurons as needed to represent the input and output information, and see how training goes. You’ll quickly develop an intuition for it. If the training results in poor precision on the outputs, add a hidden layer of neurons and see if that helps the precision. There is, however, a point of diminishing returns, where too many internal neurons can cause other problems, such as slower training and a net that memorizes the training samples instead of generalizing. A few experiments will help determine upper and lower bounds for the optimum number of hidden neurons.
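That kind of experiment could look something like the sketch below, which trains the same tiny net several times with a different number of hidden neurons and reports the final error for each. Everything specific here is an assumption made for illustration: the XOR training set, the tanh activation, the learning rate, the epoch count, the random seed, and the `train_xor` name. It also leaves out momentum to stay short.

```python
import math
import random

# Assumed toy problem: XOR with targets of -1 and +1
XOR = [([0.0, 0.0], -1.0), ([0.0, 1.0], 1.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], -1.0)]


def train_xor(hidden, epochs=2000, eta=0.1, seed=1):
    """Train a 2-input, 1-output tanh net with one hidden layer using
    plain online backprop, and return the final RMS error."""
    rng = random.Random(seed)
    # Each hidden neuron: two input weights plus a bias weight
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # Output neuron: one weight per hidden neuron plus a bias weight (last)
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        h = [math.tanh(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
        out = math.tanh(sum(w * v for w, v in zip(w_o, h)) + w_o[-1])
        return h, out

    for _ in range(epochs):
        for x, target in XOR:
            h, out = forward(x)
            # Output gradient: error times the tanh derivative (1 - out^2)
            d_out = (target - out) * (1.0 - out * out)
            # Hidden gradients, computed before the output weights change
            d_h = [d_out * w_o[j] * (1.0 - h[j] * h[j]) for j in range(hidden)]
            for j in range(hidden):
                w_o[j] += eta * d_out * h[j]
                w_h[j][0] += eta * d_h[j] * x[0]
                w_h[j][1] += eta * d_h[j] * x[1]
                w_h[j][2] += eta * d_h[j]
            w_o[-1] += eta * d_out

    sq = [(target - forward(x)[1]) ** 2 for x, target in XOR]
    return math.sqrt(sum(sq) / len(sq))


# Try several hidden-layer sizes and compare the final errors
results = {n: train_xor(n) for n in (1, 2, 4, 8)}
for n, err in results.items():
    print(f"{n} hidden neurons: RMS error {err:.3f}")
```

With a single hidden neuron this net cannot represent XOR no matter how long it trains, so its error stays high. The sizes at which the error stops improving give a rough bracket for the useful number of hidden neurons on a given problem.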

Link | April 2nd, 2013 at 5:56 pm