Step 9. Character recognition neural network

This is the most ambitious AI program of the entire set: a neural network that does character recognition from handwriting.

Note the video mentions that the theory behind some things is explained in the "course notes". These notes (part of an M.Sc. course) are not provided here. You will have to learn AI theory separately. This is just a series of AI programming exercises.

Video

World

Character recogn...
Neural network to do character recognition from an image database and from handwriting in real time.

Notes

  • The network learns from a database of images paired with the correct answer of what character is written in the image.
  • After a while, it can recognise new images from the database, images it has never seen before, with 95 percent accuracy.
  • Then it tries to use that knowledge to recognise new handwriting in real time. You can write characters in the browser and the network will recognise them with an accuracy of maybe 60 percent.
  • This is despite the fact that the network knows literally nothing at the start of the run. Neural networks are amazing!


Credits


Running the program

There are three sections:
  1. Doodle: Draw your own image. You can draw a "doodle" of a digit in this area and the network will try to classify it.
  2. Training: This is where training takes place. The network shows the exemplars it is training on as they flash by. It also runs an ongoing test of how close it is to 100 percent accurate classification. (It does not display the test exemplars.)
  3. Demo: A random test demo. It picks a random image and tests whether the network can classify it. These images are not from the training set; they come from a separate test set, which the network has never been trained on. So they are new, previously unseen images for the network.


Notes on the code

  • The images are tiny: 28 x 28 pixels, greyscale integer values 0 to 255.
  • We convert these pixels to (28 squared = 784) neural network input nodes, taking real number values 0 to 1.
  • There are 10 output nodes, one for each character 0 .. 9. The node with the highest value is our "guess" as to which of the 10 characters this is. (A sketch of these conversions is below, after this list.)

  • Matrix.randomize() in matrix.js is edited so it calls a function randomWeight() that we define in the World.
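
For illustration, here is a minimal sketch of the conversions described above and of a possible randomWeight(). The function names pixelsToInputs and indexOfMax are just illustrative, not necessarily the names used in the World code, and the -1 to 1 weight range is one common choice rather than the only option.

      // Convert one 28 x 28 greyscale image (784 bytes, 0..255)
      // into 784 input node values in the range 0 to 1.
      function pixelsToInputs ( pixels )
      {
          let inputs = [];
          for ( let i = 0; i < 784; i++ )
              inputs[i] = pixels[i] / 255;          // scale 0..255 down to 0..1
          return inputs;
      }

      // The 10 output nodes give a score for each character 0..9.
      // Our "guess" is the index of the highest-scoring node.
      function indexOfMax ( outputs )
      {
          let best = 0;
          for ( let i = 1; i < outputs.length; i++ )
              if ( outputs[i] > outputs[best] ) best = i;
          return best;
      }

      // Called by Matrix.randomize(). One common choice is small random
      // weights centred on zero, here a random real number from -1 to 1.
      function randomWeight()
      {
          return ( Math.random() * 2 ) - 1;
      }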


Fetching the exemplars

Defining the exemplars and fetching them into the JavaScript program is one of our big issues.
  • First, JS on a website cannot read local files on the client. (Thankfully!) So the data must be on the Ancient Brain server.

  • Binary file on Ancient Brain:
    • Ancient Brain allows the upload of lots of file formats, such as JSON.
    • The MNIST data is in its own binary format, explained here. It does not use any file extension.
    • For this port, I made an exception to allow the MNIST data on Ancient Brain without any file extension.
    • Then I realised that some of the file extensions allowed for 3D models could be re-used to upload any binary data. An example is the .bin extension used for 3D models. You can rename MNIST or other binary data to .bin and it should upload.
    • Alternatively, convert it to JSON.

  • Fetching the data from JS
    • The next issue is reading this binary file from JS.
    • Daniel Shiffman wrote a file mnist.js to do this.
    • It uses fetch to get the files from a local server.
    • This has been edited to point to the Ancient Brain files. (A sketch of this kind of fetch and decode is below.)
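
To make this concrete, here is a minimal sketch (not the actual mnist.js code) of fetching one of the MNIST image files and decoding its binary header. The URL is whatever the file's address on Ancient Brain is; the real mnist.js does more than this.

      // Illustrative sketch only. An MNIST image file starts with a 16-byte
      // big-endian header (magic number, image count, rows, cols),
      // followed by one byte per pixel.
      async function loadMNISTImages ( url )
      {
          const response = await fetch ( url );            // get the raw file
          const buffer   = await response.arrayBuffer();   // read it as binary
          const view     = new DataView ( buffer );

          const magic     = view.getUint32 ( 0 );          // 2051 for image files
          const numImages = view.getUint32 ( 4 );
          const rows      = view.getUint32 ( 8 );          // 28
          const cols      = view.getUint32 ( 12 );         // 28
          const pixels    = new Uint8Array ( buffer, 16 ); // greyscale bytes 0..255

          return { numImages, rows, cols, pixels };
      }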


Blurring the doodle

To try to improve doodle recognition, we made a change to make the doodle more like the MNIST images.
  • In the original Coding Train code, the doodle produces sharp images with no blurry edges. This leads to neural network input nodes which are all either 0 or 1, with no values in between.
  • But in the MNIST data, pixels are shades of grey and the neural network inputs end up with real numbers ranging from 0 to 1.
  • Perhaps we need to make the doodles more like the MNIST images.
  • The change we made is to "blur" the edges of the doodle using filter(BLUR).
  • To see how this changes the input nodes, we added a debug function so you can see in the console the exact input node values for the doodle and for a demo image. Type this in the console:
     showInputs(demo_inputs);
     showInputs(doodle_inputs);
    
    You will see that the input nodes are now very similar.
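
A debug function like this can be very simple. The sketch below is just an illustration (not necessarily the exact code in the World): it prints the 784 input values as a 28 x 28 grid.

      // Print the input node values as a 28 x 28 grid in the console,
      // one decimal place each, so two sets of inputs can be compared by eye.
      function showInputs ( inputs )
      {
          let s = "";
          for ( let i = 0; i < inputs.length; i++ )
          {
              if ( i % 28 == 0 ) s = s + "\n";     // new row every 28 values
              s = s + " " + inputs[i].toFixed(1);  // e.g. 0.0, 0.4, 1.0
          }
          console.log ( s );
      }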


Results

  • Test set: 95 percent accuracy
    • The neural network is incredibly accurate in a very short time. At least, over the test set.
    • You can see this illustrated by trying lots of samples in the "Demo" section.
    • After a very short time, it gets over 90 percent accuracy in classifying new images from the test set. After a long time, it gets to about 95 percent.

  • Doodle: 60 percent accuracy
    • The doodle recognition is harder, though.
    • I recommend that you wait a while to try a doodle. Wait until the program has got up to at least 90 percent on the test data.
    • After a long time, it gets to about 60 percent accuracy on the doodles.
    • Considering that random guessing would score 10 percent, this is a lot better than random, but nowhere near the accuracy for the test set.
    • Still, consider what it is doing: recognising, at a level much better than chance, handwriting and even fairly random doodling, done at run time by a person it has never met before. Not bad.

Exercise

Clone and Edit the World.
  1. First, look at the data set. The data is loaded into the "mnist" object. View the console to explore this object. It is a regular array-like structure.
  2. Examine the code to see how one entry of the array (an array of pixels) is transformed into an image to display at different sizes on the canvas. (See the first sketch after this list.)

  3. Turn learning off:
    • It is hard to see, because it happens so fast, but the network starts off random, making random predictions; after only a few hundred exemplars it gets pretty good.
    • To see that the network starts random, change "do_training" to false or TRAINPERSTEP to zero. That is, no training. The network stays random and the results are random. (Meaning what?)
    • Or you can let it learn but slow it down with TRAINPERSTEP = 1 and TESTPERSTEP = 50. Then it learns more slowly, and you can see it improve. (A sketch of how these constants drive the main loop is after this list.)

  4. See what happens when you:
    • Reduce hidden nodes to 5.
    • Reduce hidden nodes to 1.
    • Increase hidden nodes to 200.
    • Why?

  5. See what happens when you:
    • Change randomWeight to return zero.
    • Change randomWeight to return constant.
    • Why?
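
For exercise 2, the general idea is along these lines. This is a minimal p5 sketch, assuming one entry is an array of 784 greyscale bytes; the World's actual drawing code may differ.

      // Turn one 784-byte entry into a p5 image and draw it at any size.
      function drawEntry ( pixels, x, y, size )
      {
          let img = createImage ( 28, 28 );
          img.loadPixels();
          for ( let i = 0; i < 784; i++ )
          {
              let val = pixels[i];                 // greyscale byte 0..255
              img.pixels[ i*4 + 0 ] = val;         // red
              img.pixels[ i*4 + 1 ] = val;         // green
              img.pixels[ i*4 + 2 ] = val;         // blue
              img.pixels[ i*4 + 3 ] = 255;         // alpha (opaque)
          }
          img.updatePixels();
          image ( img, x, y, size, size );         // draw scaled to "size" pixels
      }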
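
For exercise 3, the constants control how much work is done on each step of the run. The sketch below shows the kind of loop involved; do_training, TRAINPERSTEP and TESTPERSTEP are the World's constants, but the surrounding code here is simplified and the helper names are illustrative.

      // Each step: train on TRAINPERSTEP exemplars (if training is on),
      // then test on TESTPERSTEP exemplars to update the running accuracy score.
      function nextStep()
      {
          if ( do_training )
              for ( let i = 0; i < TRAINPERSTEP; i++ )
                  trainOneExemplar();              // pick a training image, adjust weights

          for ( let i = 0; i < TESTPERSTEP; i++ )
              testOneExemplar();                   // pick a test image, update accuracy stats
      }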


Can you improve doodle recognition?

  • After a long time, the doodle recognition gets to about 60 percent accuracy.
  • Your challenge is to improve this.

  • What could we do?
    1. Change the constants at the top of the code.
    2. Implement a convolutional neural network (CNN).
    3. Research MNIST more.
  • Best of luck!


Save and restore

  • Your network does not have to learn from scratch on each run.
  • You can learn for a while and then use AB save and restore to save the weights.
  • A new run can then load up the weights and start recognising doodles immediately.
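
The exact calls depend on the AB API and on the neural network class, so treat the following as a rough sketch only: the AB.saveData / AB.restoreData names and the rebuildNetworkFrom helper are assumptions to check against the AB documentation and the network code, not confirmed method names.

      // Rough sketch of the save/restore idea.
      function saveWeights()                       // call after training for a while
      {
          AB.saveData ( nn );                      // assumed AB call: save the network object
      }

      function restoreWeights()                    // call at the start of a new run
      {
          AB.restoreData ( function ( data )       // assumed AB call: load saved data
          {
              nn = rebuildNetworkFrom ( data );    // hypothetical helper: rebuild a working
          });                                      // network object from the saved weights
      }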

