For my APCS final project, I'm making an application that:

  • allows a user to draw digits on a drawing panel;
  • scales/translates each stroke (represented by a list of x-y coordinates) to 100x100;
  • produces an image from the scaled stroke;
  • produces a binary 2D array from that image (0 for white, else 1);
  • and passes that binary array to a neuron object for character recognition.
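
A minimal sketch of the scale/translate step, assuming each stroke is a list of java.awt.Point objects and that the aspect ratio should be preserved. The class name StrokeScaler and the method fit are hypothetical names chosen for illustration, not code from the project:

```java
import java.awt.Point;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: fits a stroke into a size x size box (e.g. 100x100).
class StrokeScaler {
    static List<Point> fit(List<Point> stroke, int size) {
        List<Point> out = new ArrayList<>();
        if (stroke.isEmpty())
            return out;

        // Bounding box of the raw stroke
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE, maxY = Integer.MIN_VALUE;
        for (Point p : stroke) {
            minX = Math.min(minX, p.x);
            minY = Math.min(minY, p.y);
            maxX = Math.max(maxX, p.x);
            maxY = Math.max(maxY, p.y);
        }

        // Guard against a zero-width or zero-height stroke (e.g. a straight "1")
        double w = Math.max(1, maxX - minX);
        double h = Math.max(1, maxY - minY);

        // One scale factor for both axes (assumption: keep the aspect ratio)
        double scale = (size - 1) / Math.max(w, h);

        for (Point p : stroke) {
            int x = (int) Math.round((p.x - minX) * scale); // translate, then scale
            int y = (int) Math.round((p.y - minY) * scale);
            out.add(new Point(x, y));
        }
        return out;
    }
}
```

With size = 100, every returned coordinate lies in the range 0 to 99, so the points can be used directly as pixel positions in a 100x100 image.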

The following class represents the neuron:

import java.awt.*;
import java.util.*;
import java.io.*;

public class Neuron
{
    private double[][] weights;
    public static double LEARNING_RATE = 0.01;

    /**
     *Initialize weights
     *Assign random double values to weights
     */
    public Neuron(int r, int c)
    {
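        weights = new double[r][c];

        //Fill the matrix first so the neuron is usable even if the file cannot be written
        for (int i = 0; i < weights.length; i++)
        {
            for (int j = 0; j < weights[i].length; j++)
                weights[i][j] = 2 * Math.random() - 1; //Random value in [-1, 1)
        }

        //Save the initial weights, one space-separated row per line
        try (PrintWriter printer = new PrintWriter("training.txt"))
        {
            for (int i = 0; i < weights.length; i++)
            {
                for (int j = 0; j < weights[i].length; j++)
                {
                    if (j > 0)
                        printer.print(" ");
                    printer.print(weights[i][j]);
                }
                printer.println();
            }
        }
        catch (FileNotFoundException e)
        {
            //Previously swallowed, which led to a NullPointerException on printer
            System.out.println("Error: could not write training.txt");
        }
    }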

    public Neuron(String fileName)
    {
        File data = new File(fileName);
        Scanner input = null;
        try
        {
            input = new Scanner(data);
        }
        catch (FileNotFoundException e)
        {
            System.out.println("Error: could not open " + fileName);
            System.exit(1);
        }

        int r = Drawing.DEF_HEIGHT, c = Drawing.DEF_WIDTH;
        weights = new double[r][c];

        int i = 0, j = 0;
        while (input.hasNextDouble() && i < r) //Stop once the matrix is full
        {
            weights[i][j] = input.nextDouble();
            j++;
            if (j > weights[i].length - 1)
            {
                i++;
                j = 0;
            }
        }
        input.close();

        for (double[] a : weights)
            System.out.println(Arrays.toString(a));

    }

    /**
     *1. Initialize a sum variable
     *2. Multiply each index of weights by each index of bin
     *3. Sum these values
     *4. Return the activated sum
     */
    public int feedforward(int[][] bin) //bin represents 2D array of binary values for a binary image
    {
        double sum = 0;
        for (int i = 0; i < weights.length; i++)
        {
            for (int j = 0; j < weights[i].length; j++)
                sum += weights[i][j] * bin[i][j];
        }
        return activate(sum);
    }

    /**
     *1. Generate a sigmoid (logistic) value from a sum
     *2. "Digitize" the sigmoid value
     *3. Return the digitized value, which corresponds to a number
     */
    public int activate(double n)
    {
        double sig = 1.0/(1+Math.exp(-1*n));
        int digitized = 0;

        if (sig < 0.1)
            digitized = 0;
        else if (sig >= 0.1 && sig < 0.2)
            digitized = 1;
        else if (sig >= 0.2 && sig < 0.3)
            digitized = 2;
        else if (sig >= 0.3 && sig < 0.4)
            digitized = 3;
        else if (sig >= 0.4 && sig < 0.5)
            digitized = 4;
        else if (sig >= 0.5 && sig < 0.6)
            digitized = 5;
        else if (sig >= 0.6 && sig < 0.7)
            digitized = 6;
        else if (sig >= 0.7 && sig < 0.8)
            digitized = 7;
        else if (sig >= 0.8 && sig < 0.9)
            digitized = 8;
        else if (sig >= 0.9)
            digitized = 9;

        System.out.println("Sigmoid value: " + sig + "\nDigitized value: " + digitized);
        return digitized;
    }

    /**
     * 1. Provide inputs and "known" answer
     * 2. Guess according to the inputs using feedforward(inputs)
     * 3. Compute the error
     * 4. Adjust all weights according to the error and learning rate
     */
    public void train(int[][] bin, int desired)
    {
        int guess = feedforward(bin);
        int error = desired-guess;

        for (int i = 0; i < weights.length; i++)
        {
            for (int j = 0; j < weights[i].length; j++)
                weights[i][j] += LEARNING_RATE * error * bin[i][j];
        }
    }

}

I use a different class to "train" the neuron. This other class, TrainingConsole.java, takes "training.txt" (which holds the randomly generated weights), feeds the neuron training examples (images converted to binary 2D arrays), and adjusts the weights based on the error, the learning rate, and the corresponding value in bin:

import java.awt.image.BufferedImage;
import java.io.*;
import java.util.Arrays;
import java.util.Scanner;

import javax.imageio.ImageIO;

public class TrainingConsole
{

    private File folder;
    private File data;

    public TrainingConsole(String dataFileName, String folderName)
    {
        data = new File(dataFileName);
        folder = new File(folderName);
    }

    public void changeFolder(String folderName)
    {
        folder = new File(folderName);
    }

    public void feedAll(int desired)
    {
        File[] files = folder.listFiles();
        if (files == null) //Folder is missing or is not a directory
        {
            System.out.println("Error: could not list " + folder);
            return;
        }
        System.out.println(Arrays.toString(files));
        for (File file : files)
        {
            //Skip anything that is not a PNG (subfolders, hidden files, etc.)
            //so that a blank placeholder image is never used for training
            if (!file.getName().toLowerCase().endsWith(".png"))
                continue;

            BufferedImage img = null;
            try
            {
                img = ImageIO.read(file);
            }
            catch(IOException e)
            {
                System.out.println("Error: could not read " + file);
            }
            if (img == null)
                continue;

            //Rows are y and columns are x, matching the weights matrix
            int[][] bin = new int[Drawing.DEF_HEIGHT][Drawing.DEF_WIDTH];
            int h = Math.min(img.getHeight(), Drawing.DEF_HEIGHT);
            int w = Math.min(img.getWidth(), Drawing.DEF_WIDTH);
            for (int y = 0; y < h; y++)
            {
                for (int x = 0; x < w; x++)
                {
                    int rgb = img.getRGB(x,y);
                    if (rgb == -1) //Opaque white (0xFFFFFFFF)
                        bin[y][x] = 0;
                    else
                        bin[y][x] = 1;
                }
            }
            for (int[] a : bin)
                System.out.println(Arrays.toString(a));
            train(bin,desired);
        }
    }

     public void train(int[][] bin, int desired) {
         int guess = feedforward(bin);
         int error = desired - guess;

         Scanner input = null;
         try {
             input = new Scanner(data);
         } catch (FileNotFoundException e) {
             System.exit(1);
         }
         double[][] weights = new double[Drawing.DEF_HEIGHT][Drawing.DEF_WIDTH];
         int i = 0, j = 0;
         while (input.hasNext() && i < Drawing.DEF_HEIGHT) {
             weights[i][j] = input.nextDouble();
             j++;
             if (j > weights[i].length - 1) {
                 i++;
                 j = 0;
             }
         }
         input.close(); // release the file before it is overwritten below

         for (int k = 0; k < weights.length; k++) {
             for (int l = 0; l < weights[k].length; l++)
                 weights[k][l] += Neuron.LEARNING_RATE * error * bin[k][l];
         }

         // Write the updated weights back to the same file
         PrintWriter output = null;
         try {
             output = new PrintWriter(data);
         } catch (FileNotFoundException e) {
             System.out.println("Cannot find data");
             System.exit(1); // avoid a NullPointerException on output below
         }
         for (int m = 0; m < weights.length; m++) {
             for (int n = 0; n < weights[m].length - 1; n++)
                 output.print(weights[m][n] + " ");
             output.print(weights[m][weights[m].length - 1]);
             output.println();
         }
         output.close();
     }

    public int feedforward(int[][] bin)
    {
        double sum = 0;

        Scanner input = null;
        try
        {
            input = new Scanner(data);
        }
        catch(FileNotFoundException e)
        {
            System.out.println("Could not locate data");
            System.exit(1); //Cannot continue without the weights file
        }
        double[][] weights = new double[Drawing.DEF_HEIGHT][Drawing.DEF_WIDTH];
        int i = 0, j = 0;
        while (input.hasNextDouble() && i < Drawing.DEF_HEIGHT)
        {
            //System.out.println("( " + i + " , " + j + " )");
            weights[i][j] = input.nextDouble();
            j++;
            if (j > weights[i].length - 1)
            {
                i++;
                j = 0;
            }
        }
        input.close();

        for (int m = 0; m < weights.length; m++)
        {
            for (int n = 0; n < weights[m].length; n++)
                sum += weights[m][n] * bin[m][n];
        }
        return activate(sum);
    }

    public int activate(double n)
    {
        double sig = 1.0/(1+Math.exp(-1*n));
        int digitized = 0;

        if (sig < 0.1)
            digitized = 0;
        else if (sig >= 0.1 && sig < 0.2)
            digitized = 1;
        else if (sig >= 0.2 && sig < 0.3)
            digitized = 2;
        else if (sig >= 0.3 && sig < 0.4)
            digitized = 3;
        else if (sig >= 0.4 && sig < 0.5)
            digitized = 4;
        else if (sig >= 0.5 && sig < 0.6)
            digitized = 5;
        else if (sig >= 0.6 && sig < 0.7)
            digitized = 6;
        else if (sig >= 0.7 && sig < 0.8)
            digitized = 7;
        else if (sig >= 0.8 && sig < 0.9)
            digitized = 8;
        else if (sig >= 0.9)
            digitized = 9;

        return digitized;
    }

    public static void main(String[] args)
    {
        Scanner input = new Scanner(System.in);
        TrainingConsole trainer = new TrainingConsole("training.txt","Training_000");

        System.out.println("--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------");
        System.out.println("Training Console");
        System.out.println("--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------");

        for (int i = 0; i <= 9; i++) {
            //System.out.print("Folder with training data for desired = " + i + ", or enter \"skip\" to skip: ");
            //String folderName = input.nextLine().trim();
            String folderName = "Training_00" + i;
            //System.out.println(folderName);
            if (!folderName.toLowerCase().equals("skip"))
            {
                trainer.changeFolder(folderName);
//              System.out.print("Press enter to run: ");
//              String noReason = input.nextLine();
                trainer.feedAll(i);
            }
            System.out.println("----------------------------------------------------------------------------------------------------ava----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------");
        }
    }

}

For subsequent Neuron constructions, I pass "training.txt" as the source of the weights matrix. However, this is evidently not working: [image]

Please help! I am really new to neural nets and machine learning. At this point, I do not know what I am doing wrong: Do I need more training examples? Am I using a poor activation function? Any advice would be appreciated. Also, feel free to request additional code if needed.

  • Maybe I missed it, but I only see a single neuron in your network. That's going to perform really poorly regardless of the amount of training. You may want to read http://neuralnetworksanddeeplearning.com/chap1.html. It has an example that will be very similar to what you want to do. – Chill May 30 '16 at 18:39
  • also your usage of the sigmoid (in the activate function) is bad... that's not how you do multiclass – user2717954 May 30 '16 at 18:43
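
To illustrate the second comment, here is a small sketch that reuses the binning idea from activate in compact form. The class name SigmoidBins is hypothetical, and the expression (int)(sig * 10) is assumed to match the if/else chain above up to floating-point rounding at the bin edges. It shows that the predicted digit depends only on one weighted sum and grows monotonically with it:

```java
// Hypothetical demo class, not part of the original project.
class SigmoidBins {
    // Compact form of the question's activate(): split (0, 1) into ten equal bins
    static int activate(double n) {
        double sig = 1.0 / (1 + Math.exp(-n));
        return Math.min(9, (int) (sig * 10));
    }

    public static void main(String[] args) {
        // The digit is a step function of a single number, the weighted sum
        for (int sum = -3; sum <= 3; sum++)
            System.out.println("sum = " + sum + " -> digit " + activate(sum));
    }
}
```

A larger sum can only produce a larger digit, so the model treats "7" as lying between "6" and "8" on a single axis. A sum of exactly 0 gives digit 5, and the extreme digits 0 and 9 need sums far from zero, which has nothing to do with how the digits look.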

1 Answer

As pointed out in the comments, there are two main problems; I will describe them in a bit more detail.

  1. Your whole model is a single perceptron, meaning that you create a linear model from your input space (pixels) to classes (digits). This simply cannot work; it is not a neural network in the modern sense. A "modern" NN designed for image processing will consist of thousands of neurons, connected in layers with nonlinear activations in between, probably arranged in the form of convolutional kernels (as this is the state-of-the-art architecture for image recognition).

  2. You are supposed to solve a multiclass problem, yet you are actually doing ranking. To build an NN that classifies into K classes, you should have K output neurons, each producing a signal interpreted as a "probability" (not in a strict mathematical sense) of belonging to a particular class. To classify, you then take the arg max (the index of the neuron with the highest value).
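
A minimal sketch of both points, assuming a single sigmoid hidden layer, a softmax output layer with K units, and plain stochastic gradient descent on cross-entropy loss. The class TinyDigitNet, its method names, the weight initialization scale, and the layer sizes are all illustrative assumptions, not a tuned or state-of-the-art design (there are no convolutions here):

```java
import java.util.Random;

// Hypothetical sketch: one hidden layer, K softmax outputs, arg max prediction.
class TinyDigitNet {
    final int in, hidden, classes;
    final double[][] w1, w2; // w1: [hidden][in], w2: [classes][hidden]
    final double[] b1, b2;

    TinyDigitNet(int in, int hidden, int classes, long seed) {
        this.in = in; this.hidden = hidden; this.classes = classes;
        Random rng = new Random(seed);
        w1 = new double[hidden][in];      b1 = new double[hidden];
        w2 = new double[classes][hidden]; b2 = new double[classes];
        // Small random weights; the 0.1 scale is an arbitrary choice for this sketch
        for (double[] row : w1) for (int j = 0; j < in; j++) row[j] = 0.1 * rng.nextGaussian();
        for (double[] row : w2) for (int j = 0; j < hidden; j++) row[j] = 0.1 * rng.nextGaussian();
    }

    // Flatten a binary image bin[y][x] into one input vector
    static double[] flatten(int[][] bin) {
        double[] x = new double[bin.length * bin[0].length];
        for (int r = 0; r < bin.length; r++)
            for (int c = 0; c < bin[r].length; c++) x[r * bin[r].length + c] = bin[r][c];
        return x;
    }

    double[] hiddenLayer(double[] x) {
        double[] h = new double[hidden];
        for (int i = 0; i < hidden; i++) {
            double s = b1[i];
            for (int j = 0; j < in; j++) s += w1[i][j] * x[j];
            h[i] = 1.0 / (1.0 + Math.exp(-s)); // nonlinear activation between layers
        }
        return h;
    }

    double[] outputLayer(double[] h) {
        double[] z = new double[classes];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < classes; k++) {
            z[k] = b2[k];
            for (int i = 0; i < hidden; i++) z[k] += w2[k][i] * h[i];
            max = Math.max(max, z[k]);
        }
        double total = 0; // softmax, shifted by max for numerical stability
        for (int k = 0; k < classes; k++) { z[k] = Math.exp(z[k] - max); total += z[k]; }
        for (int k = 0; k < classes; k++) z[k] /= total;
        return z;
    }

    double[] probabilities(double[] x) { return outputLayer(hiddenLayer(x)); }

    // Arg max: index of the output neuron with the highest value
    int predict(double[] x) {
        double[] p = probabilities(x);
        int best = 0;
        for (int k = 1; k < classes; k++) if (p[k] > p[best]) best = k;
        return best;
    }

    // One gradient step on cross-entropy loss for a single labeled example
    void train(double[] x, int label, double lr) {
        double[] h = hiddenLayer(x);
        double[] p = outputLayer(h);
        double[] dz = new double[classes]; // gradient w.r.t. output pre-activations
        for (int k = 0; k < classes; k++) dz[k] = p[k] - (k == label ? 1.0 : 0.0);
        double[] dh = new double[hidden];  // backpropagate before changing w2
        for (int i = 0; i < hidden; i++) {
            double g = 0;
            for (int k = 0; k < classes; k++) g += w2[k][i] * dz[k];
            dh[i] = g * h[i] * (1 - h[i]); // sigmoid derivative
        }
        for (int k = 0; k < classes; k++) {
            b2[k] -= lr * dz[k];
            for (int i = 0; i < hidden; i++) w2[k][i] -= lr * dz[k] * h[i];
        }
        for (int i = 0; i < hidden; i++) {
            b1[i] -= lr * dh[i];
            for (int j = 0; j < in; j++) w1[i][j] -= lr * dh[i] * x[j];
        }
    }
}
```

For the 100x100 images in the question, this could be used as new TinyDigitNet(10000, 30, 10, 42L) with inputs produced by flatten(bin). The hidden size of 30 and the seed are arbitrary placeholders. Unlike the single binned sigmoid, each digit gets its own output, and the label is an index rather than a position on a scale.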

Once you fix these two important issues with the whole architecture, you should start getting reasonable results; the only missing parts are then tweaking hyperparameters and getting more training data.

lejlot