
Convolutional network, annealing and epochs #135

Open · CherryGoose opened this issue Jan 17, 2019 · 15 comments

@CherryGoose commented Jan 17, 2019

I'm trying to create a convolutional network. What am I doing wrong? It seems that there is no difference between training the net with a larger or smaller number of examples. Also, can you tell me what kind of training method is used for each type of network? I'm using your framework for research purposes, and references to the papers or algorithms you used would be great.

net.AddLayer(new InputLayer(UserData[0].GetLength(0), 1, 1));

for (int i = 0; i < NumberOfHiddenLayers; i++)
{
  int size;
  if (UserData[0].GetLength(0) < NumberOfHiddenLayers)
  {
    size = UserData[0].GetLength(0);
  }
  else
  {
    size = UserData[0].GetLength(0) / NumberOfHiddenLayers;
  }

  if (size < 2)
    size = 2;

  net.AddLayer(new ConvLayer((UserData[0].GetLength(0) - i * size), 1, 1));
  net.AddLayer(new ReluLayer());
}

net.AddLayer(new ConvLayer(2, 1, 1));
net.AddLayer(new SoftmaxLayer(2));
@cbovar (Owner) commented Jan 19, 2019

  1. It seems that size can be computed outside the for loop (see the sketch after this list).
  2. I understand your training accuracy doesn't get better when you provide more training data. Have you tried a simpler network? I'm not sure I understand the way you compute the kernel size of the convolution layers.
  3. If you have full source code that I could run, it would be easier for me to help you.
  4. The training algorithms (Sgd and Adam) are inspired by the original implementation of ConvNetJS. You could look at https://arxiv.org/pdf/1609.04747.pdf
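
For illustration, point 1 could look like this (a sketch only, reusing the variable names from your snippet):

// `size` does not depend on the loop variable `i`, so compute it once.
int inputLength = UserData[0].GetLength(0);
int size = inputLength < NumberOfHiddenLayers
    ? inputLength
    : inputLength / NumberOfHiddenLayers;
if (size < 2)
    size = 2;

for (int i = 0; i < NumberOfHiddenLayers; i++)
{
    net.AddLayer(new ConvLayer(inputLength - i * size, 1, 1));
    net.AddLayer(new ReluLayer());
}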

@CherryGoose (Author) commented Jan 19, 2019

I have an array of doubles that comes from processed features of subjects. Right now I'm trying to run this code, but the training accuracy is off.

SgdTrainer Tr = new SgdTrainer(net)
{
    LearningRate = 0.01,
    BatchSize = 500,
    L2Decay = 0.001,
    Momentum = 0.9
};
net.AddLayer(new InputLayer(28, 28, 1));
net.AddLayer(new ConvLayer(5, 5, 8) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
net.AddLayer(new ConvLayer(5, 5, 16) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(3, 3) { Stride = 3 });
net.AddLayer(new FullyConnLayer(10));
net.AddLayer(new SoftmaxLayer(10));

double[] d = new double[12 * 63];
for (int k = 0; k < 10; k++)
{
    int count = 0;
    for (int i = 0; i < 12; i++)
    {
        for (int j = 0; j < UserDATA[k].GetLength(1); j++)
        {
            d[count] = UserDATA[k][i, j];
            count++;
        }
    }
    var x = BuilderInstance.Volume.From(d, new Shape(12, 63, 1));
    double[] z = new double[10];
    for (int t = 0; t < z.Length; t++)
    {
        z[t] = 0.0;
    }
    z[k] = 1.0;
    var zx = BuilderInstance.Volume.From(z, new Shape(1, 1, 10, 1));
    for (int g = 0; g < Convert.ToInt32(NumberOfTrainingSteps.Text); g++)
    {
        Tr.Train(x, zx); // train the network, specifying that x is class k
    }
}

double[] ts = new double[12 * 63];
double[] testd = new double[12 * 63];

for (int k = 0; k < 10; k++)
{
    int count = 0;
    for (int i = 0; i < 12; i++)
    {
        for (int j = 0; j < UserDATA[k].GetLength(1); j++)
        {
            testd[count] = UserDATA[k][i, j];

            if (k == 0)
                ts[count] = UserDATA[k][i, j];
            count++;
        }
    }

    var x = BuilderInstance.Volume.From(testd, new Shape(12, 63, 1));

    var prob = net.Forward(x);
    TestCON.Text += "\r\n" + " " + k + "            " + prob.Get(k);
    TestCON.Text += "\r\n" + k + " cl 0 prob " + prob.Get(0);
}

It seems that NumberOfTrainingSteps does not give me any increase in accuracy, but that can be expected because I'm not feeding any new data to the network. The thing is, even if I do train it on other examples, nothing changes. Also, what is the BatchSize in the trainer responsible for? And, as I understand it, the input layer size should correspond to the amount of data points I feed to the network, i.e. 28x28x1 should take no more than 784 data points?

@cbovar (Owner) commented Jan 19, 2019

  1. You should present a different input every time you call the Train method. It seems you call Train NumberOfTrainingSteps times with the same data; this will make the network forget the previous data in the dataset (see the sketch after this list).
  2. BatchSize is used in the trainers to normalize the gradients. The BatchSize information is currently duplicated: in the trainer and in the 4th dimension of the input volume. I think it is possible to get rid of the one on the trainer, but I haven't done it yet (it's a relic of the original ConvNetJS implementation).
  3. You should feed the network, during both training and inference, with a volume of shape 28x28x1xBatchSize. With BatchSize = 1 it should take exactly 784 data points (it seems you feed less data than that).
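
A minimal sketch of points 1 and 3 combined (samples, labels, trainer and numberOfTrainingSteps are hypothetical names; samples[n] holds the 784 input values and labels[n] a 10-element one-hot target):

// Sketch: cycle through the whole dataset instead of repeating one example.
for (int step = 0; step < numberOfTrainingSteps; step++)
{
    for (int n = 0; n < samples.Length; n++)
    {
        var input = BuilderInstance.Volume.From(samples[n], new Shape(28, 28, 1, 1));
        var target = BuilderInstance.Volume.From(labels[n], new Shape(1, 1, 10, 1));
        trainer.Train(input, target); // a different example on every call
    }
}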

@CherryGoose (Author) commented

Can you tell me what LearningRate, L2Decay, and Momentum represent? Also, is it possible to use the same data samples to train the network? Do you have functions that mutate the weights (simulated annealing, freezing, evolutionary multidimensional optimisation), or functions that separate epochs in the training of the network? Also, if I use different trainers to train the network on the same data samples, will it change anything performance-wise?

@cbovar (Owner) commented Feb 2, 2019

The learning rate determines the size of the steps we take to reach a (local) minimum. Basically, the gradients are multiplied by the learning rate before being used to update the parameters being optimized (see here in the code).

L1Decay and L2Decay are supposed to be used for regularization. You've made me realize that I still haven't implemented them, so these parameters are currently useless. I will get rid of them in the meantime.
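
For reference, if it were implemented, L2 decay would typically pull each weight towards zero by adding a penalty proportional to the weight to its gradient; a sketch with hypothetical names, not the library's code:

// Sketch of what L2 decay would do: gradient is dLoss/dw from backprop,
// l2Decay and lr are the trainer's hyperparameters.
double ApplyL2Step(double w, double gradient, double lr, double l2Decay)
{
    gradient += l2Decay * w; // regularization term pulls w towards zero
    return w - lr * gradient;
}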

Momentum is a method that helps accelerate SGD. You can look at section 4.1 of https://arxiv.org/pdf/1609.04747.pdf (see here in the code).
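
As a rough sketch of that update rule for a single parameter (plain C#, not the library's actual code; ComputeGradient is a hypothetical stand-in for backprop):

// SGD with momentum (section 4.1 of the paper above): the velocity v
// accumulates a decaying sum of past gradients. mu (e.g. 0.9) is the
// momentum coefficient, lr the learning rate.
double w = 0.0, v = 0.0, lr = 0.01, mu = 0.9;
for (int step = 0; step < 1000; step++)
{
    double grad = ComputeGradient(w); // dLoss/dw at the current w
    v = mu * v + lr * grad;           // accumulate a decaying sum of gradients
    w -= v;                           // step against the accumulated direction
}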

The functions that mutate the weights are called Trainers in ConvNetSharp: SgdTrainer / AdamTrainer for ConvNetSharp.Core, and SgdTrainer / AdamTrainer for ConvNetSharp.Flow.

Using different trainers will impact the performance of the network: some training algorithms are better suited to certain kinds of tasks.

I am not sure I understand "functions that separate epochs in training of the network". If it's a function to split a dataset into training/testing/validation sets, there is no such function in this library.
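
If by "separate epochs" you mean full passes over the training set, there is no helper for that either, but a manual loop is straightforward; a sketch, assuming dataset is a hypothetical list of (input, target) volume pairs and trainer one of the trainers above:

// Sketch: one epoch = one shuffled pass over the whole training set.
var rand = new Random(0);
for (int epoch = 0; epoch < numberOfEpochs; epoch++)
{
    // Fisher-Yates shuffle so each epoch sees the data in a new order.
    for (int i = dataset.Count - 1; i > 0; i--)
    {
        int j = rand.Next(i + 1);
        (dataset[i], dataset[j]) = (dataset[j], dataset[i]);
    }

    foreach (var (input, target) in dataset)
    {
        trainer.Train(input, target);
    }
}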

@CherryGoose (Author) commented

Thank you for the information! I have another question: I'm testing the network after every training step and having problems with the probability output. 100 different test samples output the same probability. As I understand it, the output should be different for every new test sample. What may cause that? It seems to me that the network forgets previous training data, or I simply can't see the errors in my code. Here is the code:

Net net = new Net();

AdamTrainer Trex = new AdamTrainer(net)
{
    LearningRate = LR,
    BatchSize = 1
};

net.AddLayer(new InputLayer(999, 705, 1));
net.AddLayer(new ConvLayer(11, 11, 5) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
net.AddLayer(new ConvLayer(5, 5, 16) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
net.AddLayer(new ConvLayer(3, 3, 20) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
net.AddLayer(new ConvLayer(2, 2, 30) { Stride = 1, Pad = 1 });
net.AddLayer(new ReluLayer());
net.AddLayer(new FullyConnLayer(2));
net.AddLayer(new SoftmaxLayer(2));

Random rand = new Random(); // created once, outside the loop

for (int yh = 0; yh < 100; yh++)
{
    var x = BuilderInstance.Volume.From(TrueSamp[yh], new Shape(999, 705, 1));
    var y = BuilderInstance.Volume.From(FalseSamp[yh], new Shape(999, 705, 1));

    var zx = BuilderInstance.Volume.From(new[] { 1.0, 0.0 }, new Shape(1, 1, 2, 1));
    var zy = BuilderInstance.Volume.From(new[] { 0.0, 1.0 }, new Shape(1, 1, 2, 1));

    Trex.Train(x, zx); // train the network, specifying that x is class 0
    avloss += Trex.Loss;
    loss += "\r\n" + Trex.Loss;
    Trex.Train(y, zy); // train the network, specifying that y is class 1
    avloss += Trex.Loss;
    loss += "\r\n" + Trex.Loss;

    for (int i = 0; i < 100; i++)
    {
        double[] truesamp = TrueSampTest[rand.Next(0, 100)];
        var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705, 1, 1));
        var probq = net.Forward(rq);
        trueoutput += "\r\n" + probq.Get(0);
    }
    trueoutput += "yh = " + yh;

    for (int i = 0; i < 100; i++)
    {
        double[] falsesamples = FalseSampTest[rand.Next(0, 100)];
        var rx = BuilderInstance.Volume.From(falsesamples, new Shape(999 * 705, 1, 1));
        var proby = net.Forward(rx);
        falseoutput += "\r\n" + proby.Get(0);
    }
    falseoutput += "yh = " + yh;
}

@cbovar (Owner) commented Apr 30, 2019

Does the loss decrease?
It should output the same proba when yh is low, but it should not once yh starts to grow.

@CherryGoose (Author) commented May 1, 2019

The loss decreases as it should, according to the paper you cited.

[image: loss plot]

@cbovar (Owner) commented May 1, 2019

The input shape you use for testing seems odd: new Shape(999 * 705, 1, 1) instead of new Shape(999, 705, 1). I'm not sure that's the source of the problem, but it would be interesting to fix that.

@cbovar (Owner) commented May 5, 2019

Also, could you try decreasing the learning rate and posting a new plot of the loss? Maybe divide it by 10.

@CherryGoose (Author) commented

I've changed the shape in the testing method; no change in the proba occurred. I also tried decreasing the learning rate; here is the plot.

[image: loss plot with the reduced learning rate]

@cbovar (Owner) commented May 5, 2019

What is the value of LR?

Any chance to have the full code so I can run it? I think I just need FalseSamp, TrueSamp, FalseSampTest, and TrueSampTest.

@CherryGoose (Author) commented May 6, 2019

Right now LR is 0.001. Here is the main code. I'm reading data from files like the one attached: each line in a file is a set of coordinates with a corresponding value, and one file is one training sample. I've changed the layer architecture a bit, as it gives slightly better results.

1.zip

string[] arr = Directory.GetFiles(@"C:\Users\USER\Desktop\Norm_podp\YST", "*.*");
string[] arrTest = Directory.GetFiles(@"C:\Users\USER\Desktop\BaS_PFS\NORM", "*.*");
string[] arrSig = Directory.GetFiles(@"C:\Users\USER\Desktop\Norm_podp\NORM", "*.*");

double[,,] parce = new double[200, 1000, 1000];
double[,,] parceTest = new double[100, 1000, 1000];
double[,,] parceSig = new double[200, 1000, 1000];

double[][] TrueSamp = new double[200][];
double[][] TrueSampTEst = new double[100][];
double[][] FalseSamp = new double[200][];
int dimentionSize = 999 * 705;

// Initialize every grid cell to -1 (cells not present in the files keep this value).
for (int k = 0; k < 200; k++)
{
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 0; j < 1000; j++)
        {
            parce[k, i, j] = -1;
            if (k < 100)
                parceTest[k, i, j] = -1;
            parceSig[k, i, j] = -1;
        }
    }
}

// Each line of a file is "x;y;value"; one file is one training sample.
int count = 0;
foreach (string file in arr)
{
    string[] text = System.IO.File.ReadAllLines(file);
    for (int i = 0; i < text.Length; i++)
    {
        string[] sp = text[i].Split(';');
        int x = Convert.ToInt32(sp[0]);
        int y = Convert.ToInt32(sp[1]);
        parce[count, x, y] = Convert.ToInt32(sp[2]);
    }
    count++;
}

count = 0;
foreach (string file in arrTest)
{
    string[] text = System.IO.File.ReadAllLines(file);
    for (int i = 0; i < text.Length; i++)
    {
        string[] sp = text[i].Split(';');
        int x = Convert.ToInt32(sp[0]);
        int y = Convert.ToInt32(sp[1]);
        parceTest[count, x, y] = Convert.ToInt32(sp[2]);
    }
    count++;
}

count = 0;
foreach (string file in arrSig)
{
    string[] text = System.IO.File.ReadAllLines(file);
    for (int i = 0; i < text.Length; i++)
    {
        string[] sp = text[i].Split(';');
        int x = Convert.ToInt32(sp[0]);
        int y = Convert.ToInt32(sp[1]);
        parceSig[count, x, y] = Convert.ToInt32(sp[2]);
    }
    count++;
}
TestCON.Text += "Files loaded \r\n";

int numberoftrainingex = 200;
double LR = 0.001;

string trueoutput = "";
string trueoutputfalsesamp = "";
string falseoutput = "";
string falseoutputfalsesamp = "";

for (int k = 0; k < numberoftrainingex; k++)
{
    TrueSamp[k] = new double[dimentionSize];
    if (k < 100)
        TrueSampTEst[k] = new double[dimentionSize];
    FalseSamp[k] = new double[dimentionSize];
}

// Flatten each 999x705 grid into a 1D sample array.
int com = 0;
for (int k = 0; k < numberoftrainingex; k++)
{
    for (int i = 0; i < 999; i++)
    {
        for (int j = 0; j < 705; j++)
        {
            TrueSamp[k][com] = parce[k, i, j];
            if (k < 100)
                TrueSampTEst[k][com] = parceTest[k, i, j];
            FalseSamp[k][com] = parceSig[k, i, j];
            com++;
        }
    }
    com = 0;
}

var avloss = 0.0;
string loss = "";

Net<double> net = new Net<double>();

AdamTrainer Trex = new AdamTrainer(net)
{
    LearningRate = LR,
    BatchSize = 1
};

net.AddLayer(new InputLayer(999, 705, 1));
net.AddLayer(new ConvLayer(11, 11, 5) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(3, 3) { Stride = 2 });
net.AddLayer(new ConvLayer(5, 5, 16) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(3, 3) { Stride = 2 });
net.AddLayer(new ConvLayer(3, 3, 40) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(3, 3) { Stride = 2 });
net.AddLayer(new ConvLayer(3, 3, 80) { Stride = 1, Pad = 1 });
net.AddLayer(new FullyConnLayer(2));
net.AddLayer(new SoftmaxLayer(2));

for (int yh = 0; yh < 100; yh++)
{
    var x = BuilderInstance.Volume.From(TrueSamp[yh], new Shape(999, 705));
    var y = BuilderInstance.Volume.From(FalseSamp[yh], new Shape(999, 705));

    var zx = BuilderInstance.Volume.From(new[] { 1.0, 0.0 }, new Shape(1, 1, 2, 1));
    var zy = BuilderInstance.Volume.From(new[] { 0.0, 1.0 }, new Shape(1, 1, 2, 1));

    Trex.Train(x, zx); // train the network, specifying that x is class 0
    avloss += Trex.Loss;
    loss += "\r\n" + Trex.Loss;

    Trex.Train(y, zy); // train the network, specifying that y is class 1
    avloss += Trex.Loss;
    loss += "\r\n" + Trex.Loss;

    double[] truesamp = TrueSamp[yh];
    var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705));
    var probq = net.Forward(rq);
    trueoutput += "\r\n" + probq.Get(0);
    trueoutputfalsesamp += "\r\n" + probq.Get(1);

    double[] falsesamples = FalseSamp[yh];
    var rx = BuilderInstance.Volume.From(falsesamples, new Shape(999 * 705));
    var proby = net.Forward(rx);
    falseoutputfalsesamp += "\r\n" + proby.Get(0);
    falseoutput += "\r\n" + proby.Get(1);
}

avloss = avloss / 200;
TestCON.Text += "av loss" + avloss;
TestCON.Text += "loss by steps" + "\r\n" + loss;
TestCON.Text += " test samples of class1 being class1" + "\r\n";
TestCON.Text += trueoutput;
TestCON.Text += " test samples of class1 being class2" + "\r\n";
TestCON.Text += trueoutputfalsesamp;
TestCON.Text += " test samples of class2 being class1" + "\r\n";
TestCON.Text += falseoutput;
TestCON.Text += "test samples of class2 being class2" + "\r\n";
TestCON.Text += falseoutputfalsesamp;

TestCON.Text += "TEST 1";
for (int j = 0; j < 100; j++)
{
    double[] truesamp = TrueSamp[j + 100];
    var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705));
    var probq = net.Forward(rq);
    TestCON.Text += "\r\n" + probq.Get(0);
}

TestCON.Text += "TEST 1 1";
for (int j = 0; j < 100; j++)
{
    double[] truesamp = TrueSamp[j + 100];
    var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705));
    var probq = net.Forward(rq);
    TestCON.Text += "\r\n" + probq.Get(1);
}

TestCON.Text += "TEST 2";
for (int j = 0; j < 100; j++)
{
    double[] truesamp = TrueSampTEst[j];
    var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705));
    var probq = net.Forward(rq);
    TestCON.Text += "\r\n" + probq.Get(0);
}

TestCON.Text += "TEST 2 2";
for (int j = 0; j < 100; j++)
{
    double[] truesamp = TrueSampTEst[j];
    var rq = BuilderInstance.Volume.From(truesamp, new Shape(999 * 705));
    var probq = net.Forward(rq);
    TestCON.Text += "\r\n" + probq.Get(1);
}

@CherryGoose (Author) commented

Here are the loss plot and the probability plot after each epoch:

[image: loss plot]
[image: probability plot]

@ren85 commented Jul 14, 2019

I had similar problems, and the net started working when I normalized the input to [0, 1].
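
Something like this (min-max scaling each flattened sample in place; a sketch only, matching the double[] samples used in the snippets above):

// Rescale a flattened sample into [0, 1] before wrapping it in a volume.
static void NormalizeInPlace(double[] sample)
{
    double min = double.MaxValue, max = double.MinValue;
    foreach (var v in sample)
    {
        if (v < min) min = v;
        if (v > max) max = v;
    }

    double range = max - min;
    if (range == 0) return; // constant sample: nothing to rescale

    for (int i = 0; i < sample.Length; i++)
        sample[i] = (sample[i] - min) / range;
}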
