
ANN Input/Output Size #1267

Closed
zoq opened this issue Feb 21, 2018 · 30 comments

@zoq
Member

zoq commented Feb 21, 2018

Currently, if we build a network, we have to specify the input and output size for each layer. Ideally, a user should only specify the input/output size for the first layer, and only the output size for each subsequent layer. Here is a simple example:

FFN<NegativeLogLikelihood<> > model;
model.Add<Linear<> >(inputSizeA, outputSizeA);
model.Add<SigmoidLayer<> >();
model.Add<Linear<> >(inputSizeB, outputSizeB);
model.Add<LogSoftMax<> >();

which should transform into this:

FFN<NegativeLogLikelihood<> > model;
model.Add<Linear<> >(inputSize, outputSize);
model.Add<SigmoidLayer<> >();
model.Add<Linear<> >(layerSize);
model.Add<LogSoftMax<> >();

There is something similar already in place to set the input and output width for the convolution operator:

if (!reset)
{
  if (boost::apply_visitor(outputWidthVisitor, network.front()) != 0)
  {
    width = boost::apply_visitor(outputWidthVisitor, network.front());
  }

  if (boost::apply_visitor(outputHeightVisitor, network.front()) != 0)
  {
    height = boost::apply_visitor(outputHeightVisitor, network.front());
  }
}

for (size_t i = 1; i < network.size(); ++i)
{
  if (!reset)
  {
    // Set the input width.
    boost::apply_visitor(SetInputWidthVisitor(width), network[i]);

    // Set the input height.
    boost::apply_visitor(SetInputHeightVisitor(height), network[i]);
  }

  // Forward pass through the current layer.
  boost::apply_visitor(ForwardVisitor(std::move(boost::apply_visitor(
      outputParameterVisitor, network[i - 1])), std::move(
      boost::apply_visitor(outputParameterVisitor, network[i]))), network[i]);

  if (!reset)
  {
    // Get the output width.
    if (boost::apply_visitor(outputWidthVisitor, network[i]) != 0)
    {
      width = boost::apply_visitor(outputWidthVisitor, network[i]);
    }

    // Get the output height.
    if (boost::apply_visitor(outputHeightVisitor, network[i]) != 0)
    {
      height = boost::apply_visitor(outputHeightVisitor, network[i]);
    }
  }
}
What this code does is read and set the input/output width during the first forward pass. Once we know the output size of the previous layer, we know the input size of the current layer, and so on.

A first step would be to look at https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/visitor/output_height_visitor.hpp and https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/visitor/set_input_height_visitor_impl.hpp, which should be helpful for implementing another visitor for the input/output size.
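
To make the idea concrete, here is a rough, self-contained sketch of what such a visitor could look like. The layer types and the InputSize() accessor below are hypothetical stand-ins for illustration; the real mlpack visitors use SFINAE to detect which layer types actually expose these accessors:

#include <boost/variant.hpp>
#include <cstddef>

// Hypothetical minimal layer types standing in for mlpack layers.
struct Linear
{
  std::size_t inSize = 0;
  std::size_t& InputSize() { return inSize; }
};

struct Sigmoid { };  // Has no input size to set.

using LayerTypes = boost::variant<Linear*, Sigmoid*>;

// Modeled on SetInputHeightVisitor: set the input size on layers that
// expose InputSize(), and do nothing otherwise.
class SetInputSizeVisitor : public boost::static_visitor<bool>
{
 public:
  explicit SetInputSizeVisitor(const std::size_t inSize) : inSize(inSize) { }

  bool operator()(Linear* layer) const
  {
    layer->InputSize() = inSize;
    return true;
  }

  bool operator()(Sigmoid* /* layer */) const { return false; }

 private:
  const std::size_t inSize;
};

// Usage: boost::apply_visitor(SetInputSizeVisitor(previousOutSize), layer);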

@desai-aditya

desai-aditya commented Feb 22, 2018

Instead of getting this changed in the codebase, why not change it in the main program of the CLI? I mean, the user could directly specify the following in the network.conf file:
linear 50
linear 20
linear 100

which would essentially be transformed by the CLI to:
linear <dataset dimension> 50
linear 50 20
linear 20 100

@akhandait
Member

@desai-aditya I think it's very important to make the changes in the codebase first. Not everybody will use the CLI. The changes should be universal, so that it also works when we use this directly from C++.

@zoq
Member Author

zoq commented Feb 22, 2018

I agree with @akhandait, this should be implemented in the ann codebase.

@daiviknema
Contributor

I'd like to take this, please.

@zoq
Member Author

zoq commented Feb 24, 2018

I think @desai-aditya and @akhandait are already working on the issue.

@akhandait
Member

Sadly, it has been very difficult for me to take time out due to tightly scheduled exams.
@desai-aditya please leave a comment if you are working on this.
@PlantsAndBuildings maybe you could start on this if he doesn't. Good luck. :)

@daiviknema
Contributor

@akhandait : Thanks for letting me know.

@zoq: should I wait for @desai-aditya, or should I go ahead and start working on this?

@desai-aditya

@PlantsAndBuildings: you can start working with me too.
@zoq, @akhandait: sorry for the late reply, I was sick. I'll get back to work right away.
@akhandait: best of luck with your exams.

@zoq
Member Author

zoq commented Feb 26, 2018

@desai-aditya take all the time you need, we don't have to rush anything here.

@PlantsAndBuildings I think @akhandait was working on the parser and @desai-aditya on the CLI?

@Manthan-R-Sheth
Contributor

Manthan-R-Sheth commented Feb 27, 2018

Can it be made clear who all is working on this issue? I am interested in working on this as well.
@PlantsAndBuildings @desai-aditya @akhandait.
I guess if @PlantsAndBuildings and @desai-aditya are taking up this issue, maybe I can work on the #1254 (CLI for ANN) issue.

@akhandait
Member

I am working on CLI for ANN along with @desai-aditya.

@Manthan-R-Sheth
Contributor

@PlantsAndBuildings, can we collaborate over this issue then?

@desai-aditya

I have completed approximately half of the CLI.
I am also working on this issue simultaneously, but anyone wanting to join is more than welcome.

@akhandait How's the parser going? Also, if possible, make it so that the input to the parser is a CSV file in the same format instead of a space-separated one. It's a lot easier to handle it that way.

Also, this issue and the CLI for ANN are very closely related: if this issue is solved, the parser will need to be updated. So we'll have to see how things go regarding that.

Any input from @zoq and @rcurtin would be appreciated.

@zoq
Member Author

zoq commented Feb 27, 2018

In the best case, the parser is agnostic of the number of parameters. Let's say the layer is defined as:

LayerName(parameterA, parameterB, parameterC, parameterD)

the config would look something like:

LayerName 10 20 30 40

If the layer requires only two parameters, we just pass those two.

@daiviknema
Contributor

I think there are already enough people wanting to work on this, so I'll find another issue. @Manthan-R-Sheth, please ask @desai-aditya if he needs help.

@desai-aditya

Also, I need a little bit of help with compiling the main file using make:
I have to put the main file in the CMakeLists of the ann folder.
Where else do I have to make changes? Also, there's something called add_cli_executable() and add_python_binding() at the end of other CMakeLists files; do I have to do something with those?

@Manthan-R-Sheth
Contributor

Manthan-R-Sheth commented Feb 27, 2018

Based on my understanding (correct me if I am wrong):
add_cli_executable("name") adds a command-line executable with the corresponding "name". This enables the command mlpack_name from the CLI.
add_python_binding("name") builds a Python binding for the corresponding "name" (it builds a .pyx file for Python to run).
These assume that the file containing main() is named "name"_main.cpp, and they make the executables/.pyx files accordingly.

So, given that you are making the CLI for the ANN module, you will have to add add_cli_executable(ann) and create an ann_main.cpp in which the main() function that parses the input and builds the model is defined.

@desai-aditya

I mean other than what you described, @Manthan-R-Sheth. I understand that, but do we need to make any changes to any other file? Also, what about CMake changes?

@Manthan-R-Sheth
Contributor

Manthan-R-Sheth commented Feb 27, 2018

I think just adding ann_main.cpp to the folder and add_cli_executable(ann) should do the work.

@zoq
Member Author

zoq commented Feb 27, 2018

Yes, that should do the trick.

@daiviknema
Contributor

It has been a while since I've seen any activity on this issue, and there are no open PRs either. Is someone working on this, or can I take it up?

@akhandait
Member

@PlantsAndBuildings I am working on this. I also saw the inactivity here, and hence started working on it.

@akhandait
Member

If we implement a new visitor for setting the input size of the layers in the first forward pass, how do we deal with the constructors of the layer objects?
For example:
model.Add<Linear<> >(layerSize);
In this case, how do we construct the Linear object with only one parameter? Will this require code modifications for all the layer objects?

@zoq
Member Author

zoq commented Apr 1, 2018

An easy option would be to implement another constructor for each layer that needs the input/output size.
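
For the Linear layer, such an additional constructor might look like this. This is a sketch, not actual mlpack code; it assumes the weight initialization is deferred until the input size is known, as discussed below:

// Hypothetical extra constructor that takes only the output size.
template<typename InputDataType, typename OutputDataType>
Linear<InputDataType, OutputDataType>::Linear(const size_t outSize) :
    inSize(0),  // Unknown until set during the first forward pass.
    outSize(outSize)
{
  // weights.set_size(...) is deferred here, since inSize is not
  // known yet.
}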

@akhandait
Member

In the Linear layer implementation, or any other layer for that matter, the weights are initialized (size is set) in the constructor.

template <typename InputDataType, typename OutputDataType>
Linear<InputDataType, OutputDataType>::Linear(
    const size_t inSize,
    const size_t outSize) :
    inSize(inSize),
    outSize(outSize)
{
  weights.set_size(outSize * inSize + outSize, 1);
}

In our case, if we define a constructor which only takes the output size, how do we deal with this? For each layer, the input size will only be known during the first forward pass.

@zoq
Member Author

zoq commented Apr 4, 2018

We could move the initialization into the Reset function, which is called after the parameters are set.
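
A minimal sketch of that idea, assuming Reset() runs once inSize has been set (as the discussion below shows, this is not the final solution):

// Sketch only: size the weights in Reset() instead of the constructor,
// once inSize is known.
template<typename InputDataType, typename OutputDataType>
void Linear<InputDataType, OutputDataType>::Reset()
{
  weights.set_size(outSize * inSize + outSize, 1);
}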

@akhandait
Member

akhandait commented Apr 11, 2018

Sorry for being late; I implemented the required visitors but am facing some issues. The Reset function is called after the parameters are set, but the inSize for the second layer onwards is set during the first forward pass. So, the Reset function ends up initializing the weights with inSize as 0.
EDIT:
I got the solution: we can call the Reset function after setting the inSize in set_input_size_visitor.hpp. I will open a PR.
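
In other words, the setter would look roughly like this (illustrative only; the point is the Reset() call right after the input size is assigned):

// Inside the hypothetical SetInputSizeVisitor sketched earlier:
bool operator()(Linear* layer) const
{
  layer->InputSize() = inSize;
  layer->Reset();  // Re-initialize the weights now that inSize is known.
  return true;
}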

@zoq
Member Author

zoq commented Apr 13, 2018

Thanks for looking into it.

@akhandait
Member

@zoq moving the initialization to the Reset function doesn't work. It throws "error: addition: incompatible matrix dimensions" while training; the parameters are not getting initialized properly. We need another way to initialize the weights, because we have to initialize the weights of each layer before the first forward pass. Do you think we need a function which just initializes the parameters without actually doing a forward pass?

@mlpack-bot

mlpack-bot bot commented Feb 18, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
