
CLI for ANN #1254

Open
arjunmenon opened this issue Feb 17, 2018 · 57 comments

@arjunmenon

Hey
Are ANN classes available from the CLI? From the docs it isn't apparent.
How can we use it otherwise?

@zoq
Member

zoq commented Feb 18, 2018

You are right, there is no CLI for the ANN classes; you would have to use them from within your C++ code. I think it would be great to have an executable for the network code. However, since there is no single architecture, it's somewhat complex to provide all the necessary settings and at the same time keep it easy to use.

@arjunmenon
Author

I am not proficient with C++. I was hoping to add a Ruby wrapper to the CLI. So, two things:

  1. Is there support planned for a Ruby binding?
  2. If I have to make an attempt, where do I start?

@zoq
Member

zoq commented Feb 19, 2018

For more information about how to add bindings to other languages, please take a look at http://www.mlpack.org/docs/mlpack-git/doxygen/bindings.html. Let us know if we should clarify anything.

@rcurtin
Member

rcurtin commented Feb 20, 2018

Hi Arjun,

There isn't currently any planned support for Ruby. If you are interested in writing a Ruby binding generator, that would be great, and @zoq has given a link to some useful documentation. But unfortunately, proficiency with C++ is going to be necessary to write this binding generator, so you may want to study C++ a little more closely before diving in too deep.

@desai-aditya

Hi rcurtin,

I was planning to work on this in small pieces. Could I first implement a CLI that achieves this -
https://github.com/mlpack/models/blob/master/Kaggle/DigitRecognizer/src/DigitRecognizer.cpp
and then later improve it?

@rcurtin
Member

rcurtin commented Feb 21, 2018

We need to wait for @zoq's input also, but my opinion is that any CLI program needs the following functionality:

  • specify a neural network architecture arbitrarily (possibly by loading it from a YAML file or something? We should probably discuss that part)
  • train on some given training data, with the user able to specify the optimizer and optimizer parameters
  • evaluate a trained model on some given testing data
  • print performance measures on training/test data
  • load an existing model or save a newly trained model

So writing something for a digit recognizer might be a good warmup exercise, but anything that we merge into mlpack should at least have the requirements above, in my opinion.

@akhandait
Member

akhandait commented Feb 21, 2018

@zoq, since we don't have any CLI for the ANN classes, do you think it would be a good idea to start with a CLI for a fully connected ANN of the required configuration, with options for all other required parameters?
If yes, I would like to work on it.
@rcurtin I think it will be possible to incorporate all the other functionality mentioned above.
Maybe we could later extend it to support any neural network architecture.

@desai-aditya

desai-aditya commented Feb 21, 2018

@rcurtin :
What do you mean by 'loading it from a YAML file or something'?
Also, how about adding more activation functions?
Why is LeakyReLU in the layers directory?

@zoq
Member

zoq commented Feb 21, 2018

I agree with @rcurtin, we should not merge anything that doesn't support:

  • specify a neural network architecture arbitrarily (possibly by loading it from a YAML file or something? We should probably discuss that part)
  • train on some given training data, with the user able to specify the optimizer and optimizer parameters
  • evaluate a trained model on some given testing data
  • print performance measures on training/test data
  • load an existing model or save a newly trained model

YAML for the config file sounds like a good idea to me; however, parsing isn't as simple as e.g. CSV, and I don't want to include another dependency (perhaps boost::spirit could be helpful).

What do you mean by 'loading it from a YAML file or something'?

Instead of specifying the architecture (e.g. the layers) via command-line parameters, we use a config file; YAML is just the format, we could also use JSON or XML.

Also how about adding more activation functions?

If you have something in mind, we are open to suggestions.

why is leakyReLU in the layers directory?

The LeakyReLU layer doesn't follow the same interface as e.g. the TanH layer, since it holds an extra parameter, alpha.

@rcurtin
Member

rcurtin commented Feb 21, 2018

YAML for the config file sounds like a good idea to me; however, parsing isn't as simple as e.g. CSV, and I don't want to include another dependency (perhaps boost::spirit could be helpful).

What do you mean by 'loading it from a YAML file or something'?

Instead of specifying the architecture (e.g. the layers) via command-line parameters, we use a config file; YAML is just the format, we could also use JSON or XML.

Agreed, I want to avoid new dependencies also. It's also possible we could come up with a very simple custom format that we can parse ourselves. I do think boost::spirit could be used to parse YAML but it looks like implementing a YAML parser might be complex:

https://github.com/cierelabs/yaml_spirit

So maybe some plain text file is best?

linear 50 50
sigmoid
linear 50 50
sigmoid
linear 50 10
sigmoid

could be one way to do it for a 3-layer linear network with 50 hidden neurons in each hidden layer and 10 output neurons. I don't have much of a preference; personally, I think we could do just about anything here, and as long as it is well documented and simple enough, nobody will have a problem with it.

@zoq
Member

zoq commented Feb 21, 2018

Agreed, that's super simple and easy to read.

@akhandait
Member

I really like this idea. Also, I have worked with neural networks before and would love to write a CLI for it.
I think it will be really neat to input the network configuration as a text file.
Can I go ahead and start implementing this, keeping all five requirements in mind? :)

@akhandait
Member

akhandait commented Feb 21, 2018

@desai-aditya Do you want to work on this as well, since you commented on this first?
It also doesn't seem to be a small project; would you like to work on it together?

@desai-aditya

Yes we can surely collaborate @akhandait . How do you suggest we split the work?

@akhandait
Member

Okay, so I think we will need a day or two just to get really comfortable with the ANN method. The next step, in my opinion, will be to come up with a concrete plan for how we will implement it. If you have already dived in and are comfortable, maybe you could look at the CLI bindings we already have and come up with a basic structure for our implementation.
After that, maybe we can discuss how to split the work.
Let's keep each other updated here on this issue.
Let's do this!
@rcurtin @zoq If you feel we could do this a better way, please tell us.

@rcurtin
Member

rcurtin commented Feb 21, 2018

This all sounds good to me. Probably we will have to flesh out the file configuration format somewhat, but if we can keep it of the form

<layer type> <input size> <output size> <other parameters...>

I think that would be simplest to parse and work with. Sometimes the input or output size may not be needed. It'll also be really important to document exactly what the format is so that people can assemble simple networks just by looking at the documentation.

@akhandait
Member

Yes,
@desai-aditya Documenting it well should be one of our top priorities.
Maybe we could also add some tutorials, but that is thinking quite far into the future.

@desai-aditya

desai-aditya commented Feb 21, 2018

So, help me out if I forgot something.
First we'll just focus on linear models, and later add RNNs.

batch size: any int (-b)
train-test split ratio: double between 0 and 1 (-r)
optimizer: type of optimizer, e.g. sgd; I don't know which others fit, I need to read the docs (-op)
step size of SGD: double (-s)
test file: filename (-t)
output file: filename (-o)
train file: filename (-i)
print performance: (-v)
num epochs: any int

And the following in the network.conf file, for each layer:

layer type: linear
input size: any int, but the first layer must match the dimensions of the dataset
output size: any int

Is this fine? @zoq, @rcurtin, @akhandait

@zoq
Member

zoq commented Feb 21, 2018

Looks good to me, and I agree, let's start with the linear models. About the network.conf file: ideally we interpret everything after the layer name as parameters; for the linear layer this is the input and output size, but for the convolution layer there are more parameters we could set.

We could easily split this up into two parts: writing the parser and writing the CLI. The CLI could just pretend the parser already works and use some artificial settings.

@desai-aditya

@akhandait, @zoq, @rcurtin - I think the parser will parse the file and return the values to the main CLI program. The values will then be used to build the model and train on the dataset inside the CLI. There are some general parameters that will be needed as arguments to the CLI. The parser should only parse the values and not compile the model. It could be that the user supplies the model too. This is what I understand. I have already started working on the CLI. @akhandait You can go ahead with the parser and join me as soon as you finish it.

@akhandait
Member

@desai-aditya Can you please have a look at the CLIs we already have? You can find them here:
src/mlpack/methods/<method_name>/<method_name>_main.cpp.
All of them do the following job:

  1. Give info about the program, shown when somebody uses --help.
  2. Define all the parameters that the program will take from the terminal / Python.
  3. Throw appropriate errors/warnings for invalid parameters.
  4. Pass the required parameters to a model of that method.
  5. Run the model.
  6. Return/store the required parameters/model.

Hope this helps. :)

@desai-aditya

desai-aditya commented Feb 22, 2018

@akhandait Don't worry, I have already taken a look at how CLIs work and experimented with them.
The only part where trouble might start is running and assembling the model, since there is a huge variety of ways in which it can be built. Maybe you could help me with that once you've built the parser. Also, you'll need to tell me in what form the parser will return the input, i.e. the interface.

@zoq
Member

zoq commented Feb 22, 2018

@desai-aditya is right: the parser will just parse the file that defines the structure, and the CLI will build and run the model. So, if we follow @rcurtin's idea of <layer type> <input size> <output size> <other parameters...>, a simple example could be:

Linear 10 10
Sigmoid
Linear 10 5
Softmax

The CLI builds the network based on the provided information:

FFN<NegativeLogLikelihood<> > model;
model.Add<Linear<> >(inputSizeA, outputSizeA);
model.Add<SigmoidLayer<> >();
model.Add<Linear<> >(inputSizeB, outputSizeB);
model.Add<LogSoftMax<> >();

and performs the requested action.

I hope this makes sense, let me know if I should clarify anything. A good starting point for the parser is: https://github.com/mlpack/mlpack/blob/master/src/mlpack/core/data/load_csv.hpp

@akhandait
Member

akhandait commented Feb 22, 2018

@desai-aditya It's good you have already started with the CLI program. I have my exams over the next week, but I will still try to take out as much time as I can and work on the parser. I am also working on another issue, so I won't claim too many tasks, as that would just delay necessary work.
When I am done with the parser and the other issue, I will be happy to help with the CLI program. :)

@zoq
Member

zoq commented Feb 22, 2018

@akhandait don't worry, and best of luck with your exams.

@Namrata96
Contributor

Hi @desai-aditya @akhandait , are you still working on this? If yes, is there any way I could help?

@akhandait
Member

Hi @Namrata96. Yeah, I have been a little slow with regard to this issue but am working on it. I think I will open a WIP pull request in the coming days, and maybe you can help me by reviewing it extensively.

@desai-aditya

@akhandait @zoq I am extremely sorry for the delay in replying. I had my exams this past week. I am free now and will continue work on the CLI.
@akhandait How's the parser coming along?
@Namrata96 As @akhandait said, you could help by reviewing the PR.

@sreenikSS
Contributor

sreenikSS commented Mar 22, 2019

I thought about it a bit more; it may be useful to think about using YAML instead of JSON first since it's a lot easier to handwrite YAML. But I am not too picky---both work, and we can also write other tools on top of it all that produce JSON/YAML from an "easier" representation.

@rcurtin you are right, YAML is more user-friendly, but I have just finished the JSON implementation, with a boost::property_tree storing it and categorising the information into a number of maps, storing the data in a dictionary-like format. I am all for YAML, but I currently want to focus on making the whole thing work; I shall provide YAML support (not much code, though) after the main part is done. What do you say?

Do you think the new http://www.mlpack.org/doc/mlpack-3.0.4/cli_documentation.html might be a better thing to do here? We could embed a full description of the language and format either in the Detailed documentation section of the binding, like this:

http://mlpack.org/doc/mlpack-3.0.4/cli_documentation.html#approx_kfn_detailed-documentation

Or have a couple excerpts of "common" usage, and then refer to a tutorial that's written somewhere else. What do you think?

I have tried following the format of the original documentation, but there are some exceptions (for better readability), for example the linear layer has the original documentation:
Linear(const size_t inSize, const size_t outSize)
But we have abstracted the inSize, so the user only needs to specify outSize, which is better expressed as:
"type": "linear", "units": 100
because "units" sounds more appropriate than "outSize". But such instances are few in number, so maintaining a separate set of documentation is probably not needed; a separate tutorial mentioning them should be sufficient.

I'd agree, I feel like we can just have some C++ code that, perhaps, extracts the allowed layer types from LayerTypes in src/mlpack/methods/ann/layer/layer_types.hpp. This might be worth thinking about a little bit more: given the template type LayerTypes, can we extract a list of the members, and then also extract related documentation for each of the parameters to that layer? Or will we have to maintain that separately?

You'll be glad to hear that part is already done (the code, not the doc part, though), and I have done it exactly as you have mentioned here. I am structuring it in such a way that bindings and other stuff will be much easier to add. Actually, I am learning a lot (a lot, actually!) while building this CLI.

One more thing regarding the documentation: are we planning to participate in Google's newly announced Season of Docs this year?

@rcurtin
Member

rcurtin commented Mar 24, 2019

I am all for YAML, but I currently want to focus on making the whole thing work; I shall provide YAML support (not much code, though) after the main part is done. What do you say?

That's fine with me. 👍

I have tried following the format of the original documentation, but there are some exceptions (for better readability), for example the linear layer has the original documentation:
Linear(const size_t inSize, const size_t outSize)
But we have abstracted the inSize, so the user only needs to specify outSize, which is better expressed as:
"type": "linear", "units": 100
because "units" sounds more appropriate than "outSize". But these instances are less in number, so maintaining a separate set of documentation is probably not needed. A separate tutorial mentioning those instances should be sufficient.

Ah, what I was suggesting here is automatically generating the documentation for each layer, as opposed to manually writing it. It may be complex to come up with a good solution for that.

You'll be glad to hear, that part is already done (the code, not the doc part though) and I have done it exactly as you have mentioned here. I am structuring it in such a way that bindings and other stuff would be much easier to add. Actually I am learning a lot (A lot actually!!) while building this CLI.

Great! If there is any part of the code I can explain, just let me know. There are a lot of complex pieces that fit together in complex ways. :)

One more thing, regarding the documentation are we planning to participate in Google's newly announced Season of Docs this year?

Not sure, I think there is some interest in it in the community, so we'll see. 👍 I think it is a cool program, the issue for me at least is always time. (That said I don't have to run the effort, so if there's critical mass to do it, I think we should!) :)

@sreenikSS
Contributor

Thanks. I shall let you know if I need any further clarification anywhere.

@sreenikSS
Contributor

@rcurtin the parser is almost done. There are a number of critical parameters that need to be addressed. I shall post them tomorrow (extremely tired right now :( )

The PR is over here:
#1837

@sreenikSS
Contributor

Critical comments on it:

  1. The init type doesn't take the given parameters into account, because it is passed as a template parameter but not as an argument. The fix is quite simple; I am a little tight on schedule but will fix it as soon as I find some time.
  2. For model training and testing, the code needs to be modified every time one switches from a classification problem to a regression problem. This can be solved either by taking a user input (when the CLI is complete) or by determining it from the last layer and the given labels (i.e., the matrix 'trainY').
  3. Not all features have been implemented yet. Additionally, some init types and layer types have been commented out because my mlpack installation somehow ignored them, so the required files are not found and hence I cannot compile and test them.
  4. Programs in general have multiple methods that are called in sequence from a separate method (main() in most standalone applications). But here, due to the absence of suitable return types, the functions call one another in a chain-like fashion, i.e., main() calls A(), A() calls B(), B() calls C(), and so on. Maybe an inheritance-based solution could help, by having a superclass for all init types, optimizer types, and loss types respectively, just like LayerTypes is a superclass to all layers (does it already exist?).
  5. The functions are named like getters, e.g. getLossType() and getInitType(), whereas they are simple void functions and are even part of the chain discussed in point 4. These names were chosen to reflect the essence of these methods and the ideal behaviour they could exhibit (if a suitable return type were available).
  6. The style checks do not currently pass; I hope to update the code soon.

I will update this if something else comes to my mind.

@mlpack-bot

mlpack-bot bot commented Jul 19, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@kunakl07

I would like to work on this issue. Can I take it?

@bkmgit
Contributor

bkmgit commented Jul 12, 2020

This would be a useful addition. A CLI is already part of Neural Network Libraries.

Is anybody still working on this?

@RishabhGarg108
Member

Hey, I was recently looking at this issue. After reading the discussion here, I figured out that we need two things:

  1. A utility to parse the config file to build the model.
  2. An actual CLI that would implement that model.

In my opinion, a simple text file would not suffice, because it is easier to think of the model in a structured way (JSON/XML).

Then the key question is: how do we parse that structured config file?

@sreenikSS has almost completed the parser in #1837, but it is done using boost::property_tree. Since we are trying to reduce our dependence on the Boost libraries, I thought we needed to do it some other way. I discussed it on IRC with @zoq. He suggested using cereal for it. So, I gave it a shot to see whether that works, because cereal can be used to save or load objects in different formats like JSON/XML.

So, to test it out, I created a small model and saved it in the following way.
FFN<> model;
model.Add<Linear<> >(20, 3);
model.Add<SigmoidLayer<> >();
model.Add<LogSoftMax<> >();

data::Save("model.json", "model", model, false);

The model.json generated looks like this. This looks pretty bad, because it actually saves all the parameters and everything associated with the model. To use cereal, the user would need to write a config file similar to it, which is practically impossible.

So, now we are back to the same question, i.e. how do we parse the config file.
We can do it in two ways:

  1. By writing our own JSON/XML parser (which in my opinion would be a tedious task).
  2. By using some external dependency (which would increase our external dependence).

So, I want to ask: what would be a good way to approach this? Thanks.

@zoq
Member

zoq commented Dec 1, 2020

What I had in mind was to have some sort of config class that we could serialize and use as an intermediate format to construct the model. Isn't that what the parser would do as well? I would expect that a config class something like:

struct LayerConfig { std::string name; size_t inSize, outSize; };

class Config
{
 public:
  std::vector<LayerConfig> layers;
};

would look easy enough.

@bkmgit
Contributor

bkmgit commented Dec 1, 2020

@RishabhGarg108 The following may be relevant:
https://www.khronos.org/nnef/
https://github.com/onnx/onnx/blob/master/docs/IR.md

@RishabhGarg108
Member

@bkmgit, thanks for these resources. I will definitely look at them.
The ONNX thing looks great, because that way we can import models from all the other libraries that ONNX supports.

I have no experience with parsing and such. So currently, I am trying and exploring various things, like the Config class @zoq mentioned. I will also give ONNX a try, and we will see what works best. 👍

@RishabhGarg108
Member

@RishabhGarg108 The following may be relevant:
https://www.khronos.org/nnef/
https://github.com/onnx/onnx/blob/master/docs/IR.md

@bkmgit, I looked at both of these. It turns out that both of these tools are for interoperability and optimization of trained models across different frameworks and devices, for making inferences. This is not exactly what we are looking for. In fact, all we want is a simple way to define the architecture of an ANN model and then be able to translate that into an mlpack ANN model. I hope that makes sense.
Correct me if I overlooked something :D

@rcurtin
Member

rcurtin commented Dec 28, 2021

We should leave this open---it is still important functionality we should add at some point.
