Redundant fastTrainer? #121

Open
adityakusupati opened this issue Aug 21, 2019 · 6 comments

Comments

@adityakusupati
Contributor

fastTrainer.py and fastcell_example of the harsha/reorg branch seem to be out of date and need to be updated or removed.

@harsha-simhadri I think these files need to be removed once your fastmodel is robust.

@harsha-simhadri
Collaborator

@adityakusupati The first link looks broken

@adityakusupati
Contributor Author

@harsha-simhadri
Collaborator

It was because of the reorg - https://github.com/microsoft/EdgeML/blob/harsha/reorg/pytorch/edgeml_pytorch/trainer/fastTrainer.py

@adityakusupati Can you test the fastmodel in PR 123 and see if you are happy with it? If everything you need is there, I will remove the fastTrainer and fastcell_example files.

@adityakusupati
Contributor Author

@harsha-simhadri Thanks for the response. I will take care of it, and @SachinG007 will assist me.
I think we need to rename the new files (both the trainer and the example) for FastCell, just to be consistent.

@adityakusupati
Contributor Author

adityakusupati commented Aug 26, 2019

  • argparse should accept at least: dataDir, cellType, inputDims, hiddenDims, numEpochs, batchSize, learningRate, wRank, uRank, wSparsity, uSparsity, gate and update non-linearities, decayStep, decayRate and outputFile. Please follow one of this or this for the default values of the argparse (see the sketch after this list).
  • Mean-var normalization (a sketch follows this list)
  • Create a model directory to save the .npy models and .ckpt checkpoints
  • Save the mean and std in the model directory
  • Save the command that was executed in the model directory for future reference
  • Provide support for all the Custom RNN cells in rnn.py to be used as part of our trainer and example
  • Save params in .npy format for the best model in the trainer, updated as the epochs go by.
  • Aggregate all the results in one txt file per cell type in the dataset directory. Use this as reference. This dump contains the model size, model accuracy and the absolute path to the model directory, and is very useful for grid searches and budget tuning.
  • Complete dense training (for no sparsity)
  • IHT for the 2nd phase with interleaved fixed-support training (generally 1 IHT step per 15 fixed-support training batches; a sketch follows below)
  • 3rd phase with fixed-support training
  • Print the final best model numbers and the current model number along with the model directory at the end of the training.
  • Please support my version of quantSigm along with others to ensure stable quantization.
  • Support batch_first=True and False for RNN unrolling.
  • Support tuple input to RNN in case the cell is LSTM
  • Make num_matrices a property so that it can be accessed from outside for other uses.
  • Support multi-layer RNNcells for all the custom RNN cells in rnn.py
  • Support quantizeModels.py for the RNN Cell models.
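A minimal sketch of the argparse surface described in the first bullet; the flag names follow the list above, but every default value here is an assumption rather than the repo's actual setting:

import argparse

def get_args():
    parser = argparse.ArgumentParser(description='FastCell training (illustrative defaults only)')
    parser.add_argument('--dataDir', type=str, required=True, help='Directory holding the train/test .npy splits')
    parser.add_argument('--cellType', type=str, default='FastGRNN', help='One of [FastGRNN, FastRNN, UGRNN, GRU, LSTM]')
    parser.add_argument('--inputDims', type=int, required=True)
    parser.add_argument('--hiddenDims', type=int, default=128)
    parser.add_argument('--numEpochs', type=int, default=300)
    parser.add_argument('--batchSize', type=int, default=100)
    parser.add_argument('--learningRate', type=float, default=0.01)
    parser.add_argument('--wRank', type=int, default=None)
    parser.add_argument('--uRank', type=int, default=None)
    parser.add_argument('--wSparsity', type=float, default=1.0)
    parser.add_argument('--uSparsity', type=float, default=1.0)
    parser.add_argument('--gateNonlinearity', type=str, default='sigmoid')
    parser.add_argument('--updateNonlinearity', type=str, default='tanh')
    parser.add_argument('--decayStep', type=int, default=200)
    parser.add_argument('--decayRate', type=float, default=0.1)
    parser.add_argument('--outputFile', type=str, default=None)
    return parser.parse_args()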

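And a possible shape for the mean-var normalization plus the mean/std dump into the model directory (the function and file names here are illustrative, not the trainer's actual ones):

import os
import numpy as np

def meanvar_normalize(x_train, x_test, model_dir):
    # Statistics are computed on the training split only, then applied to both splits.
    mean = np.mean(x_train, axis=0)
    std = np.std(x_train, axis=0)
    std[std < 1e-8] = 1.0  # guard against constant features
    os.makedirs(model_dir, exist_ok=True)
    np.save(os.path.join(model_dir, 'mean.npy'), mean)  # saved for reuse at inference time
    np.save(os.path.join(model_dir, 'std.npy'), std)
    return (x_train - mean) / std, (x_test - mean) / std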
I will add more if anything else comes up, but this seems complete.
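A hedged sketch of the 2nd-phase schedule: one IHT (hard-thresholding) step interleaved with a run of fixed-support batches. The 1-to-15 ratio comes from the bullet above; the helper names and the model.cell.W handle are assumptions, not the trainer's actual API:

import torch

def hard_threshold(weight, sparsity):
    # IHT step: keep the top-k entries by magnitude, zero out the rest.
    k = max(1, int(sparsity * weight.numel()))
    threshold = torch.topk(weight.abs().flatten(), k).values.min()
    mask = (weight.abs() >= threshold).float()
    weight.data.mul_(mask)
    return mask

def iht_phase(model, optimizer, loss_fn, batches, w_sparsity, iht_every=15):
    mask = None
    for step, (x, y) in enumerate(batches):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        if step % iht_every == 0:
            # Re-select the support after a dense update (the IHT step).
            mask = hard_threshold(model.cell.W, w_sparsity)  # model.cell.W is a placeholder name
        elif mask is not None:
            # Fixed-support batches: project the update back onto the current support.
            model.cell.W.data.mul_(mask)
    return mask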

@metastableB
Contributor

metastableB commented Aug 27, 2019

Just saw this comment. Adding my suggestions:

  • Making all the examples consistent (as much as possible) in terms of usage. This means including a notebook as well as a command-line-argument-based script for FastRNN. The notebook is a good reference to have, since GitHub renders the code and sample outputs online.
    The command-line script is always useful. If you really want to incorporate 'train_config', please do so by merging it with argparse. This can be done by setting the default value of each argparse argument to pick from train_config, as below (a sketch of such a train_config module follows this list):
parser.add_argument('-c', '--cell', type=str, default=train_config.CellProperties.cellType,
                        help='Choose between [FastGRNN, FastRNN, UGRNN' +
                        ', GRU, LSTM], default: FastGRNN')
  • edgeml.graph is supposed to contain only the forward computation graph. This means that the sparsification methods need to move out.
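For reference, a hedged sketch of how such a train_config could look so that argparse stays the single entry point; the class and attribute names mirror the snippet above but are otherwise assumptions:

# train_config.py -- illustrative only
class CellProperties:
    cellType = 'FastGRNN'
    hiddenDims = 128

# trainer script -- command-line values override the config defaults
import argparse
import train_config

parser = argparse.ArgumentParser()
parser.add_argument('-c', '--cell', type=str,
                    default=train_config.CellProperties.cellType,
                    help='Choose between [FastGRNN, FastRNN, UGRNN, GRU, LSTM], default: FastGRNN')
args = parser.parse_args()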
