BNN taking longer time than full precision Network #336
Could you elaborate a bit on what you are doing? If possible, it would be good to post a minimal code sample that reproduces the issue.
Are you referring to time per epoch, time per step, or total training time? Could you elaborate on the time difference?
For the segment of code below from the BNN example you provide, I set Integer and kernel_quantizer = None instead of ste_sign.
I used time.clock() and time.time() to measure the total training time of the code, and found that the BNN's time is greater than the full-precision network's. I just put time.clock() at the start and end of the program to get the total running time of the BNN and the full-precision program. Code I used to test: https://github.com/larq/larq/blob/master/docs/examples/mnist.ipynb
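As an aside on the timing method: `time.clock()` was deprecated in Python 3.3 and removed in Python 3.8; `time.perf_counter()` is the recommended replacement for wall-clock benchmarking. A minimal sketch of the measurement described above (the `model.fit` call is stood in for by a short sleep, since the model itself is not the point here):

```python
import time

# time.perf_counter() is a monotonic, high-resolution clock suited
# to measuring elapsed wall time around a training run.
start = time.perf_counter()

# ... model.fit(...) would go here; simulated with a short sleep ...
time.sleep(0.01)

elapsed = time.perf_counter() - start
print(f"total training time: {elapsed:.2f} s")
```

The same pattern wrapped around `model.evaluate` gives a comparable inference-time figure.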
What do you mean by "Integer" in this context?
What's the time difference? Larq (and TensorFlow) use fake quantization during training, and thus run the calculations in full precision.
I agree training time is important, but the main goal is to train networks that can be run efficiently during inference, so an increase in training time is often unavoidable.
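The "fake quantization" point above can be made concrete with a small sketch (NumPy stands in for TensorFlow here; the function name is illustrative, not Larq's implementation). Binarization produces ±1 values that are still stored as ordinary floats, so the layer still pays for a full float32 matmul plus the extra binarization passes:

```python
import numpy as np

def ste_sign_forward(x):
    # Fake quantization: map values to -1/+1 but keep them as floats.
    # (Written with np.where so that 0 maps to +1 rather than 0.)
    return np.where(x >= 0, 1.0, -1.0).astype(x.dtype)

x = np.random.randn(64, 128).astype(np.float32)
w = np.random.randn(128, 10).astype(np.float32)

# The "binarized" layer still runs a full float32 matmul, plus the
# binarization work on inputs and weights, so on standard hardware
# it cannot be faster than the unquantized layer.
y_full = x @ w
y_bnn = ste_sign_forward(x) @ ste_sign_forward(w)
```

Both products have the same shape and dtype; the binarized path simply does strictly more floating-point work.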
For the BNN, the total time from the start of the program (starting from the dataset load) through model.fit is 184.05 seconds, and evaluation (model.evaluate) took 2.41 seconds.
Sorry for the typo, it was input_quantizer. Note: times can vary slightly, but the pattern is always the same (the BNN takes longer than full precision), both for train and evaluate.
I'm facing the same issue. I tried the simple models below just to see the change in speed and file size, putting accuracy aside for the moment.

```python
# full-precision model
simplemodel = models.Sequential()
simplemodel.add(layers.Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
simplemodel.add(layers.Flatten())
simplemodel.add(layers.Dense(10, activation='sigmoid'))

# binarized model
kwargs = dict(input_quantizer="ste_sign",
              kernel_quantizer="ste_sign",
              kernel_constraint="weight_clip",
              use_bias=False)
simplemodelbnn = models.Sequential()
simplemodelbnn.add(lq.layers.QuantConv2D(32, 3, kernel_quantizer="ste_sign",
                                         kernel_constraint="weight_clip",
                                         use_bias=False, input_shape=(32, 32, 3)))
simplemodelbnn.add(layers.Flatten())
simplemodelbnn.add(lq.layers.QuantDense(10, **kwargs, activation='sigmoid'))
```

I ran both models on the CIFAR-10 dataset normalized to (0, 1) and (-1, 1), with the same compile settings and 2 epochs as an example. From what @lgeiger said, I can now understand the slower training for the binarized model, but why is the inference also slower?
The difference is small in absolute terms since it's a relatively small dataset, but I tried several times and the binarized model always ran slower. The running time is read from the model.fit and model.evaluate output, both per epoch and per step.
@susuhu Larq BNN inference is slower than full-precision inference because TensorFlow does not actually support binarized operations. To make it possible to train and evaluate BNNs, Larq simulates binarization with full-precision operations. The speedup you're looking for can be obtained with the Larq Compute Engine, an inference engine based on TensorFlow Lite that does support binary operations and is therefore much faster than running a "fake" BNN in Python.
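To illustrate why a genuinely binary kernel can be fast, here is a sketch of the classic XNOR/popcount trick used in BNN inference engines (this is an illustration of the idea, not Larq Compute Engine's actual code). A ±1 vector packs into a bitmask, and the dot product reduces to a bitwise XOR followed by a popcount, replacing many float multiply-adds with a couple of integer instructions:

```python
def pack_bits(v):
    # Pack a ±1 vector into an int bitmask: bit i is set when v[i] == -1.
    bits = 0
    for i, x in enumerate(v):
        if x < 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    # For ±1 vectors: dot = n - 2 * (number of positions where signs differ),
    # and the differing positions are exactly the set bits of XOR.
    return n - 2 * bin(a_bits ^ b_bits).count("1")

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, 1]

# Matches the ordinary dot product of the ±1 vectors.
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
```

Since TensorFlow has no such binary kernels, a Larq model run in plain TensorFlow never takes this fast path, which is why exporting to an engine like LCE is needed to see the speedup.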
I was trying to compare a Larq BNN and full precision (by setting Integer and kernel_quantizer = None). I found that the time taken to run the program is greater for the BNN compared to full precision. Is that OK?
Time to train is an important parameter for an efficient network.