
Int8 default ranges break when a bconv is followed by a normal conv. #421

Open
AdamHillier opened this issue Jul 2, 2020 · 5 comments
Labels: bug (Something isn't working)

Comments

@AdamHillier (Contributor)

Observed behaviour

When converting this model...

import tensorflow as tf
import larq as lq
import larq_compute_engine as lce

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False
    ),
    tf.keras.layers.Conv2D(32, (3, 3)),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))

...we obtain the following converted model, with extra dequantise and quantise nodes around the Conv2D:

[Image: converted model graph, showing extra Dequantize and Quantize nodes around the Conv2D]

Expected behaviour

We expect there to be no dequantise or quantise nodes in a converted model when the experimental_default_int8_range argument is used.
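One way to sanity-check this, as a minimal sketch (assuming the tensor metadata can be read without calling allocate_tensors(), since the stock TFLite interpreter cannot resolve the LCE custom ops needed to actually run the model), is to list the tensor dtypes of the converted flatbuffer; in the broken case some intermediate tensors show up as float32 rather than int8:

import tensorflow as tf

# Read tensor metadata straight from the flatbuffer returned by
# lce.convert_keras_model; Dequantize/Quantize pairs show up as extra
# float32 tensors in an otherwise int8 graph.
interpreter = tf.lite.Interpreter(model_content=converted_model)
for tensor in interpreter.get_tensor_details():
    print(tensor["index"], tensor["name"], tensor["dtype"])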

If the QuantConv2D is replaced by a normal Conv2D we get:

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    tf.keras.layers.Conv2D(
        32, (3, 3), padding="same", use_bias=False
    ),
    tf.keras.layers.Conv2D(32, (3, 3)),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))

[Image: converted model graph with no extra Dequantize/Quantize nodes]

Similarly, if the Conv2D is replaced with a QuantConv2D we get:

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False
    ),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False
    ),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))

[Image: converted model graph with no extra Dequantize/Quantize nodes]

So there is something specifically going wrong with the QuantConv2D > Conv2D combination.

@AdamHillier added the bug label on Jul 2, 2020
@AdamHillier (Contributor, Author)

Just to note: until we fix this in LCE, this issue can be worked around by explicitly adding TensorFlow fake-quantise ops before and after the problematic layer that doesn't get converted properly.

E.g.:

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False
    ),
    # The fake-quant range must match the experimental_default_int8_range
    # passed to the converter below.
    tf.keras.layers.Lambda(lambda x: tf.quantization.fake_quant_with_min_max_args(x, -3.0, 3.0)),
    tf.keras.layers.Conv2D(32, (3, 3)),
    tf.keras.layers.Lambda(lambda x: tf.quantization.fake_quant_with_min_max_args(x, -3.0, 3.0)),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))

[Image: converted model graph; the converted model is entirely int8]

(It may not be clear from the screenshot, but the converted model is entirely Int8.)

@lgeiger (Member) commented Jul 2, 2020

Looks like we are running into tensorflow/tensorflow#40055 here, since the bias tensors are shared, which prevents quantization of the 8-bit BConv.
E.g. the following network with a non-zero bias results in the correct output:

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False
    ),
    # A non-zero bias initializer keeps the bias constant from being shared.
    tf.keras.layers.Conv2D(32, (3, 3), bias_initializer="random_uniform"),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))

@Tombana (Collaborator) commented Jul 3, 2020

> the bias tensors are shared, which prevents quantization of the 8-bit BConv.

Just an extra note: if that's indeed the cause of the bug, then training the network (on any dataset, even random data) for just one epoch should also fix this, and will work on any type of network.
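Something like this, as a minimal sketch assuming the model from the original report (the target shape (30, 30, 32) is just the output shape of its final Conv2D):

import numpy as np

# One epoch on random data pushes the Conv2D bias away from zero, so the
# bias constant should no longer be shared between layers.
x = np.random.rand(16, 32, 32, 3).astype("float32")
y = np.random.rand(16, 30, 30, 32).astype("float32")
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=1, verbose=0)
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))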

@AdamHillier (Contributor, Author)

@lgeiger do you have any idea why a Conv2D combined with a Conv2D works, though? Wouldn't that run into the same issue of shared bias tensors?

@lgeiger (Member) commented Jul 3, 2020

> @lgeiger do you have any idea why a Conv2D combined with a Conv2D works, though? Wouldn't that run into the same issue of shared bias tensors?

There it is not a problem, since both biases will be converted to int32 and can still be shared. The reason it fails here is that BConv2d expects a float bias whereas Conv2d expects int32, which can't be satisfied with a shared tensor. Note that this is only a problem with the default-ranges pass, not with conversion of a trained network that doesn't rely on default ranges.
