
float16 quantization runs out of memory for LSTM model #1091

Open
Black3rror opened this issue Aug 30, 2023 · 3 comments
Labels: bug (Something isn't working)

Comments

@Black3rror

No matter the size of the LSTM model, converting it with float16 optimization runs out of memory.

Code to reproduce the issue
The following snippet reproduces the issue on Google Colab:

import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

def create_model():
  model = tf.keras.models.Sequential()

  # For the model to be convertible later, batch_size and sequence_length must be fixed.
  # E.g., batch_input_shape=[None, 1] will throw an error.
  # This limitation applies only to RNNs; for FC or CNN layers, batch_size can be None.
  model.add(tf.keras.layers.Embedding(
    input_dim=5,
    output_dim=1,
    batch_input_shape=[1, 1]
  ))

  model.add(tf.keras.layers.LSTM(
    units=1,
    return_sequences=False,
    stateful=False
  ))

  model.add(tf.keras.layers.Dense(5))

  return model

model = create_model()
model.summary()

model.save("/content/model/")

representative_data = np.random.randint(0, 5, (200, 1)).astype(np.float32)

def representative_dataset():
  for sample in representative_data:
    sample = np.expand_dims(sample, axis=0)     # batch_size = 1
    yield [sample]                              # set sample as first (and only) input of the model
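
# Note: this representative_dataset is defined but never attached to the converter
# below. Float16 post-training quantization does not require one; a representative
# dataset is only needed for full-integer quantization calibration.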

# float16 quantization
converter = tf.lite.TFLiteConverter.from_saved_model("/content/model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
# kernel runs out of memory and crashes in the following line
tflite_quant_model = converter.convert()
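
For comparison, the following sketch runs the same conversion with dynamic-range quantization only (no float16 target spec); variable names are illustrative. This can help isolate whether the crash is specific to the float16 path.

# Dynamic-range quantization only: same saved model, no supported_types override
converter_dr = tf.lite.TFLiteConverter.from_saved_model("/content/model/")
converter_dr.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_dr_model = converter_dr.convert()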
Black3rror added the bug label on Aug 30, 2023

cdh4696 commented Aug 31, 2023

@yyoon Could you please check? Thanks!


malloyca commented Sep 14, 2023

I have also encountered this problem using TensorFlow 2.12.1 on my system. Non-optimized conversion works fine with LSTM, but float16 optimization is causing my kernel to crash repeatedly.
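
For reference, a minimal sketch of the non-optimized conversion path described above, assuming the same saved model directory as in the original snippet:

import tensorflow as tf

# Plain float32 conversion, no optimizations set
converter = tf.lite.TFLiteConverter.from_saved_model("/content/model/")
tflite_model = converter.convert()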

@barrypitman

Same problem here.
