shape mismatch error using resnet #1770

Open
1 of 4 tasks
milesOIST opened this issue May 14, 2024 · 2 comments

milesOIST commented May 14, 2024

Bug description

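When training a single-instance model with a ResNet152 backbone (max stride 32, output stride 2) on 432 × 576 frames, training fails at the first epoch with a shape mismatch: the model outputs confidence maps of height 224, but the ground-truth maps have height 216. The full log is below.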

Expected behaviour

Actual behaviour

Your personal set up

  • OS: Windows 10 Pro
  • Version(s): SLEAP 1.3.0 (Python 3.7.12, TensorFlow 2.6.3, per the package list below)
Environment packages

# packages in environment at C:\Users\ONS\anaconda3\envs\sleap:
#
# Name Version Build Channel

absl-py 0.15.0 pypi_0 pypi
aom 3.5.0 h63175ca_0 conda-forge
astunparse 1.6.3 pypi_0 pypi
attrs 21.2.0 pypi_0 pypi
backports-zoneinfo 0.2.1 pypi_0 pypi
bzip2 1.0.8 he774522_0
ca-certificates 2023.05.30 haa95532_0
cached-property 1.5.2 py_0
cachetools 4.2.4 pypi_0 pypi
cattrs 1.1.1 pypi_0 pypi
certifi 2021.10.8 pypi_0 pypi
charset-normalizer 2.0.12 pypi_0 pypi
clang 5.0 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
commonmark 0.9.1 pypi_0 pypi
cuda-nvcc 11.3.58 hb8d16a4_0 nvidia
cudatoolkit 11.3.1 h59b6b97_2
cudnn 8.2.1 cuda11.3_0
cycler 0.11.0 pypi_0 pypi
dav1d 1.2.1 h2bbff1b_0
efficientnet 1.0.0 pypi_0 pypi
expat 2.5.0 h63175ca_1 conda-forge
ffmpeg 5.1.2 gpl_he426399_111 conda-forge
flatbuffers 1.12 pypi_0 pypi
font-ttf-dejavu-sans-mono 2.37 hd3eb1b0_0
font-ttf-inconsolata 2.001 hcb22688_0
font-ttf-source-code-pro 2.030 hd3eb1b0_0
font-ttf-ubuntu 0.83 h8b1ccd4_0
fontconfig 2.14.2 hbde0cde_0 conda-forge
fonts-anaconda 1 h8fa9717_0
fonts-conda-ecosystem 1 hd3eb1b0_0
fonttools 4.38.0 pypi_0 pypi
freetype 2.12.1 ha860e81_0
gast 0.4.0 pypi_0 pypi
geos 3.9.1 h6c2663c_0
google-auth 1.35.0 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.44.0 pypi_0 pypi
h5py 3.1.0 nompi_py37h19fda09_100 conda-forge
hdf5 1.10.6 h1756f20_1
hdmf 3.5.2 pypi_0 pypi
icc_rt 2022.1.0 h6049295_2
idna 3.3 pypi_0 pypi
image-classifiers 1.0.0 pypi_0 pypi
imageio 2.15.0 pypi_0 pypi
imgaug 0.4.0 pypi_0 pypi
imgstore 0.2.9 pypi_0 pypi
importlib-metadata 4.11.1 pypi_0 pypi
importlib-resources 5.12.0 pypi_0 pypi
intel-openmp 2023.1.0 h59b6b97_46319
joblib 1.2.0 pypi_0 pypi
jpeg 9e h2bbff1b_1
jsmin 3.0.1 pypi_0 pypi
jsonpickle 1.2 pypi_0 pypi
jsonschema 4.17.3 pypi_0 pypi
keras 2.6.0 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.4.4 pypi_0 pypi
lcms2 2.12 h83e58a3_0
lerc 3.0 hd77b12b_0
libblas 3.9.0 17_win64_mkl conda-forge
libcblas 3.9.0 17_win64_mkl conda-forge
libdeflate 1.10 h8ffe710_0 conda-forge
libexpat 2.5.0 h63175ca_1 conda-forge
libiconv 1.17 h8ffe710_0 conda-forge
liblapack 3.9.0 17_win64_mkl conda-forge
libopus 1.3.1 h8ffe710_1 conda-forge
libpng 1.6.39 h8cc25b3_0
libtiff 4.3.0 hc4061b1_4 conda-forge
libxml2 2.11.4 hc3477c8_0 conda-forge
libzlib 1.2.13 hcfcfb64_5 conda-forge
m2w64-gcc-libgfortran 5.3.0 6 conda-forge
m2w64-gcc-libs 5.3.0 7 conda-forge
m2w64-gcc-libs-core 5.3.0 7 conda-forge
m2w64-gmp 6.1.0 2 conda-forge
m2w64-libwinpthread-git 5.0.0.4634.697f757 2 conda-forge
markdown 3.3.6 pypi_0 pypi
matplotlib 3.5.3 pypi_0 pypi
mkl 2022.1.0 h6a75c08_874 conda-forge
msys2-conda-epoch 20160418 1 conda-forge
ndx-pose 0.1.1 pypi_0 pypi
networkx 2.6.3 pypi_0 pypi
nixio 1.5.3 pypi_0 pypi
numpy 1.19.5 py37h4c2b6ed_3 conda-forge
oauthlib 3.2.0 pypi_0 pypi
olefile 0.46 py37_0
opencv-python 4.5.5.62 pypi_0 pypi
opencv-python-headless 4.5.5.62 pypi_0 pypi
openh264 2.3.1 h63175ca_2 conda-forge
openjpeg 2.4.0 h4fc8c34_0
openssl 3.0.9 h2bbff1b_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pyhd3eb1b0_0
pandas 1.3.5 py37h9386db6_0 conda-forge
pillow 8.4.0 py37hd7d9ad0_0 conda-forge
pip 23.1.2 pyhd8ed1ab_0 conda-forge
pkgutil-resolve-name 1.3.10 pypi_0 pypi
protobuf 4.22.1 pypi_0 pypi
psutil 5.9.4 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pygments 2.14.0 pypi_0 pypi
pykalman 0.9.5 pypi_0 pypi
pynwb 2.3.1 pypi_0 pypi
pyparsing 3.0.7 pypi_0 pypi
pyreadline 2.1 py37_1
pyrsistent 0.19.3 pypi_0 pypi
pyside2 5.14.1 pypi_0 pypi
python 3.7.12 h900ac77_100_cpython conda-forge
python-dateutil 2.8.2 pyhd3eb1b0_0
python-rapidjson 1.10 pypi_0 pypi
python_abi 3.7 3_cp37m conda-forge
pytz 2022.7 py37haa95532_0
pytz-deprecation-shim 0.1.0.post0 pypi_0 pypi
pywavelets 1.3.0 pypi_0 pypi
pyzmq 25.0.2 pypi_0 pypi
qimage2ndarray 1.9.0 pypi_0 pypi
qtpy 2.2.0 py37haa95532_0
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rich 10.16.1 pypi_0 pypi
ruamel-yaml 0.17.21 pypi_0 pypi
ruamel-yaml-clib 0.2.7 pypi_0 pypi
scikit-image 0.19.3 pypi_0 pypi
scikit-learn 1.0.2 pypi_0 pypi
scikit-video 1.1.11 pypi_0 pypi
scipy 1.7.3 py37hb6553fb_0 conda-forge
seaborn 0.12.2 pypi_0 pypi
segmentation-models 1.0.1 pypi_0 pypi
setuptools 59.8.0 py37h03978a9_1 conda-forge
setuptools-scm 6.3.2 pypi_0 pypi
shapely 1.7.1 py37hc520ffa_5 conda-forge
shiboken2 5.14.1 pypi_0 pypi
six 1.15.0 py37haa95532_0
sleap 1.3.0 pypi_0 pypi
sqlite 3.41.2 h2bbff1b_0
svt-av1 1.4.1 h63175ca_0 conda-forge
tbb 2021.8.0 h59b6b97_0
tensorboard 2.6.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.6.3 pypi_0 pypi
tensorflow-estimator 2.6.0 pypi_0 pypi
tensorflow-hub 0.13.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tifffile 2021.11.2 pypi_0 pypi
tk 8.6.12 h2bbff1b_0
tomli 2.0.1 pypi_0 pypi
typing-extensions 3.10.0.2 pypi_0 pypi
tzdata 2022.7 pypi_0 pypi
tzlocal 4.3 pypi_0 pypi
ucrt 10.0.20348.0 haa95532_0
urllib3 1.26.8 pypi_0 pypi
vc 14.2 h21ff451_1
vc14_runtime 14.34.31931 h5081d32_16 conda-forge
vs2015_runtime 14.34.31931 hed1258a_16 conda-forge
werkzeug 2.0.3 pypi_0 pypi
wheel 0.38.4 py37haa95532_0
wrapt 1.12.1 pypi_0 pypi
x264 1!164.3095 h8ffe710_2 conda-forge
x265 3.5 h2d74725_3 conda-forge
xz 5.2.6 h8d14728_0 conda-forge
zipp 3.7.0 pypi_0 pypi
zlib 1.2.13 hcfcfb64_5 conda-forge
zstd 1.5.2 h12be248_6 conda-forge

Logs

INFO:sleap.nn.training:
INFO:sleap.nn.training:Auto-selected GPU 0 with 16183 MiB of free memory.
INFO:sleap.nn.training:Using GPU 0 for acceleration.
INFO:sleap.nn.training:Disabled GPU memory pre-allocation.
INFO:sleap.nn.training:System:
GPUs: 1/1 available
Device: /physical_device:GPU:0
Available: True
Initalized: False
Memory growth: True
INFO:sleap.nn.training:
INFO:sleap.nn.training:Initializing trainer...
INFO:sleap.nn.training:Loading training labels from: Z:/KuhnU/Miles-Kuhn/SLEAP/NewModel2.slp
INFO:sleap.nn.training:Creating training and validation splits from validation fraction: 0.1
INFO:sleap.nn.training: Splits: Training = 1416 / Validation = 157.
INFO:sleap.nn.training:Setting up for training...
INFO:sleap.nn.training:Setting up pipeline builders...
INFO:sleap.nn.training:Setting up model...
INFO:sleap.nn.training:Building test pipeline...
2024-05-14 20:16:14.066762: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-14 20:16:16.252042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13599 MB memory: -> device: 0, name: NVIDIA RTX A4000, pci bus id: 0000:01:00.0, compute capability: 8.6
2024-05-14 20:16:19.767318: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
INFO:sleap.nn.training:Loaded test example. [273.244s]
INFO:sleap.nn.training: Input shape: (432, 576, 1)
INFO:sleap.nn.training:Created Keras model.
INFO:sleap.nn.training: Backbone: ResNet152(upsampling_stack=UpsamplingStack(output_stride=2, upsampling_stride=2, transposed_conv=False, transposed_conv_filters=64, transposed_conv_filters_rate=1.0, transposed_conv_kernel_size=4, transposed_conv_batchnorm=True, make_skip_connection=False, skip_add=False, refine_convs=2, refine_convs_filters=64, refine_convs_filters_rate=1.0, refine_convs_batchnorm=True), features_output_stride=32, pretrained=True, frozen=True, skip_connections=False, model_name='resnet152', stack_configs=[{'filters': 64, 'blocks': 3, 'stride1': 1, 'name': 'conv2', 'dilation_rate': 1}, {'filters': 128, 'blocks': 8, 'stride1': 2, 'name': 'conv3', 'dilation_rate': 1}, {'filters': 256, 'blocks': 36, 'stride1': 2, 'name': 'conv4', 'dilation_rate': 1}, {'filters': 512, 'blocks': 3, 'stride1': 2, 'name': 'conv5', 'dilation_rate': 1}])
INFO:sleap.nn.training: Max stride: 32
INFO:sleap.nn.training: Parameters: 59,811,915
INFO:sleap.nn.training: Heads:
INFO:sleap.nn.training: [0] = SingleInstanceConfmapsHead(part_names=['eyelid top', 'eyelid bottom', 'nose right', 'nose left', 'spout', 'mouth lip top', 'mouth corner', 'paw right', 'paw left', 'tongue', 'mouth lip bottom'], sigma=2.5, output_stride=2, loss_weight=1.0)
INFO:sleap.nn.training: Outputs:
INFO:sleap.nn.training: [0] = KerasTensor(type_spec=TensorSpec(shape=(None, 224, 288, 11), dtype=tf.float32, name=None), name='SingleInstanceConfmapsHead/BiasAdd:0', description="created by layer 'SingleInstanceConfmapsHead'")
INFO:sleap.nn.training:Training from scratch
INFO:sleap.nn.training:Setting up data pipelines...
INFO:sleap.nn.training:Training set: n = 1416
INFO:sleap.nn.training:Validation set: n = 157
INFO:sleap.nn.training:Setting up optimization...
INFO:sleap.nn.training: OHKM enabled: HardKeypointMiningConfig(online_mining=True, hard_to_easy_ratio=2.0, min_hard_keypoints=2, max_hard_keypoints=None, loss_scale=5.0)
INFO:sleap.nn.training: Learning rate schedule: LearningRateScheduleConfig(reduce_on_plateau=True, reduction_factor=0.5, plateau_min_delta=1e-06, plateau_patience=5, plateau_cooldown=3, min_learning_rate=1e-08)
INFO:sleap.nn.training: Early stopping: EarlyStoppingConfig(stop_training_on_plateau=True, plateau_min_delta=1e-08, plateau_patience=15)
INFO:sleap.nn.training:Setting up outputs...
INFO:sleap.nn.callbacks:Training controller subscribed to: tcp://127.0.0.1:9000 (topic: )
INFO:sleap.nn.training: ZMQ controller subcribed to: tcp://127.0.0.1:9000
INFO:sleap.nn.callbacks:Progress reporter publishing on: tcp://127.0.0.1:9001 for: not_set
INFO:sleap.nn.training: ZMQ progress reporter publish on: tcp://127.0.0.1:9001
INFO:sleap.nn.training:Created run path: Z:/KuhnU/Miles-Kuhn/SLEAP\models\240514_201219.single_instance.n=1573
INFO:sleap.nn.training:Setting up visualization...
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\sleap\nn\inference.py:1177: UserWarning: Model input of shape (None, 432, 576, 1) does not divide evenly with output of shape (None, 224, 288, 11).
f"Model input of shape {model.inputs[input_ind].shape} does not divide "
INFO:sleap.nn.training:Finished trainer set up. [314.1s]
INFO:sleap.nn.training:Creating tf.data.Datasets for training data generation...
INFO:sleap.nn.training:Finished creating training datasets. [5796.8s]
INFO:sleap.nn.training:Starting training loop...
Epoch 1/500
Traceback (most recent call last):
File "C:\Users\ONS\anaconda3\envs\sleap\Scripts\sleap-train-script.py", line 33, in <module>
sys.exit(load_entry_point('sleap==1.3.0', 'console_scripts', 'sleap-train')())
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\sleap\nn\training.py", line 2014, in main
trainer.train()
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\sleap\nn\training.py", line 943, in train
verbose=2,
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\keras\engine\training.py", line 1184, in fit
tmp_logs = self.train_function(iterator)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\def_function.py", line 885, in __call__
result = self._call(*args, **kwds)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\def_function.py", line 933, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\def_function.py", line 760, in _initialize
*args, **kwds))
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\function.py", line 3066, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\function.py", line 3463, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\function.py", line 3308, in _create_graph_function
capture_by_value=self._capture_by_value),
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\func_graph.py", line 1007, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\eager\def_function.py", line 668, in wrapped_fn
out = weak_wrapped_fn().wrapped(*args, **kwds)
File "C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\func_graph.py", line 994, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\keras\engine\training.py:853 train_function  *
    return step_function(self, iterator)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\sleap\nn\training.py:303 loss_fn  *
    loss += loss_fn(y_gt, y_pr)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\keras\losses.py:141 __call__  **
    losses = call_fn(y_true, y_pred)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\keras\losses.py:245 call  **
    return ag_fn(y_true, y_pred, **self._fn_kwargs)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\util\dispatch.py:206 wrapper
    return target(*args, **kwargs)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\keras\losses.py:1204 mean_squared_error
    return backend.mean(tf.math.squared_difference(y_pred, y_true), axis=-1)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\ops\gen_math_ops.py:10514 squared_difference
    "SquaredDifference", x=x, y=y, name=name)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\op_def_library.py:750 _apply_op_helper
    attrs=attr_protos, op_def=op_def)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\func_graph.py:601 _create_op_internal
    compute_device)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\ops.py:3569 _create_op_internal
    op_def=op_def)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\ops.py:2042 __init__
    control_input_ops, op_def)
C:\Users\ONS\anaconda3\envs\sleap\lib\site-packages\tensorflow\python\framework\ops.py:1883 _create_c_op
    raise ValueError(str(e))

ValueError: Dimensions must be equal, but are 224 and 216 for '{{node loss_fn/mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](model/SingleInstanceConfmapsHead/BiasAdd, IteratorGetNext:1)' with input shapes: [15,224,288,11], [15,216,288,?].

INFO:sleap.nn.callbacks:Closing the reporter controller/context.
INFO:sleap.nn.callbacks:Closing the training controller socket/context.
Run Path: Z:/KuhnU/Miles-Kuhn/SLEAP\models\240514_201219.single_instance.n=1573
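
One way to read the shapes in the traceback above: the 432-pixel input height is not a multiple of the max stride of 32, while the 576-pixel width is. The sketch below reproduces the 224 vs 216 numbers by arithmetic alone. It assumes that each stride-2 stage of the backbone rounds up (as 'same'-padded strided convolutions do) and that the upsampling stack then scales by max stride / output stride. The function names are hypothetical and are not part of SLEAP; this is a reading of the log, not a confirmed diagnosis.

```python
import math


def predicted_output_size(input_size, max_stride, output_stride):
    """Output size of an encoder-decoder whose encoder rounds up when downsampling."""
    # Assumption: stride-2 stages with 'same' padding give ceil(n / 2) each,
    # so the coarsest feature map has ceil(n / max_stride) cells.
    coarsest = math.ceil(input_size / max_stride)
    # Assumption: the upsampling stack scales by max_stride / output_stride.
    return coarsest * (max_stride // output_stride)


def target_size(input_size, output_stride):
    """Size of the ground-truth confidence maps: input divided by output stride."""
    return input_size // output_stride


height, width = 432, 576            # "Input shape: (432, 576, 1)" in the log
max_stride, output_stride = 32, 2   # "Max stride: 32" and "output_stride=2" in the log

pred = (predicted_output_size(height, max_stride, output_stride),
        predicted_output_size(width, max_stride, output_stride))
true = (target_size(height, output_stride), target_size(width, output_stride))

print(pred)  # (224, 288), matches the model output shape in the error
print(true)  # (216, 288), matches the ground-truth shape in the error
```

Under these assumptions, the height goes 432 → 14 (rounded up from 13.5) → 224, while the target is 432 / 2 = 216. The width divides evenly (576 / 32 = 18), so both sides agree on 288.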

Screenshots

How to reproduce

milesOIST added the bug label May 14, 2024
eberrigan self-assigned this May 14, 2024
@eberrigan
Contributor

Hi @milesOIST,

Would you mind uploading a sleap package with your training data here so I can try replicating your issue?

Also, you mentioned certain settings. Which settings did you notice this error happening with?

Thanks!

Elizabeth
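
As a rough aid for narrowing down which settings are involved, the sketch below checks the 432 × 576 input size from the log against a few candidate max strides and reports how much padding would make each dimension a multiple of the stride. The helper name `padding_needed` and the list of candidate strides are illustrative assumptions, not SLEAP API or settings confirmed in this thread, and nothing here confirms that padding or resizing resolves the error.

```python
def padding_needed(size, max_stride):
    """Rows or columns to add so that size becomes a multiple of max_stride."""
    return (-size) % max_stride


height, width = 432, 576  # input size reported in the training log

for max_stride in (8, 16, 32, 64):  # hypothetical candidates to compare
    pad_h = padding_needed(height, max_stride)
    pad_w = padding_needed(width, max_stride)
    if pad_h == 0 and pad_w == 0:
        status = "divides evenly"
    else:
        status = f"pad height by {pad_h}, width by {pad_w}"
    print(max_stride, status)
```

For this input size, strides 8 and 16 divide both dimensions evenly, while 32 (the value in the log) and 64 both leave the height 16 rows short of a multiple.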

@eberrigan
Contributor

This could be related to #1768.
