SLEAP can overload RAM when many instances detected #1635

Open
1 of 4 tasks
aperkes opened this issue Dec 13, 2023 · 2 comments
Labels
bug Something isn't working

aperkes commented Dec 13, 2023

Bug description

In short, SLEAP can easily overload RAM when the array of tracks becomes large. In my case, it tried to pin a 34 GB object in host memory, which completely froze the system. This is particularly bad for long videos with noisy backgrounds, e.g., recording all day in a naturalistic environment (which is unfortunately the bread and butter of our lab). It has happened on both Ubuntu and Windows. I've run into this issue in other contexts in the past (see #1288), but the most recent occurrence is particularly bad because it completely locks up the system, requiring a hard reset. After some experimenting, I have found that I can generally prevent this by limiting max_instances per frame, and looking back at the previous issues, I see that there is now a --tracking.max_tracks argument that should put a hard cap on the proliferation of tracks. Still, I think my suggestions below might be worthwhile, given how frustrating it is to have your whole computer freeze, especially if you're working on a remote server.

Expected behaviour

Ideally, I would expect it to a) not need so much RAM that it freezes the system and b) if it does, either raise a warning and adjust, or raise an error and exit, rather than crashing the whole computer.

If I understand correctly, SLEAP generates a dense array of tracks, so it can be very memory intensive for long videos with many tracklets. I understand there may be performance or dependency issues that make changing this difficult, but I wonder whether it could be implemented as a sparse array, so that the size does not scale with frames multiplied by tracks.
Barring that, it would be useful to add some memory controls so that SLEAP can fail gracefully when it is beginning to overload the system (e.g., when attempting to generate an object that is bigger than either of the sticks of RAM). Resource management isn't something I understand well, though, so this might not be feasible.
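
As a rough illustration of both suggestions, here is a back-of-the-envelope sketch in Python. It assumes a dense float32 array of shape (frames, tracks, nodes, 2), which is an illustrative guess and not SLEAP's confirmed internal layout. The function names, the 5-node skeleton, and the 10,000-tracklet count are all hypothetical.

```python
def dense_track_bytes(n_frames, n_tracks, n_nodes, bytes_per_value=4):
    """Bytes for a dense (frames, tracks, nodes, 2) array of x/y coordinates."""
    return n_frames * n_tracks * n_nodes * 2 * bytes_per_value


def sparse_track_bytes(n_instances, n_nodes, bytes_per_value=4, index_bytes=16):
    """Bytes when only detected instances are stored, plus a (frame, track) index each."""
    return n_instances * (n_nodes * 2 * bytes_per_value + index_bytes)


def check_allocation(requested_bytes, available_bytes, safety_fraction=0.8):
    """Raise MemoryError up front instead of attempting an oversized allocation."""
    budget = int(available_bytes * safety_fraction)
    if requested_bytes > budget:
        raise MemoryError(
            f"refusing to allocate {requested_bytes} bytes (budget {budget} bytes)"
        )
    return requested_bytes


# Hypothetical numbers: a 30 min video at 25 fps, 8 detections per frame,
# 5 skeleton nodes, and a tracker that has spawned 10,000 short tracklets.
n_frames = 30 * 60 * 25
dense = dense_track_bytes(n_frames, n_tracks=10_000, n_nodes=5)
sparse = sparse_track_bytes(n_instances=n_frames * 8, n_nodes=5)
print(f"dense:  {dense / 2**30:.2f} GiB")
print(f"sparse: {sparse / 2**30:.4f} GiB")
```

Under these assumptions the dense layout grows with the number of tracks even when most cells are empty, while the sparse layout grows only with the number of detections. A check like `check_allocation` is one way a tool could raise an ordinary error before the allocation is attempted.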

Actual behaviour

When running inference on a 30 min video (25 fps), my computer suddenly froze. Looking back at the log, this is what it reported before it stopped (there are more logs if you want them):

2023-12-12 21:36:59.476568: E tensorflow/stream_executor/cuda/cuda_driver.cc:794] failed to alloc 34357641216 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2023-12-12 21:36:59.477290: W ./tensorflow/core/common_runtime/device/device_host_allocator.h:46] could not allocate pinned host memory of size: 34357641216
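
For scale, the requested size can be pulled out of the log line and converted to more readable units. This is a small sketch; the helper name `failed_alloc_bytes` is hypothetical. The figure works out to about 34.36 GB, which is exactly 2 MiB short of 32 GiB.

```python
import re

LOG_LINE = (
    "2023-12-12 21:36:59.476568: E tensorflow/stream_executor/cuda/cuda_driver.cc:794] "
    "failed to alloc 34357641216 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory"
)


def failed_alloc_bytes(line):
    """Return the byte count from a 'failed to alloc N bytes' log line, or None."""
    match = re.search(r"failed to alloc (\d+) bytes", line)
    return int(match.group(1)) if match else None


n_bytes = failed_alloc_bytes(LOG_LINE)
print(f"{n_bytes / 1e9:.2f} GB, {n_bytes / 2**30:.3f} GiB")
print(f"shortfall from 32 GiB: {(32 * 2**30 - n_bytes) // 2**20} MiB")
```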

Your personal set up

  • OS: Ubuntu 20.04
  • Version(s): SLEAP v1.3.0, Python 3.7.12
Environment packages
# packages in environment at /home/ammon/anaconda3/envs/sleap130:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
absl-py                   0.15.0                   pypi_0    pypi
aom                       3.5.0                h27087fc_0    conda-forge
astunparse                1.6.3                    pypi_0    pypi
attrs                     21.2.0                   pypi_0    pypi
backports-zoneinfo        0.2.1                    pypi_0    pypi
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                4.2.4                    pypi_0    pypi
cattrs                    1.1.1                    pypi_0    pypi
certifi                   2021.10.8                pypi_0    pypi
charset-normalizer        2.0.12                   pypi_0    pypi
clang                     5.0                      pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
commonmark                0.9.1                    pypi_0    pypi
cuda-nvcc                 11.3.58              h2467b9f_0    nvidia
cudatoolkit               11.3.1               ha36c431_9    nvidia
cudnn                     8.2.1.32             h86fa8c9_0    conda-forge
cycler                    0.11.0                   pypi_0    pypi
efficientnet              1.0.0                    pypi_0    pypi
expat                     2.5.0                h27087fc_0    conda-forge
ffmpeg                    5.1.2           gpl_h8dda1f0_106    conda-forge
flatbuffers               1.12                     pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.38.0                   pypi_0    pypi
freetype                  2.12.1               hca18f0e_1    conda-forge
gast                      0.4.0                    pypi_0    pypi
geos                      3.9.1                h9c3ff4c_2    conda-forge
gettext                   0.21.1               h27087fc_0    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.7.8                hf3e180e_0    conda-forge
google-auth               1.35.0                   pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.44.0                   pypi_0    pypi
h5py                      3.1.0           nompi_py37h1e651dc_100    conda-forge
hdf5                      1.10.6          nompi_h6a2412b_1114    conda-forge
hdmf                      3.5.2                    pypi_0    pypi
icu                       72.1                 hcb278e6_0    conda-forge
idna                      3.3                      pypi_0    pypi
image-classifiers         1.0.0                    pypi_0    pypi
imageio                   2.15.0                   pypi_0    pypi
imgaug                    0.4.0                    pypi_0    pypi
imgstore                  0.2.9                    pypi_0    pypi
importlib-metadata        4.11.1                   pypi_0    pypi
importlib-resources       5.12.0                   pypi_0    pypi
joblib                    1.2.0                    pypi_0    pypi
jpeg                      9e                   h0b41bf4_3    conda-forge
jsmin                     3.0.1                    pypi_0    pypi
jsonpickle                1.2                      pypi_0    pypi
jsonschema                4.17.3                   pypi_0    pypi
keras                     2.6.0                    pypi_0    pypi
keras-applications        1.0.8                    pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4                    pypi_0    pypi
krb5                      1.20.1               hf9c8cef_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      3.0                  h9c3ff4c_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcurl                   7.87.0               h6312ad2_0    conda-forge
libdeflate                1.10                 h7f98852_0    conda-forge
libdrm                    2.4.114              h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
libidn2                   2.3.4                h166bdaf_0    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.51.0               hdcd2b5c_0    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpciaccess              0.17                 h166bdaf_0    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtasn1                  4.19.0               h166bdaf_0    conda-forge
libtiff                   4.3.0                h0fcbabc_4    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libva                     2.18.0               h0b41bf4_0    conda-forge
libvpx                    1.11.0               h9c3ff4c_3    conda-forge
libwebp-base              1.3.0                h0b41bf4_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxml2                   2.10.3               hfdac1af_6    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
markdown                  3.3.6                    pypi_0    pypi
matplotlib                3.5.3                    pypi_0    pypi
ncurses                   6.3                  h27087fc_1    conda-forge
ndx-pose                  0.1.1                    pypi_0    pypi
nettle                    3.8.1                hc379101_1    conda-forge
networkx                  2.6.3                    pypi_0    pypi
nixio                     1.5.3                    pypi_0    pypi
numpy                     1.19.5           py37h3e96413_3    conda-forge
oauthlib                  3.2.0                    pypi_0    pypi
olefile                   0.46               pyh9f0ad1d_1    conda-forge
opencv-python             4.5.5.62                 pypi_0    pypi
opencv-python-headless    4.5.5.62                 pypi_0    pypi
openh264                  2.3.1                hcb278e6_2    conda-forge
openjpeg                  2.5.0                h7d73246_0    conda-forge
openssl                   1.1.1t               h0b41bf4_0    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
p11-kit                   0.24.1               hc5aa10d_0    conda-forge
packaging                 21.3                     pypi_0    pypi
pandas                    1.3.5            py37he8f5f7f_0    conda-forge
pillow                    8.4.0            py37h0f21c89_0    conda-forge
pip                       23.0.1             pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10                   pypi_0    pypi
protobuf                  3.19.4                   pypi_0    pypi
psutil                    5.9.4                    pypi_0    pypi
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pygments                  2.14.0                   pypi_0    pypi
pykalman                  0.9.5                    pypi_0    pypi
pynwb                     2.3.1                    pypi_0    pypi
pyparsing                 3.0.7                    pypi_0    pypi
pyrsistent                0.19.3                   pypi_0    pypi
pyside2                   5.14.1                   pypi_0    pypi
python                    3.7.12          hb7a2778_100_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-rapidjson          1.10                     pypi_0    pypi
python_abi                3.7                     3_cp37m    conda-forge
pytz                      2023.2             pyhd8ed1ab_0    conda-forge
pytz-deprecation-shim     0.1.0.post0              pypi_0    pypi
pywavelets                1.3.0                    pypi_0    pypi
pyzmq                     25.0.2                   pypi_0    pypi
qimage2ndarray            1.9.0                    pypi_0    pypi
qtpy                      2.3.0              pyhd8ed1ab_0    conda-forge
readline                  8.2                  h8228510_1    conda-forge
requests                  2.27.1                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rich                      10.16.1                  pypi_0    pypi
ruamel-yaml               0.17.21                  pypi_0    pypi
ruamel-yaml-clib          0.2.7                    pypi_0    pypi
scikit-image              0.19.3                   pypi_0    pypi
scikit-learn              1.0.2                    pypi_0    pypi
scikit-video              1.1.11                   pypi_0    pypi
scipy                     1.7.3            py37hf2a6cf1_0    conda-forge
seaborn                   0.12.2                   pypi_0    pypi
segmentation-models       1.0.1                    pypi_0    pypi
setuptools                59.8.0           py37h89c1867_1    conda-forge
setuptools-scm            6.3.2                    pypi_0    pypi
shapely                   1.7.1            py37h48c49eb_5    conda-forge
shiboken2                 5.14.1                   pypi_0    pypi
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sleap                     1.3.0                    pypi_0    pypi
sqlite                    3.40.0               h4ff8645_0    conda-forge
svt-av1                   1.4.1                hcb278e6_0    conda-forge
tensorboard               2.6.0                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tensorflow                2.6.3                    pypi_0    pypi
tensorflow-estimator      2.6.0                    pypi_0    pypi
tensorflow-hub            0.13.0                   pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
threadpoolctl             3.1.0                    pypi_0    pypi
tifffile                  2021.11.2                pypi_0    pypi
tk                        8.6.12               h27826a3_0    conda-forge
tomli                     2.0.1                    pypi_0    pypi
tqdm                      4.66.1                   pypi_0    pypi
typing-extensions         3.10.0.2                 pypi_0    pypi
tzdata                    2022.7                   pypi_0    pypi
tzlocal                   4.3                      pypi_0    pypi
urllib3                   1.26.8                   pypi_0    pypi
werkzeug                  2.0.3                    pypi_0    pypi
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
wrapt                     1.12.1                   pypi_0    pypi
x264                      1!164.3095           h166bdaf_2    conda-forge
x265                      3.5                  h924138e_3    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libx11               1.8.4                h0b41bf4_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zipp                      3.7.0                    pypi_0    pypi
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge
Logs
2023-12-12 21:06:18.445037: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:690] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -95 } dim { size: -96 } dim { size: -97 } dim { size: 1 } } } inputs { dtype: DT_FLOAT shape { dim { size: -14 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -14 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "GPU" vendor: "NVIDIA" model: "NVIDIA GeForce RTX 3060" frequency: 1867 num_cores: 28 environment { key: "architecture" value: "8.6" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 2359296 shared_memory_size_per_multiprocessor: 102400 memory_size: 10033496064 bandwidth: 360048000 } outputs { dtype: DT_FLOAT shape { dim { size: -14 } dim { size: -98 } dim { size: -99 } dim { size: 1 } } }
2023-12-12 21:06:19.408565: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8201
2023-12-12 21:36:59.476568: E tensorflow/stream_executor/cuda/cuda_driver.cc:794] failed to alloc 34357641216 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2023-12-12 21:36:59.477290: W ./tensorflow/core/common_runtime/device/device_host_allocator.h:46] could not allocate pinned host memory of size: 34357641216

Screenshots

Here's a picture from the video that crashed it. Incidentally, this isn't even a video we need to process: the fish had already been removed days before, but someone forgot to change the camera schedule. So you can see it's really a worst-case scenario, with many noisy, fish-like background detections.
[Image: Screen Shot 2023-12-13 at 10 01 04 AM]

How to reproduce

If you'd like, I can share the video and SLEAP models that caused this. Here is the command I ran (from within a snakemake pipeline):
sleap-track -m {params.centered} -m {params.centroid} --peak_threshold 0.4 --tracking.tracker simple --tracking.similarity centroid --tracking.track_window 5 {input} -o snake/sleap/{wildcards.video}.predictions.slp 2>> {params.log}

Since it happened, I've switched to setting --tracking.target_instance_count to 8 (there are 4 fish, but I do some post-processing to filter out bad detections). It hasn't failed with that on, although I think it theoretically could if track assembly went badly. Last night I accidentally used the old command and froze my system again while working remotely, so I wrote this up while waiting for someone to get to the lab to reset it.

As always, I really appreciate everything all of you do to make this such an amazing package. Over the break we are set to process thousands of fish-days' worth of data, so thanks for making that possible.

aperkes added the bug label on Dec 13, 2023

aperkes commented Feb 12, 2024

Follow up (and not so sneaky bump)

I was able to at least prevent my computer from crashing by using ulimit -v 28000000. This was stricter than it needed to be (some runs get killed by ulimit when they would have been able to finish without eating all the RAM), but it at least prevented my computer from freezing up unexpectedly. I still do not know how to run these in a way that produces useful output.
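
For anyone launching jobs from Python rather than a shell, the same cap can be sketched with the standard library's `resource` module. In bash, `ulimit -v` takes KiB, so 28000000 corresponds to 28,672,000,000 bytes (about 26.7 GiB). The helper names below are hypothetical, the child process is a stand-in rather than a real sleap-track call, and this applies to Linux/Unix only.

```python
import resource
import subprocess
import sys


def kib_to_bytes(kib):
    """bash `ulimit -v` takes KiB; RLIMIT_AS takes bytes."""
    return kib * 1024


def run_with_memory_cap(cmd, limit_kib):
    """Run cmd with its virtual address space capped (Linux/Unix only)."""
    limit = kib_to_bytes(limit_kib)

    def apply_limit():
        # Runs in the child just before exec; the existing hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limit if hard == resource.RLIM_INFINITY else min(limit, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit)


if __name__ == "__main__":
    # Stand-in child that tries to allocate 2 GiB under a 1 GiB cap. It should
    # exit with an error instead of exhausting the host's memory.
    child = [sys.executable, "-c", "x = bytearray(2 * 2**30)"]
    result = run_with_memory_cap(child, limit_kib=1024 * 1024)
    print("exit code:", result.returncode)
```

As with the shell version, the cap applies to virtual address space rather than resident memory, which is one reason it can kill runs that would otherwise have finished.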

I tried using --tracking.max_tracks, but that doesn't seem to work? I set max tracks to 20 but still got hundreds of tracks on a 2500-frame sample video.

For reference, here are the parameters used for max tracks:

│ 'predictor': 'TopDownPredictor',
│ 'sleap_version': '1.3.0',
│ 'platform': 'Linux-5.15.0-91-generic-x86_64-with-debian-bullseye-sid',
│ 'command': '/home/ammon/anaconda3/envs/sleap130/bin/sleap-track -m /data/sleapModels/leap.take2.centered_instance.403/ -m /data/sleapModels/leap.take2.centroid.403/ /home/ammon/Documents/Scripts/FishTrack/working_dir/pi19.2023.06.13.short.mp4 --peak_threshold 0.55 --tracking.similarity iou --tracking.match hungarian --tracking.tracker simple --tracking.target_instance_count 8 --tracking.pre_cull_to_target 1 --tracking.track_window 5 -o /home/ammon/Documents/Scripts/FishTrack/working_dir/pi19.2023.06.13.limited.slp --tracking.max_tracking 1 --tracking.max_tracks 20',

aperkes commented Feb 12, 2024

Another update: after updating to a more recent version of SLEAP (1.3.3) and using the --tracking.tracker simplemaxtracks option, it now works properly and (presumably) will not overflow memory anymore. I'll add more updates if I find anything else important.
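
For reference, a command along these lines could be assembled in Python as follows. This is a sketch that uses only flags quoted earlier in this thread. Combining `--tracking.max_tracking` and `--tracking.max_tracks` with the simplemaxtracks tracker is an assumption, since the exact final command was not posted. The function name and the paths are placeholders.

```python
import shlex


def build_sleap_track_cmd(video, models, output, max_tracks=20, target_instances=8):
    """Assemble a sleap-track argument list from flags quoted in this thread."""
    cmd = ["sleap-track"]
    for model in models:
        cmd += ["-m", model]
    cmd += [
        video,
        "--tracking.tracker", "simplemaxtracks",
        "--tracking.max_tracking", "1",
        "--tracking.max_tracks", str(max_tracks),
        "--tracking.target_instance_count", str(target_instances),
        "--tracking.pre_cull_to_target", "1",
        "-o", output,
    ]
    return cmd


# Placeholder paths; substitute real model directories and video files.
cmd = build_sleap_track_cmd(
    video="video.mp4",
    models=["models/centered_instance", "models/centroid"],
    output="video.predictions.slp",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```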
