Rolling min/max gives malloc error #30726

Closed
Arsilla opened this issue Jan 6, 2020 · 6 comments · Fixed by #33693
Labels
Performance Memory or execution speed performance Window rolling, ewma, expanding

@Arsilla

Arsilla commented Jan 6, 2020

Code Sample

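import pandas as pd
import numpy as np
import skimage
from scipy import signal

# input_img and bw_img are defined elsewhere in the calling function;
# bw_img is a 2D array (black and white image).
for orient in [0, 1]:
    # Rolling window size: 1% of the image extent along this axis.
    th = int(input_img.shape[orient] / 100)

    peaks, info = signal.find_peaks(1 - bw_img.mean(orient), prominence=.35, width=2)
    for pk, w in zip(peaks, info['widths']):
        w *= 2
        # Take the column or row of the image at the peak position.
        if orient == 0:
            sign = bw_img[:, pk]
        else:
            sign = bw_img[pk, :]
        # This is the call that triggers the crash.
        sign = pd.Series(sign).rolling(th).max()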

Problem description

The above snippet is part of a function called in my main script. Running this results in either a malloc: Incorrect checksum for freed object 0x7fbf626f1f30: probably modified after being freed. error or a segmentation fault.
The culprit appears to be the rolling().max() line, since commenting out the line fixes the issue, as does replacing .max() with .mean().

I can't seem to recreate the error running the above snippet alone, and I cannot figure out why. The input (bw_img) is just a 2D array (black and white image).

It might be related to issue #25893, except my memory doesn't seem to be leaking. The two variants I keep seeing are a failed checksum after deallocated memory was modified, or an attempted modification of deallocated memory being caught.

python version: 3.6.5 (also tested on 3.7.0)
pandas version 0.25.3 (also tested 0.24 and 0.23)

Below the stacktrace:

Process:               python3.6 [61410]
Path:                  /Users/USER/*/python3.6
Identifier:            python3.6
Version:               ???
Code Type:             X86-64 (Native)
Parent Process:        zsh [41537]
Responsible:           python3.6 [61410]
User ID:               305159407

Date/Time:             2020-01-06 09:43:30.365 +0100
OS Version:            Mac OS X 10.14.3 (18D109)
Report Version:        12
Bridge OS Version:     3.0 (14Y674)
Anonymous UUID:        842CB73B-82E5-7A43-1D47-0BCD9BFB56A9


Time Awake Since Boot: 5500 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
abort() called
python(61410,0x1134fe5c0) malloc: Incorrect checksum for freed object 0x7f8c83801610: probably modified after being freed.
Corrupt value: 0x28
 

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x00007fff5b5fb23e __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fff5b6b1c1c pthread_kill + 285
2   libsystem_c.dylib             	0x00007fff5b5641c9 abort + 127
3   libsystem_malloc.dylib        	0x00007fff5b6736e2 malloc_vreport + 545
4   libsystem_malloc.dylib        	0x00007fff5b68786c malloc_zone_error + 184
5   libsystem_malloc.dylib        	0x00007fff5b670103 tiny_free_list_remove_ptr + 544
6   libsystem_malloc.dylib        	0x00007fff5b66daee tiny_free_no_lock + 933
7   libsystem_malloc.dylib        	0x00007fff5b66d631 free_tiny + 483
8   _multiarray_umath.cpython-36m-darwin.so	0x0000000109b3357d _buffer_clear_info + 109
9   _multiarray_umath.cpython-36m-darwin.so	0x0000000109b334ff _dealloc_cached_buffer_info + 79
10  _multiarray_umath.cpython-36m-darwin.so	0x0000000109ae5152 array_dealloc + 18
11  window.cpython-36m-darwin.so  	0x000000012c56a8a6 __pyx_fuse_9__pyx_f_6pandas_5_libs_6window__roll_min_max(tagPyArrayObject_fields*, long, long, _object*, _object*, int) + 3206
12  window.cpython-36m-darwin.so  	0x000000012c569619 __pyx_fuse_9__pyx_pw_6pandas_5_libs_6window_59roll_max(_object*, _object*, _object*) + 425
13  algos.cpython-36m-darwin.so   	0x000000012aa1a42c __pyx_FusedFunction_call + 812
14  python                        	0x0000000109410ae5 PyObject_Call + 101
15  python                        	0x00000001094eea1b _PyEval_EvalFrameDefault + 25787
16  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
17  python                        	0x00000001094f2aeb fast_function + 411
18  python                        	0x00000001094f1729 call_function + 553
19  python                        	0x00000001094ee694 _PyEval_EvalFrameDefault + 24884
20  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
21  python                        	0x00000001094f2aeb fast_function + 411
22  python                        	0x00000001094f1729 call_function + 553
23  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
24  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
25  python                        	0x00000001094f2ede _PyFunction_FastCallDict + 606
26  python                        	0x0000000109410cba _PyObject_FastCallDict + 202
27  python                        	0x0000000109410e6c _PyObject_Call_Prepend + 156
28  python                        	0x0000000109410ae5 PyObject_Call + 101
29  python                        	0x00000001094eea1b _PyEval_EvalFrameDefault + 25787
30  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
31  python                        	0x00000001094f2ede _PyFunction_FastCallDict + 606
32  python                        	0x0000000109410cba _PyObject_FastCallDict + 202
33  python                        	0x0000000109410e6c _PyObject_Call_Prepend + 156
34  python                        	0x0000000109410ae5 PyObject_Call + 101
35  python                        	0x00000001094eea1b _PyEval_EvalFrameDefault + 25787
36  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
37  python                        	0x00000001094f2aeb fast_function + 411
38  python                        	0x00000001094f1729 call_function + 553
39  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
40  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
41  python                        	0x00000001094f2aeb fast_function + 411
42  python                        	0x00000001094f1729 call_function + 553
43  python                        	0x00000001094ee694 _PyEval_EvalFrameDefault + 24884
44  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
45  python                        	0x00000001094f2aeb fast_function + 411
46  python                        	0x00000001094f1729 call_function + 553
47  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
48  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
49  python                        	0x00000001094e84d0 PyEval_EvalCode + 48
50  python                        	0x000000010952156e PyRun_FileExFlags + 174
51  python                        	0x0000000109520b1a PyRun_SimpleFileExFlags + 266
52  python                        	0x000000010953d8b6 Py_Main + 3542
53  python                        	0x0000000109405c78 main + 248
54  libdyld.dylib                 	0x00007fff5b4bbed9 start + 1

Thread 1:
0   libsystem_kernel.dylib        	0x00007fff5b5f87de __psynch_cvwait + 10
1   libsystem_pthread.dylib       	0x00007fff5b6b2593 _pthread_cond_wait + 724
2   python                        	0x0000000109539f1f PyThread_acquire_lock_timed + 351
3   python                        	0x000000010954099f acquire_timed + 111
4   python                        	0x000000010954071c lock_PyThread_acquire_lock + 44
5   python                        	0x000000010945fbfb _PyCFunction_FastCallDict + 475
6   python                        	0x00000001094f175a call_function + 602
7   python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
8   python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
9   python                        	0x00000001094f2aeb fast_function + 411
10  python                        	0x00000001094f1729 call_function + 553
11  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
12  python                        	0x00000001094f2356 _PyEval_EvalCodeWithName + 2902
13  python                        	0x00000001094f2aeb fast_function + 411
14  python                        	0x00000001094f1729 call_function + 553
15  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
16  python                        	0x00000001094f2b89 fast_function + 569
17  python                        	0x00000001094f1729 call_function + 553
18  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
19  python                        	0x00000001094f2b89 fast_function + 569
20  python                        	0x00000001094f1729 call_function + 553
21  python                        	0x00000001094ee608 _PyEval_EvalFrameDefault + 24744
22  python                        	0x00000001094f3069 _PyFunction_FastCallDict + 1001
23  python                        	0x0000000109410cba _PyObject_FastCallDict + 202
24  python                        	0x0000000109410e6c _PyObject_Call_Prepend + 156
25  python                        	0x0000000109410ae5 PyObject_Call + 101
26  python                        	0x0000000109541216 t_bootstrap + 70
27  libsystem_pthread.dylib       	0x00007fff5b6af305 _pthread_body + 126
28  libsystem_pthread.dylib       	0x00007fff5b6b226f _pthread_start + 70
29  libsystem_pthread.dylib       	0x00007fff5b6ae415 thread_start + 13

Thread 2:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 3:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 4:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 5:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 6:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 7:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 8:
0   libsystem_pthread.dylib       	0x00007fff5b6ae3f8 start_wqthread + 0
1   ???                           	0x0000000054485244 0 + 1414025796

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00000001134fe5c0  rcx: 0x00007ffee67f8168  rdx: 0x0000000000000000
  rdi: 0x0000000000000307  rsi: 0x0000000000000006  rbp: 0x00007ffee67f81a0  rsp: 0x00007ffee67f8168
   r8: 0x0000000000000000   r9: 0x00007ffee67f80c0  r10: 0x0000000000000000  r11: 0x0000000000000206
  r12: 0x0000000000000307  r13: 0x0000000111e4d000  r14: 0x0000000000000006  r15: 0x000000000000002d
  rip: 0x00007fff5b5fb23e  rfl: 0x0000000000000206  cr2: 0x00007fff8e27a188
  
Logical CPU:     0
Error Code:      0x02000148
Trap Number:     133


VM Region Summary:
ReadOnly portion of Libraries: Total=708.7M resident=0K(0%) swapped_out_or_unallocated=708.7M(100%)
Writable regions: Total=483.8M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=483.8M(100%)
 
                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Activity Tracing                   256K        2 
Dispatch continuations            16.0M        2 
Kernel Alloc Once                    8K        2 
MALLOC                           170.5M       33 
MALLOC guard page                   16K        5 
MALLOC_LARGE (reserved)            256K        3         reserved VM address space (unallocated)
STACK GUARD                         36K       10 
Stack                             24.6M       10 
VM_ALLOCATE                      102.3M      174 
VM_ALLOCATE (reserved)           160.0M        4         reserved VM address space (unallocated)
__DATA                            42.7M      669 
__FONT_DATA                          4K        2 
__LINKEDIT                       253.2M      312 
__TEXT                           455.5M      557 
__UNICODE                          564K        2 
shared memory                       12K        4 
===========                     =======  ======= 
TOTAL                              1.2G     1775 
TOTAL, minus reserved VM space     1.0G     1775 


@s-scherrer
Contributor

In order to reproduce this, could you try to save the arguments passed to the last line (sign and th) to a file, and then try to run it only with these specific arguments?
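One way to act on this suggestion is sketched below. It is only an illustration: the helper names `dump_rolling_args` and `replay_rolling_max` and the file name are hypothetical, and the small array stands in for the real `sign` and `th` from the script.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd


def dump_rolling_args(values, window, path):
    """Write the inputs of a rolling call to disk before the call is made."""
    with open(path, "wb") as fh:
        pickle.dump({"values": np.asarray(values), "window": int(window)}, fh)


def replay_rolling_max(path):
    """Reload the saved inputs and run only the suspect call."""
    with open(path, "rb") as fh:
        saved = pickle.load(fh)
    return pd.Series(saved["values"]).rolling(saved["window"]).max()


# Stand-in data; in the original script these would be `sign` and `th`.
sign = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
th = 3

dump_path = os.path.join(tempfile.mkdtemp(), "rolling_args.pkl")
dump_rolling_args(sign, th, dump_path)   # call this right before the rolling line
result = replay_rolling_max(dump_path)   # run this in a fresh interpreter
```

Dumping before the call matters because a crash inside the call would otherwise leave nothing on disk to replay.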

@Arsilla
Author

Arsilla commented Mar 9, 2020

Hi,

Sorry to get back to this so late.
I did try to run the function with only the arguments that seemed to make it fail. However, the error didn't occur then. Also, when running the script on all input files, it would sometimes fail on the first image and sometimes on the third. I couldn't figure out which specific inputs triggered the error.

@magratheaner

magratheaner commented Apr 18, 2020

I have encountered the same issue with a time series with shape (1963583, 1). I wanted to compute hour-long windows for a month of data with a resolution of 1 second. (This may sound like a bad idea, and it is, but since rolling() does not offer a stride argument (see Issue #15354) the 'official' way is to compute 99.9% useless windows and throw the ones you don't need away).
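The "compute everything, then discard" approach described above can be sketched as follows, next to a possible alternative that only evaluates the windows that are kept. Both function names are hypothetical. The alternative assumes NumPy 1.20 or later for `sliding_window_view`, which is newer than the versions discussed in this thread.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def strided_rolling_min(series, window, stride):
    """Compute every window with pandas, then keep every `stride`-th full one."""
    full = series.rolling(window).min()
    # The first full window ends at position window - 1.
    return full.iloc[window - 1 :: stride]


def strided_min_numpy(values, window, stride):
    """Evaluate only the kept windows, using a strided view (no copy of the input)."""
    windows = sliding_window_view(np.asarray(values), window)[::stride]
    return windows.min(axis=1)


series = pd.Series(np.arange(10, dtype=float))
via_pandas = strided_rolling_min(series, window=3, stride=3)
via_numpy = strided_min_numpy(series.to_numpy(), window=3, stride=3)
```

The NumPy variant does not reproduce pandas' NaN handling or `min_periods`, so it is only a drop-in for clean, fixed-size windows.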

Running df.rolling(3600).min() 50 times filled up the RAM bit by bit until the IPython kernel crashed or gave a MemoryError. df.rolling(3600).median() was no problem, memory usage stayed the same which suggests it's not a problem with sorting the window data.
I tried adding the arguments min(raw=True, engine='numba') to change the underlying implementation but it still crashed.

I hope this is enough info to reproduce the error, but I guess it is very dependent on the system after all.

pandas: 1.0.3
python: 3.7.7
ipython: 7.13.0

@s-scherrer
Contributor

s-scherrer commented Apr 20, 2020

I can reproduce this behaviour with this code snippet:

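import numpy as np
import pandas as pd
import psutil

# Name of the rolling operation to test: "min", "max", "median" or "mean".
operation = "min"

df = pd.DataFrame(np.random.randn(int(1e7), 1))

for i in range(10):
    # Index 2 of virtual_memory() is the system-wide memory usage in percent.
    print(f"{i}: Memory usage: {psutil.virtual_memory()[2]}%")
    getattr(df.rolling(3600), operation)()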

When using median or mean as operation, memory usage stayed constant:

0: Memory usage: 39.3%
1: Memory usage: 39.0%
2: Memory usage: 39.0%
3: Memory usage: 39.0%
4: Memory usage: 39.1%
5: Memory usage: 39.1%
6: Memory usage: 39.1%
7: Memory usage: 39.1%
8: Memory usage: 39.1%
9: Memory usage: 39.1%

With min or max, memory usage grows considerably:

0: Memory usage: 39.1%
1: Memory usage: 41.7%
2: Memory usage: 44.4%
3: Memory usage: 47.0%
4: Memory usage: 49.6%
5: Memory usage: 52.2%
6: Memory usage: 54.9%
7: Memory usage: 57.5%
8: Memory usage: 60.1%
9: Memory usage: 62.8%
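The percentages above are system-wide, so other processes can influence them. A per-process check is sketched below using the standard-library `tracemalloc` module. The helper `traced_growth` and the two stand-in functions are hypothetical. `tracemalloc` only sees allocations made through allocators it tracks, so whether it would have reported this particular leak is an open assumption.

```python
import tracemalloc


def traced_growth(func, repeats=5):
    """Return the traced memory size in bytes after each of `repeats` calls."""
    tracemalloc.start()
    try:
        sizes = []
        for _ in range(repeats):
            func()
            current, _peak = tracemalloc.get_traced_memory()
            sizes.append(current)
    finally:
        tracemalloc.stop()
    return sizes


_leaked = []


def leaky():
    # Stand-in for a leaking call: every call keeps 1 MB alive.
    _leaked.append(bytearray(1_000_000))


def clean():
    # Stand-in for a well-behaved call: the buffer is freed on return.
    bytearray(1_000_000)


leaky_sizes = traced_growth(leaky)
clean_sizes = traced_growth(clean)
```

To apply this to the snippet above, one would pass something like `lambda: df.rolling(3600).min()` as `func` and look for a steady increase across calls.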

Output of pd.show_versions():

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.8.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.6.2-1-default
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.3
numpy            : 1.18.2
pytz             : 2019.3
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 46.0.0
Cython           : 0.29.15
pytest           : 5.4.1
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.1
IPython          : 7.13.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : None
matplotlib       : 3.2.0
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : 5.4.1
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None
numba            : None

Edit:
I also tried to use a frequency as rolling window argument, like this:

import numpy as np
import pandas as pd
import psutil

index = pd.date_range('2020-01-01 00:00:00', '2020-04-01 00:00:00', freq='S')
data = np.random.randn(len(index), 1)
df = pd.DataFrame(data, index)

for i in range(5):
    print(f"{i}: Memory usage: {psutil.virtual_memory()[2]}%") 
    df.rolling('H').min()

In this case the memory does not grow, so using 'H' instead of 3600 would be a quick fix for you, @magratheaner.

0: Memory usage: 38.3%
1: Memory usage: 38.4%
2: Memory usage: 38.4%
3: Memory usage: 38.4%
4: Memory usage: 38.4%

This suggests that the issue is somewhere in _roll_min_max_fixed in pandas/_libs/window/aggregations.pyx.
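For readers unfamiliar with what a fixed-window rolling minimum computes, here is a minimal pure-Python sketch based on a monotonic deque. It is not the code from aggregations.pyx. The name `rolling_min_fixed` is hypothetical, and the sketch assumes the input contains no NaN values.

```python
import math
from collections import deque


def rolling_min_fixed(values, window):
    """Rolling minimum over a fixed number of observations, in O(n) total."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    candidates = deque()  # indices whose values increase from front to back
    for i, value in enumerate(values):
        # Drop candidates that can never be the minimum again.
        while candidates and values[candidates[-1]] >= value:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it falls out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        # Positions before the first full window have no result.
        out.append(values[candidates[0]] if i >= window - 1 else math.nan)
    return out


data = [3, 1, 4, 1, 5, 9, 2, 6]
mins = rolling_min_fixed(data, window=3)
```

A reference like this can be used to cross-check the values of `rolling(n).min()` on small inputs, independently of any memory behaviour.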

@magratheaner

@s-scherrer Perfect, thank you. 'H' seems to be working nicely
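One caveat when swapping an integer window for a time-based one: the defaults for `min_periods` differ, so the first rows of the result are not the same. The sketch below uses a scaled-down stand-in (a 5-second window on 1-second data instead of 'H' and 3600) and passes `pd.Timedelta` objects rather than frequency strings, on the assumption that this keeps it independent of frequency-alias spelling across pandas versions.

```python
import numpy as np
import pandas as pd

n = 20
index = pd.date_range("2020-01-01", periods=n, freq=pd.Timedelta(seconds=1))
series = pd.Series(np.random.default_rng(0).standard_normal(n), index=index)

# Time-based window: min_periods defaults to 1, so early rows get a value.
by_time = series.rolling(pd.Timedelta(seconds=5)).min()

# Integer window with min_periods=1 matches the time-based result
# on a regular 1-second index.
by_count = series.rolling(5, min_periods=1).min()

# Integer window with defaults: the first window - 1 rows are NaN.
by_count_default = series.rolling(5).min()

same = by_time.equals(by_count)
n_missing = int(by_count_default.isna().sum())
```

If the leading partial windows are not wanted, passing `min_periods` explicitly to the time-based call should restore the integer-window behaviour.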

s-scherrer added a commit to s-scherrer/pandas that referenced this issue Apr 21, 2020
This fixes at least the reproducible part of pandas-dev#30726, however, I am not
totally sure what is going on here.
Tests have shown that there are two solutions that avoid growing memory
usage:

- pass memoryviews (float64_t[:]) instead of ndarray[float64_t]
- remove starti and endi as arguments to _roll_min_max_fixed

This commit implements both.
s-scherrer added a commit to s-scherrer/pandas that referenced this issue Apr 23, 2020
@jreback jreback added Performance Memory or execution speed performance Window rolling, ewma, expanding labels Apr 24, 2020
@jreback jreback added this to the 1.1 milestone Apr 24, 2020
@hroff-1902

hroff-1902 commented Apr 26, 2020

Hi there,

I see you have a fix for this, but the issue has been added to the 1.1 milestone, which is August 1st. Will it not be fixed until the 1.1 release in August? If the fix will land sooner, what was the reason for adding it to the 1.1 milestone?

Do you have a priority or severity label? Why is this (and some similar issues; there are also duplicates where other people reported the same thing, see #32266 for example) not marked as a high-priority/urgent bug? Do you understand that many apps all around the world have been crashing since the release of pandas 1.0, because many libraries use rolling.min/max? I wonder how this is managed...

@simonjayhawkins simonjayhawkins modified the milestones: 1.1, 1.0.4 May 26, 2020