Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime.timedelta not picklable if unpicklable == False #444

Open
jrobbins-LiveData opened this issue May 21, 2023 · 8 comments
Open

datetime.timedelta not picklable if unpicklable == False #444

jrobbins-LiveData opened this issue May 21, 2023 · 8 comments
Labels

Comments

@jrobbins-LiveData
Copy link

>>> import datetime
>>> import jsonpickle

>>> delta = datetime.timedelta(days=3)

>>> jsonpickle.encode(delta, unpicklable=True)
'{"py/reduce": [{"py/type": "datetime.timedelta"}, {"py/tuple": [3, 0, 0]}]}'

>>> jsonpickle.encode(delta, unpicklable=False)
'null'

But, for example, other "lossy" one-way encodings are supported with unpicklable True or False:

>>> tpl = (3, 0, 0)
>>> jsonpickle.encode(tpl, unpicklable=True)
'{"py/tuple": [3, 0, 0]}'
>>> jsonpickle.encode(tpl, unpicklable=False)
'[3, 0, 0]'
@Theelx
Copy link
Contributor

Theelx commented May 22, 2023

That's very interesting, I have no clue what's happening here. I'll try and investigate it soon. Is this crucial for your use-case, or can it wait a bit? I have a few other projects I'm working on plus I'm moving on Sunday, so I'm a bit busy currently.

@jrobbins-LiveData
Copy link
Author

It can definitely wait. I really appreciate your asking! Good luck with the move -- that's always a big project!!

@Theelx Theelx added the bug label May 23, 2023
@Theelx
Copy link
Contributor

Theelx commented May 23, 2023

It seems like it was an intentional choice by the maintainers before me to have circular objects become None when pickled with unpickleable=False:

if self.unpicklable:
    ...
else:
    max_reached = self._max_reached()
    in_cycle = _in_cycle(obj, self._objs, max_reached, False)
    if in_cycle:
        # A circular becomes None.
        return None

Found here
While I'd support changing the behavior to be different by default, I'm afraid this would have to wait for jsonpickle 4, as changing this would be breaking backwards compatibility. Does your use case for this require different behavior when unpickleable=False, and if so, would adding an option to the encoder to try and better represent circular objects be useful? The latter option could be backwards-compatible, and could come out with jsonpickle 3.1.0, which I was planning on releasing within a month or two.

@jrobbins-LiveData
Copy link
Author

jrobbins-LiveData commented May 23, 2023

I don't think datetime.timedelta is circular. I stepped through this code into the call to jsonpickle.encode()

from datetime import timedelta
import jsonpickle


def main():
    td = timedelta(seconds=1)
    print(f'{td=}')
    td_json_str = jsonpickle.encode(td, unpicklable=False)
    print(f'{td_json_str=}')


if __name__ == '__main__':
    main()

and the return None is here (line 664):

File: pickler.py
658:         # catchall return for data created above without a return
659:         # (e.g. __getnewargs__ is not supposed to be the end of the story)
660:         if data:
661:             return data
662: 
663:         self._pickle_warning(obj)
664:         return None

I'm wondering if perhaps the issue (no data at line 660) is that timedelta has __reduce__ and __reduce_ex__ but does not have __getstate__?

@jrobbins-LiveData
Copy link
Author

jrobbins-LiveData commented May 23, 2023

I don't understand why only the unpicklable=True case calls __reduce__, whereas unpicklable=False only calls __getstate__, but apparently that is how _flatten_obj_instance was written. Is there a design document or some history to the project which records that decision?

Once there's a release with the fix for #443, I intend to use unpicklable=True (in other words, I only turned to unpicklable=False to try to find a temporary workaround.) But I would have thought that the code to spelunk an object to record its state would perhaps share a common core for both states of unpicklable.

@Theelx
Copy link
Contributor

Theelx commented May 26, 2023

Unfortunately there is no official design document/history to know why these choices were made. I only started working on jsonpickle in late 2020, and jsonpickle started in 2009, so I don't really know the reasoning behind much of the code-base. Maybe @davvid can provide some insights?

My best guess is that since __reduce__ is generally used by the python cPickle module to serialize objects it otherwise couldn't, it has a usage tied to (de)serializing objects, where __getstate__ is used more for debugging purposes. I personally mainly use __getstate__ for debugging, and have never used __reduce__ for anything other than serialization, so maybe the original authors thought of it this way also.

@jrobbins-LiveData
Copy link
Author

I guess my main point is that there are builtin types, like timedelta that simply do not have a __getstate__ (before Python 3.11) but do have a __reduce__. Even with Python 3.11, it isn't clear to me what the purpose of the default object's __getstate__ is. For example

Python 3.11.3 (tags/v3.11.3:f3909b8, Apr  4 2023, 23:49:59) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import timedelta
>>> td = timedelta(seconds=3)
>>> td
datetime.timedelta(seconds=3)
>>> td.__getstate__()
>>> td.__reduce__()
(<class 'datetime.timedelta'>, (0, 3, 0))

It seems to me that jsonpickle should prefer to use __reduce__ if available, but I admit I'm not clear on what the role of these two functions are meant to be. I'm assuming there's some PEP or other document that explains this?

@GaNiziolek
Copy link

Maybe related #447

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants