I wish I could pickle pyecore objects #68

Open
paulhoule opened this issue Aug 22, 2019 · 5 comments

Comments

@paulhoule

I am working with the ISO 20022 e-Repository, which is an Ecore file of about 78 MB of uncompressed XMI.

It seems to work OK, but it takes about 60 seconds to import the XMI file.

It would be much easier to work with this interactively if I could pickle the model, but I have not been able to pickle the models using pickle, cloudpickle, or dill. I assume there is something strange being done by pyecore that is getting in the way.

It would be nice to be able to pickle or otherwise quickly serialize/deserialize models.

@aranega
Member

aranega commented Aug 23, 2019

Hi @paulhoule ,

Indeed, it would be a terrific feature to be able to pickle PyEcore objects. It does not work out of the box, but I haven't investigated why yet. Perhaps it's because there are a lot of "cycles" in the graph, plus the relation from each instance to the metamodel.
I will investigate that; it could indeed be a super feature to have in PyEcore! Thanks a lot for the suggestion.
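As a quick sanity check on the "cycles" hypothesis, here is a minimal sketch using plain Python classes (the `Node` class is a hypothetical stand-in, not a PyEcore class). It suggests that cyclic references on their own are normally handled by pickle, so the cause may lie elsewhere; very deep graphs can still hit the recursion limit, though.

```python
import pickle


class Node:
    """Plain Python stand-in for a model element (not a PyEcore class)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self  # back-reference to the parent creates a cycle
        self.children.append(child)
        return child


root = Node("root")
leaf = root.add(Node("leaf"))

# Round trip through pickle; the parent/child cycle is preserved.
restored = pickle.loads(pickle.dumps(root))
```

After the round trip, `restored.children[0].parent` is the restored root itself, not a second copy.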

@ewoudwerkman

Hi,

I've been dealing with the same issue, as we had pyEcore objects in a Flask file-based session. The first approach was to serialize the model to a string by overriding __getstate__() and __setstate__():

    # Support for pickling when serializing the energy system in a session.
    # By default, the pyEcore classes do not allow simple serialization for session management in Flask.
    # Internally, Flask sessions use pickle to serialize a data structure by means of its __dict__. This does not work here.
    # Furthermore, ESDL can contain cyclic relations. Therefore we serialize to XMI and back if necessary.
    def __getstate__(self):
        state = dict()
        state['energySystem'] = self.to_string()
        return state

    def __setstate__(self, state):
        # re-create the runtime members first, then reload the model from the XMI string
        self.__init__()
        self.load_from_string(state['energySystem'])

where to_string() is defined as:

    def to_string(self):
        # to use strings as resources, we simulate a string as being a URI
        uri = StringURI('to_string.esdl')
        self.resource.save(uri)
        # return the string
        return uri.getvalue()

This allows you to pickle the root element, but not subelements. And of course it is slow, since every pickle/unpickle is a full XMI save/load.
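For readers who want to try the pattern without a PyEcore model at hand, here is a minimal self-contained sketch of the same idea. Everything in it is an assumption for illustration: `ModelHandler` is a hypothetical wrapper class, JSON stands in for XMI, and the lock merely plays the role of a member that pickle cannot handle.

```python
import json
import pickle
import threading


class ModelHandler:
    """Hypothetical wrapper around a model; JSON stands in for XMI here."""

    def __init__(self):
        self.model = {}
        self._lock = threading.Lock()  # unpicklable member, for illustration

    def to_string(self):
        return json.dumps(self.model, sort_keys=True)

    def load_from_string(self, text):
        self.model = json.loads(text)

    def __getstate__(self):
        # Pickle only the serialized text, never the live object graph.
        return {"serialized": self.to_string()}

    def __setstate__(self, state):
        # Rebuild runtime-only members, then reload the model from the text.
        self.__init__()
        self.load_from_string(state["serialized"])


handler = ModelHandler()
handler.model = {"name": "demo", "assets": ["a1", "a2"]}
restored = pickle.loads(pickle.dumps(handler))
```

The trade-off is the one mentioned above: only the object that owns the serialization can be pickled, and each round trip costs a full save and load.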

I also needed support for duplicating an EObject. This is what I did:

from pyecore.ecore import EObject, EAttribute

# add support for shallow copying or cloning an object
# it copies the object's attributes (e.g. clone an object), does only shallow copying
def clone(self):
    """
    Shallow copying or cloning an object
    It only copies the object's attributes (e.g. clone an object)
    Usage object.clone() or copy.copy(object) (as __copy__() is also implemented)
    :param self:
    :return: A clone of the object
    """
    newone = type(self)()
    eclass = self.eClass
    for x in eclass.eAllStructuralFeatures():
        if isinstance(x, EAttribute):
            newone.eSet(x.name, self.eGet(x.name))
    return newone

setattr(EObject, '__copy__', clone)
setattr(EObject, 'clone', clone)

Deep copying is not yet supported, but I think iterating over the references while keeping a dict of the objects already encountered (to handle cyclic references) would do it.

Combining these two approaches (adding __getstate__() / __setstate__() to an EObject) would make it work, I think.
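The deep-copy idea above (a dict of already encountered objects to survive cycles) can be sketched roughly as follows. This is not PyEcore code: `Node`, `copy_tree` and `deep_copy` are hypothetical names, with `children` standing in for containment references and `refs` for cross-references.

```python
class Node:
    """Plain stand-in for a model element (not a PyEcore class)."""

    def __init__(self, name):
        self.name = name
        self.children = []  # containment: owned sub-elements
        self.refs = []      # cross-references: may point anywhere, cycles included


def copy_tree(node, memo):
    """Pass 1: copy the containment tree, recording id(original) -> (original, copy)."""
    if id(node) in memo:
        return memo[id(node)][1]
    duplicate = Node(node.name)
    memo[id(node)] = (node, duplicate)
    for child in node.children:
        duplicate.children.append(copy_tree(child, memo))
    return duplicate


def deep_copy(root):
    memo = {}
    result = copy_tree(root, memo)
    # Pass 2: re-point cross-references at the copies. Targets outside the
    # copied tree are kept as references to the original objects.
    for original, duplicate in memo.values():
        for target in original.refs:
            entry = memo.get(id(target))
            duplicate.refs.append(entry[1] if entry else target)
    return result


root, a, b, outside = Node("root"), Node("a"), Node("b"), Node("outside")
root.children += [a, b]
a.refs += [b, outside]  # a -> b and a -> element outside the tree
b.refs.append(a)        # b -> a closes the cycle
clone = deep_copy(root)
```

Doing the cross-references in a second pass means every target inside the tree already has a copy by the time it is needed, so no recursion through `refs` is required.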

@aranega
Member

aranega commented Oct 3, 2019

@ewoudwerkman Wow, thanks so much for all your tests and work. I will try to work on this soon (it would be really helpful), based on your experiments. In the meantime: a few months ago I created a clone function for the pygeppetto project, which relies on pyecore: openworm/pygeppetto#26 (comment)

This function performs a deep copy, but only for containment references. I think it can be adapted to perform a deep copy of the full graph.

@ewoudwerkman

ewoudwerkman commented Feb 4, 2020

I needed deepcopy support including cross-references and did the following. I have only done some light testing. It might be useful for others, or for inclusion in pyEcore:

import logging

from pyecore.ecore import EObject, EClass, EAttribute, EReference, EStructuralFeature
from pyecore.valuecontainer import EOrderedSet

log = logging.getLogger(__name__)

# add support for shallow copying or cloning an object
# it copies the object's attributes (e.g. clone an object), does only shallow copying
def clone(self):
    """
    Shallow copying or cloning an object
    It only copies the object's attributes (e.g. clone an object)
    Usage object.clone() or copy.copy(object) (as __copy__() is also implemented)
    :param self:
    :return: A clone of the object
    """
    newone = type(self)()
    eclass = self.eClass
    for x in eclass.eAllStructuralFeatures():
        if isinstance(x, EAttribute):
            log.debug("clone: processing attribute {}".format(x.name))
            if x.many:
                eOrderedSet = newone.eGet(x.name)
                for v in self.eGet(x.name):
                    eOrderedSet.append(v)
            else:
                newone.eSet(x.name, self.eGet(x.name))
    return newone

setattr(EObject, '__copy__', clone)
setattr(EObject, 'clone', clone)

"""
Deep copying an EObject.
Does not work yet for copying references from other resources than this one.
"""
def deepcopy(self, memo=None):
    log.debug("deepcopy: processing {}".format(self))
    first_call = False
    if memo is None:
        memo = dict()
        first_call = True
    if self in memo:
        return memo[self]

    copy: EObject = self.clone()
    log.debug("Shallow copy: {}".format(copy))
    eclass: EClass = self.eClass
    for x in eclass.eAllStructuralFeatures():
        if isinstance(x, EReference):
            log.debug("deepcopy: processing reference {}".format(x.name))
            ref: EReference = x
            value: EStructuralFeature = self.eGet(ref)
            if value is None:
                continue
            if ref.containment:
                if ref.many and isinstance(value, EOrderedSet):
                    #clone all containment elements
                    eOrderedSet = copy.eGet(ref.name)
                    for ref_value in value:
                        duplicate = ref_value.__deepcopy__(memo)
                        eOrderedSet.append(duplicate)
                else:
                    copy.eSet(ref.name, value.__deepcopy__(memo))
            #else:
            #    # no containment relation, but a reference
            #    # this can only be done after a full copy
            #    pass
    # now copy should a full copy, but without cross references

    memo[self] = copy

    if first_call:
        log.debug("copying references")
        for k, v in memo.items():
            eclass: EClass = k.eClass
            for x in eclass.eAllStructuralFeatures():
                if isinstance(x, EReference):
                    log.debug("deepcopy: processing x-reference {}".format(x.name))
                    ref: EReference = x
                    orig_value: EStructuralFeature = k.eGet(ref)
                    if orig_value is None:
                        continue
                    if not ref.containment:
                        opposite = ref.eOpposite
                        if opposite and opposite.containment:
                            # do not handle eOpposite relations, they are handled automatically in pyEcore
                            continue
                        if x.many:
                            eOrderedSet = v.eGet(ref.name)
                            for orig_ref_value in orig_value:
                                try:
                                    copy_ref_value = memo[orig_ref_value]
                                except KeyError:
                                    log.warning(
                                        f'Cannot find reference of type {orig_ref_value.eClass.name} '
                                        f'for reference {k.eClass.name}.{ref.name} in deepcopy memo, using original')
                                    copy_ref_value = orig_ref_value
                                eOrderedSet.append(copy_ref_value)
                        else:
                            try:
                                copy_value = memo[orig_value]
                            except KeyError:
                                log.warning(
                                    f'Cannot find reference of type {orig_value.eClass.name} of '
                                    f'reference {k.eClass.name}.{ref.name} in deepcopy memo, using original')
                                copy_value = orig_value
                            v.eSet(ref.name, copy_value)
    return copy
setattr(EObject, '__deepcopy__', deepcopy)
setattr(EObject, 'deepcopy', deepcopy)

This approach also allows you to use copy.deepcopy() to deepcopy an EObject.
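For context, this is how the standard `copy` module dispatches to these hooks, shown on a plain class rather than an EObject (the `Element` class is a hypothetical stand-in). One detail worth knowing: `copy.deepcopy()` passes a memo dict to `__deepcopy__`, which is empty on the outermost call rather than None.

```python
import copy


class Element:
    """Plain stand-in class (not a PyEcore EObject)."""

    def __init__(self, name):
        self.name = name
        self.partner = None  # may point back at another Element (a cycle)

    def __copy__(self):
        # Shallow copy: new object, same referenced partner.
        duplicate = Element(self.name)
        duplicate.partner = self.partner
        return duplicate

    def __deepcopy__(self, memo):
        # Register the copy before recursing, so that a cycle resolves to
        # the copy already in progress instead of recursing forever.
        duplicate = Element(self.name)
        memo[id(self)] = duplicate
        duplicate.partner = copy.deepcopy(self.partner, memo)
        return duplicate


a, b = Element("a"), Element("b")
a.partner, b.partner = b, a  # two elements referencing each other

shallow = copy.copy(a)       # dispatches to Element.__copy__
deep = copy.deepcopy(a)      # dispatches to Element.__deepcopy__
```

Here `shallow.partner` is still the original `b`, while `deep.partner` is a new object whose own `partner` points back at `deep`.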

@aranega
Member

aranega commented Feb 6, 2020

@ewoudwerkman That's really awesome! I will try it right away, thanks a lot! That's clearly something I would be interested in having directly in PyEcore. It will require a lot of tests though.
