Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attributes lost from Dict child classes #101

Open
MisterVladimir opened this issue Aug 20, 2018 · 11 comments
Open

Attributes lost from Dict child classes #101

MisterVladimir opened this issue Aug 20, 2018 · 11 comments

Comments

@MisterVladimir
Copy link
Contributor

MisterVladimir commented Aug 20, 2018

When I implement my own class that inherits from Dict, I get an error accessing the attributes of the nested items. That is, anything besides the top node does not contain my class' attributes.

The class:

from six import string_types
from collections import Iterable
from addict import Dict


def isiterable(arg):
    # copied from
    # https://stackoverflow.com/questions/1055360/how-to-tell-a-variable-is-iterable-but-not-a-string/44328500#44328500
    return isinstance(arg, Iterable) and not isinstance(arg, string_types)


class IndexedDict(Dict):
    """
    Allows setting and getting keys/values by passing in the key index. 

    We cannot use an integer key to set a value to None. The workaround is to
    use a slice:
    >>> d = IndexedDict()
    >>> d['a'] = 0
    >>> d.iloc(slice(1), [None])
    >>> d
    {'a': None}
    """
    def _get_with_int(self, key, value):
        return self[key]

    def _get_with_slice(self, key, value):
        return [self[k] for k in key]

    def _set_with_int(self, key, value):
        self[key] = value

    def _set_with_slice(self, key, value):
        for k, v in zip(key, value):
            self[k] = v

    def iloc(self, i, value=None):
        # throws IndexError if the key has not already been set
        keys = list(self.keys())[i]
        method_dict = {(True, False): self._get_with_int,
                       (True, True): self._get_with_slice,
                       (False, False): self._set_with_int,
                       (False, True): self._set_with_slice}

        method = method_dict[(value is None, 
                              isiterable(keys) and isiterable(value))]
        return method(keys, value)

The errors come up when I do:

d = IndexedDict()
d['a']['b']['c'] = 5
type(d['a']['b']) # returns Dict
# set value of 'c' key to 1
# tries to call iloc() on Dict
d['c'].iloc(0, 1)

I think this can be fixed by changing __getitem__ to return a instance of self.__class__ instead of Dict.

def __getitem__(self, name):
    if name not in self:
        return self.__class__(__parent=self, __key=name)
    return super().__getitem__(name)

or maybe instead of using the if/then statement in __getitem__ we could just add a __missing__ method and get rid of __getitem__ altogether. Besides the if/then statement, as it stands Dict's __getitem__ is the same as dict.__getitem__

def __missing__(self, name):
    return self.__class__(__parent=self, __key=name)

instead of

def __getitem__(self, name):
    if name not in self:
        return Dict(__parent=self, __key=name)
    return super().__getitem__(name)

To be consistent, I'd also change any references to super(Dict, self) to super() e.g. in __deepcopy__.

@mewwts
Copy link
Owner

mewwts commented Aug 21, 2018

Hey @MisterVladimir! Thanks for contributing.

I'm happy to support your usecase and your suggestions look sound. Would you be open to trying these changes in a pull request and see if the tests pass? You can remove superfluous test cases, if any.

@MisterVladimir
Copy link
Contributor Author

Thanks @mewwts. I'm working on it now, namely, on getting test_addict.Tests to run the same tests on Dict, or a child class. Expect a pull request this evening.

MisterVladimir pushed a commit to MisterVladimir/addict that referenced this issue Aug 21, 2018
Previously, when child classes were instantiated, all their nodes -- besides the top node -- remained instances of the parent (Dict) class.

Added tests appropriate for these changes. All tests passed.

See mewwts#101
@MisterVladimir
Copy link
Contributor Author

MisterVladimir commented Aug 22, 2018

I'm a little new to Travis, tests, etc., so apologies for polluting the commit history with incremental bug fixes, but it looks like tests pass on my machine with "python setup.py test" and with Travis. The second commit I made, which changed the test script, built with Python 3.6+ but not Python 2, probably because I used "super()" instead of the old "super(cls, obj)". That's been remedied in subsequent commits. Looks like the package builds with Python 2 and 3.

mewwts pushed a commit that referenced this issue Aug 23, 2018
* Fixed Subclassing Dict Issues (Issue #101)

Previously, when child classes were instantiated, all their nodes -- besides the top node -- remained instances of the parent (Dict) class.

Added tests appropriate for these changes
@mewwts
Copy link
Owner

mewwts commented Aug 23, 2018

No worries @MisterVladimir, I've wrestled countless times with that!

I've gone ahead and merged #102.

Thanks!

@mewwts
Copy link
Owner

mewwts commented Aug 23, 2018

I've released 2.2.0 as well https://github.com/mewwts/addict/releases/tag/v2.2.0.

Cheers!

@mewwts mewwts closed this as completed Aug 23, 2018
@irvinlim
Copy link

irvinlim commented Aug 23, 2018

I've just ran into several problems with this change, which caused breaking changes with my project, though it is understandable why they occurred after reading this issue. I just wanted to say that existing projects could break due to various assumptions we used to have before this change, so perhaps this might be a heads-up for others who also found their projects breaking from today onwards.


Firstly, I have a base class which extends Dict and some subclasses, like as follows:

from addict import Dict


class BaseStrategy(Dict):
	name = None

	def __init__(self):
		super(BaseStrategy, self).__init__()

		# This line breaks because the class variable
		# name is the same as the instance variable name.
		self.name = self.__class__.name


class ExampleStrategy(BaseStrategy):
	name = 'EXAMPLE'


example = ExampleStrategy()

The error we get:

Traceback (most recent call last):
  File "poc/setattr.py", line 19, in <module>
    example = ExampleStrategy()
  File "poc/setattr.py", line 12, in __init__
    self.name = self.__class__.name
  File "/usr/local/lib/python2.7/dist-packages/addict/addict.py", line 27, in __setattr__
    "'{0}' is read-only".format(name))
AttributeError: 'Dict' object attribute 'name' is read-only

As I found out, this is bad practice so I was okay to rectify my code accordingly and rename the class variable to be a different name from my member variable.


Secondly, perhaps slightly more concerning, is that the new __missing__ method creates a new instance without passing the correct arguments to __init__. This could break projects which assumes that addict returns {} for missing attributes for subclasses of Dict, or if we assume that the parent class is addict.Dict and therefore we can just initialise member variables through chaining:

from addict import Dict


class CustomDict(Dict):
	def __init__(self, value):
		super(CustomDict, self).__init__()

		# The problem is that I do not initialize self.my as a Dict(),
		# since I previously assumed that this would work.
		self.my.custom.value = value


class CustomDict2(Dict):
	def __init__(self, value, **kwargs):
		super(CustomDict2, self).__init__()
		self.my.custom.value = value


class CustomDict3(Dict):
	def __init__(self, value=None, **kwargs):
		super(CustomDict3, self).__init__()
		self.my.custom.value = value


# Try running any of the following lines on its own
# custom_dict = CustomDict('hello world')
# custom_dict2 = CustomDict2('hello world')
# custom_dict3 = CustomDict3('hello world')

# The correct fix would be to initialize self.my as a Dict() first.


# This also breaks when I try to access a non-existent attribute:
class CustomDict4(Dict):
	def __init__(self, value, **kwargs):
		super(CustomDict4, self).__init__()
		self.value = value


custom_dict4 = CustomDict4(1)
assert not custom_dict4.missing_attr

# The correct fix would be to do:
# assert 'missing_attr' not in custom_dict4

Some of the errors:

Traceback (most recent call last):
  File "poc/getitem.py", line 26, in <module>
    custom_dict = CustomDict('hello world')
  File "poc/getitem.py", line 10, in __init__
    self.my.custom.value = value
  File "/usr/local/lib/python2.7/dist-packages/addict/addict.py", line 62, in __getattr__
    return self.__getitem__(item)
  File "/usr/local/lib/python2.7/dist-packages/addict/addict.py", line 65, in __missing__
    return self.__class__(__parent=self, __key=name)
TypeError: __init__() got an unexpected keyword argument '__key'
Traceback (most recent call last):
  File "poc/getitem.py", line 39, in <module>
    assert not custom_dict4.missing_attr
  File "/usr/local/lib/python2.7/dist-packages/addict/addict.py", line 62, in __getattr__
    return self.__getitem__(item)
  File "/usr/local/lib/python2.7/dist-packages/addict/addict.py", line 65, in __missing__
    return self.__class__(__parent=self, __key=name)
TypeError: __init__() takes exactly 2 arguments (1 given)

Although this was a minor version bump, it may have caused quite drastic inconvenience for others as well. I'm not really sure what recommendation to give, but just wanted to mention this.

@mewwts
Copy link
Owner

mewwts commented Aug 23, 2018

Thank you so much @irvinlim for pointing this out!

First of all, sorry for breaking this for you. I was unaware of the implications this would have for clients inheriting from addict.Dict and the test cases did not cover the issue. Please pin addict to a version that works for you while we figure out what's sensible here.

If I understand correctly, @MisterVladimir's test-case is so that for a subclass of addict.Dict, getting a missing variable on the subclass should return an instance of that subclass.

Before this change setting a missing attribute on the subclass would return a addict.Dict instance.This was the functionality you were depending on @irvinlim.

The problem arises when the subclass defines its own __init__ function with custom parameters. The following line will try to call __init__ on the class without knowing its parameters.

addict/addict/addict.py

Lines 64 to 65 in cf29d47

def __missing__(self, name):
return self.__class__(__parent=self, __key=name)

As far as I can see it's not possible to support both these usecases at the same time. On the one hand, I feel like what we just merged is the more expected functionality, however inheriting from addict and accepting custom parameters in __init__ will break if you don't implement __missing__ on your own. So it's kind of a halfway there solution.

On the other hand, this probably breaks the appeal of inheriting from addict in config classes and the like.

I'm torn, but I'm open to reverting. I need to think some more about it.

Do you guys have any other thoughts on this @irvinlim and @MisterVladimir?

@mewwts mewwts reopened this Aug 23, 2018
@irvinlim
Copy link

I can see where @MisterVladimir is coming from with the fix, as he expects the descendant nodes to be the same class as the object itself, meanwhile I expected my Dict subclasses to inherit the dot-notation elegance from addict.

Actually, I don't think there's a right answer to the behaviour when subclassing addict.Dict, because I suppose you wrote this library to allow elegant quick dot-notation access, and not meant as a base class of sorts. Because you did not design for this, I suppose any breaking changes might be reasonable.

My only gripe would be that error messages are now much more cryptic, due to the fact that users unaware of the internal implementation would see messages like this:

TypeError: __init__() got an unexpected keyword argument '__key'

when in fact the error is due to a missing attribute or method, perhaps due to a typo. In fact, a KeyError thrown might be more meaningful, but it doesn't help that it's still another backwards-incompatible change for those expected it to return a new, empty Dict().

I am perfectly okay with this change actually, because the breaking changes didn't take me very long to fix, but I just wonder how many others may be depending on the same functionality as I did.

@MisterVladimir
Copy link
Contributor Author

MisterVladimir commented Aug 23, 2018

Thanks for the lively discussion, everyone. I agree that the most concerning issue is

the new '''missing''' method creates a new instance without passing the correct arguments to '''init'''.

Basically v2.2.0 throws an error if the child class' __init__ has "regular" arguments and sets keyword arguments to the default values. Would this be too ugly a fix for your problem, @irvinlim's?

def __missing__(self, key):
    try:
        return self.__class__(__parent=self, __key=name)
    except: #maybe limit to TypeError and RecursionError
        return Dict(__parent=self, __key=name)

I never realized one would want to implement an __init__ in child classes. I see a Dict subclass still as a data structure whose instances don't reach back and modify parent nodes. Specifically, my use case is to store info extracted from a zip archive, whose files often have silly naming schemes. It helps to access keys via an integer index. Perhaps down the line I'd add a method for recursively writing the dict to an hdf5 file. Although I too enjoy the convenience of the dot notation -- which I preserve in my subclasses -- I'm open to better data types for storing this info. Which leads me to:

On the other hand, this probably breaks the appeal of inheriting from addict in config classes and the like.

@mewwts, would you be so kind as to link to an example? I actually need to make a config file for my own program, so a pointer would be a big help to this n00b. : )

@mewwts
Copy link
Owner

mewwts commented Nov 8, 2018

Hey @MisterVladimir.

I thought of a config class just as something like

class Config:
  s=13
  n=12

which you can then pass around.

I'm closing this now as it doesn't seem to be a problem for more users. @irvinlim thanks for using addict and reporting this.

@mewwts
Copy link
Owner

mewwts commented Feb 11, 2019

I'm reopening this as several users has hinted towards issues with the new implementation.

Repository owner deleted a comment from leggod Feb 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants