[Feature Request] Support json read for Categorical with tuple entries. #357

Open
charlesjhill opened this issue Apr 10, 2024 · 2 comments · May be fixed by #346

Comments

@charlesjhill

Version & copy-pastable script

Version:

from importlib.metadata import version
import ConfigSpace
version("ConfigSpace")
# '0.7.1'
ConfigSpace.__version__
# '0.6.1'

Copy-pastable script:

from ConfigSpace import ConfigurationSpace, Categorical
from ConfigSpace.read_and_write.json import read, write

cs = ConfigurationSpace()
cs.add_hyperparameter(Categorical("dims", [(1e-1, 1e-2), 1e-2, (1e-2, 1e-4), 1e-4]))
json_repr = write(cs)
cs_2 = read(json_repr)  # Crashes

This works:

from ConfigSpace import ConfigurationSpace, Categorical

cs = ConfigurationSpace()
cs.add_hyperparameter(Categorical("dims", [(1e-1, 1e-2), 1e-2, (1e-2, 1e-4), 1e-4]))
cs.sample_configuration()
# Configuration(values={
#  'dims': 0.01,
# })

But a round-trip to JSON doesn't.

from ConfigSpace.read_and_write.json import read, write
json_repr = write(cs)
# {
#   "hyperparameters": [
#     {
#       "name": "dims",
#       "type": "categorical",
#       "choices": [
#         [
#           0.1,
#           0.01
#         ],
#         0.01,
#         [
#           0.01,
#           0.0001
#         ],
#         0.0001
#       ],
#       "default": [
#         0.1,
#         0.01
#       ],
#       "weights": null
#     }
#   ],
#   "conditions": [],
#   "forbiddens": [],
#   "python_module_version": "0.6.1",
#   "json_format_version": 0.4
# }
read(json_repr)  # <-- Crashes

The stacktrace is:

File ~/envs/test_env/lib/python3.11/site-packages/ConfigSpace/read_and_write/json.py:552, in _construct_hyperparameter(hyperparameter)
    551 if hp_type == "categorical":
--> 552     return CategoricalHyperparameter(
    553         name=name,
    554         choices=hyperparameter["choices"],
    565         default_value=hyperparameter["default"],
    566         weights=hyperparameter.get("weights"),
    567     )

File ~/envs/test_env/lib/python3.11/site-packages/ConfigSpace/hyperparameters/categorical.pyx:72, in ConfigSpace.hyperparameters.categorical.CategoricalHyperparameter.__init__()

File ~/envs/test_env/lib/python3.11/collections/__init__.py:599, in Counter.__init__(self, iterable, **kwds)
    588 '''Create a new, empty Counter object.  And if given, count elements
    589 from an input iterable.  Or, initialize the count from another mapping
    590 of elements to their counts.
   (...)
    596
    597 '''
    598 super().__init__()
--> 599 self.update(iterable, **kwds)

File ~/envs/test_env/lib/python3.11/collections/__init__.py:690, in Counter.update(self, iterable, **kwds)
    688             super().update(iterable)
    689     else:
--> 690         _count_elements(self, iterable)
    691 if kwds:
    692     self.update(kwds)

TypeError: unhashable type: 'list'
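
The same failure can be reproduced with the standard library alone. This is a minimal sketch that assumes, as the traceback suggests, that the constructor passes the choices to collections.Counter; it does not use ConfigSpace itself.

```python
import json
from collections import Counter

choices = [(1e-1, 1e-2), 1e-2, (1e-2, 1e-4), 1e-4]

# JSON has no tuple type: tuples are written as arrays and read back as lists.
restored = json.loads(json.dumps(choices))
print(restored)  # [[0.1, 0.01], 0.01, [0.01, 0.0001], 0.0001]

# Counter hashes every element; the original tuples are hashable ...
Counter(choices)

# ... but the lists that come back from JSON are not.
try:
    Counter(restored)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```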

In short, the JSON writer converts the tuples to lists, and the reader does not convert them back. The CategoricalHyperparameter constructor requires choices to be hashable, and lists are not. I can think of a few possible solutions; none would be very intensive.

  1. Tell users to convert their "exotic" types to strings and parse them back in their own code.
  2. Expose the cls kwarg from json.{dumps,loads} in read_and_write.json.{read,write} so users can pass a custom JSONEncoder or JSONDecoder, respectively. This would also allow for serialization/deserialization of other types of interest.
  3. Modify ConfigSpace.read_and_write.json._construct_hyperparameter to convert the list-typed elements of hyperparameter[{"choices", "sequence"}] and hyperparameter["default"] to a tuple, if needed, for categorical and ordinal hyperparameters.
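
A rough sketch of what option 3 could look like, operating on the plain dictionary that comes out of the JSON file. The helper names lists_to_tuples and restore_hashable are hypothetical and are not part of ConfigSpace; the keys "choices", "sequence" and "default" are the ones named in the option above.

```python
from typing import Any


def lists_to_tuples(value: Any) -> Any:
    """Recursively turn lists into tuples so the result is hashable."""
    if isinstance(value, list):
        return tuple(lists_to_tuples(v) for v in value)
    return value


def restore_hashable(hyperparameter: dict[str, Any]) -> dict[str, Any]:
    """Return a copy whose choices/sequence entries and default are hashable.

    Hypothetical helper, meant to run before the hyperparameter is constructed.
    """
    fixed = dict(hyperparameter)
    for key in ("choices", "sequence"):
        if key in fixed:
            # Keep the outer container a list; only its elements are converted.
            fixed[key] = [lists_to_tuples(v) for v in fixed[key]]
    if "default" in fixed:
        fixed["default"] = lists_to_tuples(fixed["default"])
    return fixed


# The categorical entry as it appears in the JSON output above.
entry = {
    "name": "dims",
    "type": "categorical",
    "choices": [[0.1, 0.01], 0.01, [0.01, 0.0001], 0.0001],
    "default": [0.1, 0.01],
    "weights": None,
}
fixed_entry = restore_hashable(entry)
```

Since a list can never be a valid choice (it is not hashable), converting every list-typed element should not change the meaning of a well-formed file, but it cannot tell a tuple from a list that the user originally wrote by mistake.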

The third option is the hack I'm doing for my use-case, but the second seems more robust and forward-looking. I'm filing the issue because I don't like option 1, of course :)
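
For option 2, here is a sketch of an encoder/decoder pair that could be passed through a cls kwarg if read and write forwarded it to json.dumps and json.loads (they do not at the time of writing; that is the proposal). TupleEncoder, TupleDecoder and the "__tuple__" marker are hypothetical names. One wrinkle: JSONEncoder.default() is never called for tuples, because the json module already serializes them as arrays, so the sketch tags tuples before encoding instead.

```python
import json
from typing import Any

TUPLE_TAG = "__tuple__"  # hypothetical marker key, not part of ConfigSpace's format


def _tag_tuples(value: Any) -> Any:
    """Replace every tuple with a tagged dict so the type survives JSON."""
    if isinstance(value, tuple):
        return {TUPLE_TAG: [_tag_tuples(v) for v in value]}
    if isinstance(value, list):
        return [_tag_tuples(v) for v in value]
    if isinstance(value, dict):
        return {k: _tag_tuples(v) for k, v in value.items()}
    return value


class TupleEncoder(json.JSONEncoder):
    # default() is not consulted for tuples, so tag them before encoding.
    def iterencode(self, o: Any, _one_shot: bool = False):
        return super().iterencode(_tag_tuples(o), _one_shot)


class TupleDecoder(json.JSONDecoder):
    def __init__(self, **kwargs: Any) -> None:
        kwargs.setdefault("object_hook", self._untag)
        super().__init__(**kwargs)

    @staticmethod
    def _untag(obj: dict) -> Any:
        # object_hook runs innermost-first, so nested tuples are already restored.
        if set(obj) == {TUPLE_TAG}:
            return tuple(obj[TUPLE_TAG])
        return obj


choices = [(1e-1, 1e-2), 1e-2, (1e-2, 1e-4), 1e-4]
text = json.dumps({"choices": choices}, cls=TupleEncoder)
restored = json.loads(text, cls=TupleDecoder)["choices"]
```

The trade-off is that the file now contains tagged objects instead of plain arrays, so it is only readable by code that knows about the marker.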

I can make a PR if there's interest. Cheers~

@eddiebergman
Contributor

eddiebergman commented Apr 16, 2024

Hi there,

Yeah, unfortunately JSON doesn't preserve tuple types during serialization. I'm kind of surprised that you could use tuples in a Categorical in the first place.

Some updates from #346 which might be relevant:

  • I've upgraded the JSON read/write to accept arbitrary user-defined encoders/decoders (now accessible through space.to_json(path) and ConfigurationSpace.from_json(path)). You can override the decoder for a specific "type" that is written into the JSON output. Here is an example based on your helpful reproducible example:
    from typing import Any
    
    from ConfigSpace import CategoricalHyperparameter, ConfigurationSpace
    from ConfigSpace.read_and_write.dictionary import _decode_categorical
    
    cs = ConfigurationSpace(
        {
            "dims": [(1e-1, 1e-2), 1e-2, (1e-2, 1e-4), 1e-4],
        },
    )
    cs.to_json("k.json")
    
    # !cat k.json
    # {"name": null, "hyperparameters": [{"type": "categorical", "name": "dims", "choices": [[0.1, 0.01], 0.01, [0.01, 0.0001], 0.0001], "weights": null, "default_value": [0.1, 0.01], "meta": null}], "conditions": [], "forbiddens": [], "python_module_version": "0.7.2", "format_version": 0.4}
    
    
    def my_categorical_decoder(
        item: dict[str, Any],
        cs: ConfigurationSpace,
        decode,
    ) -> CategoricalHyperparameter:
        if item["name"] == "dims":
            # Convert things to tuples
            item["choices"] = [
                tuple(x) if isinstance(x, list) else x for x in item["choices"]
            ]
            dv = item["default_value"]
            item["default_value"] = tuple(dv) if isinstance(dv, list) else dv
        return _decode_categorical(item, cs, decode)
    
    
    custom_decoders = {"hyperparameters": {"categorical": my_categorical_decoder}}
    space = ConfigurationSpace.from_json("k.json", decoders=custom_decoders)
    print(space)
    # Configuration space object:
    #   Hyperparameters:
    #     dims, Type: Categorical, Choices: {(0.1, 0.01), 0.01, (0.01, 0.0001), 0.0001}, Default: (0.1, 0.01)
  • As for my surprise that tuples worked at all: this will be officially supported in a follow-up PR, feat(Hyperparameters): Allow arbitrary objects in category ordinal #359

We hope to release this sometime next week :)

@eddiebergman eddiebergman linked a pull request Apr 16, 2024 that will close this issue
@charlesjhill
Author

Awesome! Being able to pass a decoder that targets a particular key should make the process less prone to side effects than a JSONDecoder, which must decide which transformations to apply based on the type of the input alone.
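
To illustrate the difference, here is a toy sketch of dispatching on the "type" field. It is not ConfigSpace's implementation: decode_items, passthrough, tuple_choices and the "constant" item with a list value are all made up for the example.

```python
from typing import Any, Callable

Item = dict[str, Any]
Decoder = Callable[[Item], Item]


def decode_items(items: list[Item], decoders: dict[str, Decoder]) -> list[Item]:
    """Dispatch each item to the decoder registered for its "type" field."""
    return [decoders[item["type"]](item) for item in items]


def passthrough(item: Item) -> Item:
    return item


def tuple_choices(item: Item) -> Item:
    """Convert list-typed choices to tuples, leaving everything else alone."""
    choices = [tuple(c) if isinstance(c, list) else c for c in item["choices"]]
    return {**item, "choices": choices}


items = [
    {"type": "categorical", "name": "dims", "choices": [[0.1, 0.01], 0.01]},
    {"type": "constant", "name": "shape", "value": [3, 3]},
]
defaults = {"categorical": passthrough, "constant": passthrough}

# Override only the categorical decoder; other types keep their defaults.
decoded = decode_items(items, {**defaults, "categorical": tuple_choices})
```

Only the categorical choices become tuples here; the list under "shape" is left untouched, whereas a type-based hook would have converted both.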
