schema_obj: refactor _search() #1222

nmoroze · 2023-01-21T22:41:04Z

This PR refactors our monolithic _search() function to make the code clearer, fix some latent bugs, improve error messages, and, most relevantly, facilitate adding special handling for global values with step/index overrides.

This PR is meant to be (mostly) a clean refactor with no change to user-facing functionality. However, there are some differences in behavior described below.

First, a couple key design decisions made in this PR:

- The underlying cfg dictionary now stores native Python types, rather than just strings. As far as I can tell, there's no reason to store values as strings -- all the Python types used are serializable to/from JSON (with the exception of tuples, which get serialized as lists). Using native Python types makes the code much clearer.
- _search() is no longer a monolithic function. I kept the name around, but the new _search() is a simple traversal function. The actual set()/get()/add() etc. logic now lives in those accessor functions themselves, which make use of a variety of new internal helper functions for shared functionality. I think this approach is much clearer.
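To illustrate the serialization point, here is a minimal sketch of a round trip through JSON. The `param` dictionary and its field names are made up for illustration and are not the actual schema layout.

```python
import json

# Hypothetical parameter entry; field names are illustrative only.
param = {
    "value": [(0.0, 1.5), (2.0, 3.5)],  # list of tuples
    "lock": False,
    "count": 3,
}

def roundtrip(cfg):
    """Serialize to JSON and back, as writing then reading a manifest would."""
    return json.loads(json.dumps(cfg))

restored = roundtrip(param)

# Booleans, ints, and floats come back as the same native types,
# but each tuple comes back as a list.
print(restored["lock"], restored["count"], restored["value"])
```

Tuples are the only type here that changes across the round trip, which is why normalizing them to lists in the schema keeps the in-memory and on-disk forms identical.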

Differences in behavior

`get()` no longer creates keypaths

I noticed that with the old _search(), get()-ing a keypath with unset free keys actually adds keys to the cfg dictionary:

>>> chip.getkeys('tool', 'surelog', 'task')
[] # (Surelog not set up yet)
>>> chip.get('tool', 'surelog', 'task', 'import', 'stdout', 'import', '0', 'suffix')
'log'
>>> chip.getkeys('tool', 'surelog', 'task')
['import']

It seemed a little odd to me that get() could have this side effect. I've changed the behavior in this PR to allow it to keep descending through the defaults (so you can recover a default value), but the dictionary doesn't actually get created. New behavior:

>>> chip.getkeys('tool', 'surelog', 'task')
[]
>>> chip.get('tool', 'surelog', 'task', 'import', 'stdout', 'import', '0', 'suffix')
'log'
>>> chip.getkeys('tool', 'surelog', 'task')
[]
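A rough sketch of the idea, using a toy nested dictionary rather than the real schema: the lookup falls back to the `'default'` subtree when a free key is missing, but never copies that subtree into the dictionary. The function name `get_readonly` and the dictionary layout are assumptions for illustration.

```python
import copy

# Toy schema: 'default' marks a free key. Layout is illustrative only.
cfg = {
    "tool": {
        "default": {
            "task": {
                "default": {
                    "suffix": {"type": "str", "value": "log"},
                },
            },
        },
    },
}

def get_readonly(cfg, *keypath):
    """Walk the keypath, falling back to 'default' without creating entries."""
    node = cfg
    for key in keypath:
        if key in node:
            node = node[key]
        elif "default" in node:
            node = node["default"]  # descend, but do not copy into cfg
        else:
            raise KeyError(f"invalid keypath {list(keypath)}: {key!r} not found")
    return node["value"]

before = copy.deepcopy(cfg)
value = get_readonly(cfg, "tool", "surelog", "task", "import", "suffix")
unchanged = (cfg == before)
```

Here `value` is the default `'log'` and `unchanged` is `True`, since the traversal only reads from the dictionary.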

Can no longer set lists to `None`

I think set() previously allowed you to provide None for a list value. This PR changes that behavior to cause it to fail the type check, since we've generally been using [] as the equivalent null value for list types. However, I'd be open to changing this behavior to accept None for list types, and have it get normalized as [] in the schema.
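The two options can be sketched side by side. This is a simplified stand-in, not the actual type check: the function name and the `allow_none` switch are hypothetical, and the sketch rejects anything that is not already a list, which may be stricter than the real code.

```python
def check_list_value(value, allow_none=False):
    """Validate a value for a list-typed parameter.

    allow_none=False mirrors the behavior described in this PR (None fails
    the type check). allow_none=True sketches the alternative of normalizing
    None to [].
    """
    if value is None:
        if allow_none:
            return []
        raise TypeError("expected a list, got None")
    if not isinstance(value, list):
        raise TypeError(f"expected a list, got {type(value).__name__}")
    return value
```

With the default setting, `check_list_value(None)` raises `TypeError`; with `allow_none=True` it returns `[]`.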

Add 'set' flag for clobber

This PR adds another flag, 'set', to parameters to indicate whether their value has been set by a user yet. Previously, clobber operated based on whether the value was "empty" (generally equivalent to falsy), but this caused unintuitive behavior if a value was explicitly set to an "empty" value (#1146). Adding the "set" flag allows us to determine whether a value has actually been explicitly set by user code.
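A minimal sketch of the difference between the two rules, using a hypothetical parameter record with `value` and `set` fields (the helper names are not from the codebase):

```python
def make_param(default=None):
    # Hypothetical parameter record carrying the 'set' flag.
    return {"value": default, "set": False}

def set_value_old(param, value, clobber=True):
    """Old-style rule: only a truthy existing value blocks a non-clobbering set."""
    if param["value"] and not clobber:
        return False
    param["value"] = value
    return True

def set_value(param, value, clobber=True):
    """New-style rule: an explicitly set value blocks a non-clobbering set."""
    if param["set"] and not clobber:
        return False
    param["value"] = value
    param["set"] = True
    return True

old = make_param()
set_value_old(old, 0)                 # user explicitly sets an "empty" value
set_value_old(old, 5, clobber=False)  # overwrites it, because 0 is falsy

new = make_param()
set_value(new, 0)
set_value(new, 5, clobber=False)      # left alone, because 'set' is True
```

After this runs, `old["value"]` is 5 while `new["value"]` is still 0.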

Closes #1146, #1188

There's no reason to store everything as a string - it seems simpler to keep everything as their native Python type. The only type that changes when converting to/from JSON are tuples, which get converted to lists. We can keep them as lists in the schema for the sake of normalization.


When dumping a manifest, some paths will not be able to be resolved, and find_files() will return lists with "None" entries. This seemed to work okay with the old set() method, but violates our new typechecks. In these cases the behavior was for the paths to not get included in the manifest, so it seems okay to write an empty list instead.


gadfort

Changes look good, I like the simplification of the type checking. If we keep the 'set' flag, you will need to update the schema version.


nmoroze · 2023-01-25T02:29:12Z

@gadfort Thanks for the review! I'll fix up the various messages per your suggestions.

RE: cfg['set'], I was originally envisioning explicitly distinguishing when the user has chosen to set the value to None, and making that require clobber to overwrite:

>>> chip.set('metric', 'syn', '0', 'errors', None)
>>> chip.set('metric', 'syn', '0', 'errors', 0, clobber=False)
# Fails!

However, in retrospect this is probably counterintuitive. I'd be open to making None a sentinel for an un-set value (and allow a user to use it to reset a value's status), and then we don't need to introduce cfg['set'].

I'm also thinking it might be reasonable to fall back to the default value in the case when a user sets None, rather than returning None itself:

>>> chip.get('option', 'quiet')
False
>>> chip.set('option', 'quiet', True)
>>> chip.set('option', 'quiet', None)
>>> chip.get('option', 'quiet')
False # (not None!)

Then that just leaves the question of whether empty list should be a similar sentinel for list-types, or if empty lists should be considered a value distinct from a None sentinel for resetting their status.
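For discussion, the None-as-sentinel alternative could look roughly like this. It is a sketch of the proposal only, with a hypothetical record layout, and it leaves the list question open:

```python
def make_param(default):
    # Hypothetical record: value None means "not set by the user".
    return {"default": default, "value": None}

def set_value(param, value, clobber=True):
    """Write value unless the param is already set and clobber is False.

    Passing None resets the parameter to its unset state.
    """
    is_set = param["value"] is not None
    if is_set and not clobber:
        return False
    param["value"] = value
    return True

def get_value(param):
    """Return the user's value, or the default if unset."""
    if param["value"] is None:
        return param["default"]
    return param["value"]

quiet = make_param(False)
set_value(quiet, True)
after_set = get_value(quiet)
set_value(quiet, None)          # reset
after_reset = get_value(quiet)  # falls back to the default
```

One consequence of this design is that there is no way to store None as a real value, which is the tradeoff raised in the follow-up comments.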


aolofsson · 2023-01-25T03:46:36Z

Not sure I fully grasp the set vs sentinel vs clobber tradeoffs here, especially as it relates to bool. Setting a parameter that we have specified as type bool to None per your example seems confusing to me. Same thing goes for lists. Is the idea to intercept a special "None" value entered through set and take some action? How do we distinguish from someone who actually wants to set a value to None?

print(type(None))

WRansohoff

Nice simplification! As a general comment, should we consider using SiliconCompilerError in any of the places where TypeError / ValueError is used?

Pro: Easy to catch in build scripts.

Con: Could accidentally swallow errors, and the Schema object is sort of logically separate from runtime SC errors.


WRansohoff · 2023-01-25T15:31:00Z

siliconcompiler/schema/schema_obj.py

+            if value == 'true': return True
+            if value == 'false': return False


if value in ('true', 'True')? Ditto for false?

I'm thinking of keeping as-is to maintain parity with our current API, which only supports lower-case "true"/"false" for strings. However, I'd be open to changing this if others think it's a good idea.
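For reference, a minimal sketch of the string-to-bool normalization under discussion. Only the two lower-case comparisons mirror the diff above; the function name `normalize_bool` and the choice of `TypeError` for everything else are assumptions.

```python
def normalize_bool(value):
    """Accept a real bool or the lower-case strings 'true'/'false'."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        if value == 'true':
            return True
        if value == 'false':
            return False
    raise TypeError(f"expected bool or 'true'/'false', got {value!r}")
```

Under this version `normalize_bool('True')` raises, which is the parity-with-current-API behavior; accepting it would mean changing the comparisons to something like `value in ('true', 'True')`.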

nmoroze · 2023-01-25T18:44:39Z

@WRansohoff good points RE: exceptions. I think tweaking how they work in a future PR could be reasonable. Some of my thinking about how things should work is captured here: #1085.

I think ultimately we should have set()/get()/add() throw TypeError/ValueError where relevant, since those are standardized - but the way we're currently catching them and re-throwing SiliconCompilerErrors in the core.py implementations of set()/get()/add() is far from ideal, since that doesn't even make the standardized errors user-facing, and it can swallow the source of other errors.
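As one possible direction (a sketch, not what the code currently does), a wrapper can catch only the specific exception types and chain the original with `raise ... from`, so the underlying error stays visible in the traceback. `WrapperError`, `schema_set`, and `core_set` are hypothetical stand-ins.

```python
class WrapperError(Exception):
    """Stand-in for a project-specific error type (hypothetical)."""

def schema_set(value):
    # Toy schema-level setter that raises a standard exception.
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value

def core_set(value):
    # Catch only the expected types and keep the original as __cause__.
    try:
        return schema_set(value)
    except (TypeError, ValueError) as e:
        raise WrapperError(str(e)) from e
```

Because the except clause is narrow, unrelated exceptions propagate untouched, and the chained `__cause__` preserves the original `TypeError` for anyone who needs it.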

nmoroze · 2023-01-25T18:50:05Z

@gadfort I think I've addressed all the feedback! At a high level:

- Keep cfg['set'], add clear() method for unsetting values (was between unset() and clear(), I liked clear() because of the analogy to list.clear() in Python)
- Improved error messages. Pushed down the keypaths into the recursive checking functions to display them, added variables/helper functions for deduplicating error messages. I realized this is clearer than catching and re-raising the exception at a higher level.
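The keypath push-down can be sketched as follows. The type-spec format here (a type name, or a one-element list for list types) is a toy invented for the example, not the schema's real type strings.

```python
def check_value(value, type_spec, keypath):
    """Recursively check value against a toy type spec.

    type_spec is 'int' or 'str', or a one-element list such as ['int'] or
    [['int']] for (nested) lists. This format is hypothetical.
    """
    if isinstance(type_spec, list):
        if not isinstance(value, list):
            raise TypeError(
                f"Invalid value {value!r} for keypath {keypath}: expected list")
        for item in value:
            # The keypath travels with the recursion, so the error raised
            # at any depth can name the parameter being set.
            check_value(item, type_spec[0], keypath)
        return
    expected = {"int": int, "str": str}[type_spec]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        raise TypeError(
            f"Invalid value {value!r} for keypath {keypath}: expected {type_spec}")
```

Since the message is built where the failure is detected, no outer layer needs to catch and re-raise just to add context.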

gadfort

Just down to the debug vs. warning question. If it's debug elsewhere, maybe we should leave it and make a note to look at them later.

gadfort · 2023-01-25T18:56:32Z

siliconcompiler/core.py

+                # TODO: this message should be pushed down into Schema.set()
+                # once we have a static logger.
+                if clobber:
+                    self.logger.debug(f'Failed to set value for {keypath}: '


should this be a warning or error instead of debug?

gadfort · 2023-01-25T18:56:39Z

siliconcompiler/core.py

+                    self.logger.debug(f'Failed to set value for {keypath}: '
+                        'parameter is locked')
+                else:
+                    self.logger.debug(f'Failed to set value for {keypath}: '


gadfort · 2023-01-25T18:56:45Z

siliconcompiler/core.py

+        self.logger.debug(f'Clearing {keypath}')
+
+        if not self.schema.clear(*keypath):
+            self.logger.debug(f'Failed to clear value for {keypath}: parameter is locked')


gadfort · 2023-01-25T18:57:13Z

siliconcompiler/core.py

+            if not self.schema.add(*args, field=field):
+                # TODO: this message should be pushed down into Schema.add()
+                # once we have a static logger.
+                self.logger.debug(f'Failed to add value for {keypath}: '



nmoroze added 7 commits January 24, 2023 11:52

schema_obj: refactor basic schema accessors

8f31d4b

Basic idea: make _search() a low-level traversal function, and encode the distinct logic of each accessor (set/get/add etc) within those functions themselves. Adds a few helper functions to factor out other shared functionality.

core: add warnings when set() or add() fails

06e8273

Eventually these warnings should be pushed back down into the schema_obj messages once we have a static logger.

Fix failing core tests

fa1af81

Fix show test

2463366

There were a couple bugs in the test:
- Attempt to set list to None
- Incorrect keypath

But this also revealed a bug in the schema refactor:
- Shouldn't allow None for lists

Reset "set" flag when clearing metrics and records

cb3437c

These values are set by the processes spawned by each node in _runtask(), and get merged into the main manifest with clobber=False. Therefore, we have to clear the set flag so they aren't dropped during the merge.
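A toy model of why the flag has to be reset. The `merge` and `clear_set_flags` helpers and the flat dictionary layout are hypothetical simplifications of the manifest merge.

```python
def merge(dst, src, clobber=True):
    """Copy values from src into dst, skipping set params when clobber=False."""
    for key, src_param in src.items():
        dst_param = dst[key]
        if dst_param["set"] and not clobber:
            continue
        dst_param["value"] = src_param["value"]
        dst_param["set"] = True

def clear_set_flags(cfg):
    for param in cfg.values():
        param["set"] = False

# Main manifest still holds a metric marked as set from earlier.
main = {"errors": {"value": 3, "set": True}}
# The node's process computed a fresh value.
node = {"errors": {"value": 0, "set": True}}

merge(main, node, clobber=False)
dropped = main["errors"]["value"]   # node's value was skipped

clear_set_flags(main)
merge(main, node, clobber=False)
merged = main["errors"]["value"]    # node's value is now picked up
```

In this model `dropped` is 3 and `merged` is 0, which matches the reasoning in the commit message.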

Only mark parameter set when value field modified

798c0c2

nmoroze force-pushed the noah/search branch from c018b6c to 798c0c2 Compare January 24, 2023 18:59

nmoroze added 5 commits January 24, 2023 14:27

Add regression test for #1146

98107aa

schema_obj: support lists-of-lists

980eb93

core: change clobber/lock messages from warning to debug

b3006c9

This matches the behavior prior to #1147

tests: add test for no side effects from get()

8134389

nmoroze marked this pull request as ready for review January 24, 2023 20:36

nmoroze requested review from gadfort, aolofsson and WRansohoff January 24, 2023 20:36

Clean up

8ee9ebb

gadfort requested changes Jan 25, 2023


aolofsson approved these changes Jan 25, 2023


WRansohoff approved these changes Jan 25, 2023


nmoroze added 2 commits January 25, 2023 13:18

schema_obj/core: implement clear() method

7127b01

schema_obj: improve error messages

f3256d9

nmoroze added 2 commits January 25, 2023 13:46

Merge branch 'main' into noah/search

ad296a2

schema: bump version and update defaults.json

2eac54a

gadfort approved these changes Jan 25, 2023


nmoroze added 2 commits January 25, 2023 14:01

schema_obj: fix error message for type check

8e481a0

tests: skip check flowgraph I/O

1da5581

Seems to be a new Yosys setup issue that it flags, possibly because the test setup doesn't align well with tool/task split changes.

gadfort merged commit 01aefd5 into main Jan 25, 2023

gadfort deleted the noah/search branch January 25, 2023 19:18

nmoroze mentioned this pull request Feb 9, 2023

Make 'tool' schema group configurable per-node #1282

Merged
