Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schema_obj: refactor _search() #1222

Merged
merged 19 commits into from Jan 25, 2023
Merged

schema_obj: refactor _search() #1222

merged 19 commits into from Jan 25, 2023

Conversation

nmoroze
Copy link
Contributor

@nmoroze nmoroze commented Jan 21, 2023

This PR refactors our monolithic _search() function to make the code clearer, fix some latent bugs, improve error messages, and, most relevantly, facilitate adding special handling for global values with step/index overrides.

This PR is meant to be (mostly) a clean refactor with no change to user-facing functionality. However, there are some differences in behavior described below.

First, a couple key design decisions made in this PR:

  • The underlying cfg dictionary now stores native Python types, rather than just strings. As far as I can tell, there's no reason to store values as strings -- all the Python types used are serializable to/from JSON (with the exception of tuples, which get serialized as lists). Using native Python types makes the code much clearer.
  • _search() is no longer a monolithic function. I kept the name around, but the new _search() is a simple traversal function. The actual set()/get()/add(), etc. functionality is now written in those functions. They make use of a variety of new internal helper functions for shared functionality. I think this approach is much clearer.

Differences in behavior

get() no longer creates keypaths

I noticed that with the old _search(), get()-ing a keypath with unset free keys actually adds keys to the cfg dictionary:

>>> chip.getkeys('tool', 'surelog', 'task')
[] # (Surelog not set up yet)
>>> chip.get('tool', 'surelog', 'task', 'import', 'stdout', 'import', '0', 'suffix')
'log'
>>> chip.getkeys('tool', 'surelog', 'task')
['import']

It seemed a little odd to me that get() could have this side effect. I've changed the behavior in this PR to allow it to keep descending through the defaults (so you can recover a default value), but the dictionary doesn't actually get created. New behavior:

>>> chip.getkeys('tool', 'surelog', 'task')
[]
>>> chip.get('tool', 'surelog', 'task', 'import', 'stdout', 'import', '0', 'suffix')
'log'
>>> chip.getkeys()
[]

Can no longer set lists to None

I think set() previously allowed you to provide None for a list value. This PR changes that behavior to cause it to fail the type check, since we've generally been using [] as the equivalent null value for list types. However, I'd be open to changing this behavior to accept None for list types, and have it get normalized as [] in the schema.

Add 'set' flag for clobber

This PR adds another flag, 'set', to parameters to indicate whether their value has been set by a user yet. Previously, clobber operated based on whether the value was of an "empty" type (generally equivalent to False-y), but this caused unintuitive behavior if a value was explicitly set to an "empty" value (#1146). Adding the "set" flag allows us to determine whether a value has actually been exlicitly set by user code before

Closes #1146, #1188

There's no reason to store everything as a string - it seems simpler to
keep everything as their native Python type.

The only type that changes when converting to/from JSON are tuples,
which get converted to lists. We can keep them as lists in the schema
for the sake of normalization.
Basic idea: make _search() a low-level traversal function, and encode
the distinct logic of each accessor (set/get/add etc) within those
functions themselves. Adds a few helper functions to factor out other
shared functionality.
Eventually these warnings should be pushed back down into the schema_obj
messages once we have a static logger.
There were a couple bugs in the test:
- Attempt to set list to None
- Incorrect keypath

But this also revealed a bug in the schema refactor:
- Shouldn't allow None for lists
These values are set by the processes spawned by each node in
_runtask(), and get merged into the main manifest with clobber=False.
Therefore, we have to clear the set flag so they aren't dropped during
the merge.
When dumping a manifest, some paths will not be able to be resolved, and
find_files() will return lists with "None" entries. This seemed to work
okay with the old set() method, but violates our new typechecks. In
these cases the behavior was for the paths to not get included in the
manifest, so it seems okay to write an empty list instead.
@nmoroze nmoroze marked this pull request as ready for review January 24, 2023 20:36
Copy link
Contributor

@gadfort gadfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, I like the simplification of the type checking. If we keep set you will need to update the schema version.

siliconcompiler/core.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/core.py Outdated Show resolved Hide resolved
@nmoroze
Copy link
Contributor Author

nmoroze commented Jan 25, 2023

@gadfort Thanks for the review! I'll fix up the various messages per your suggestions.

RE: cfg['set'], I was originally envisioning explicitly distinguishing when the user has chosen to set the value to None, and making that require clobber to overwrite:

>>> chip.set('metric', 'syn', '0', 'errors', None)
>>> chip.set('metric', 'syn', '0', 'errors', 0, clobber=False)
# Fails!

However, in retrospect this is probably counterintuitive. I'd be open to making None a sentinel for an un-set value (and allow a user to use it to reset a value's status), and then we don't need to introduce cfg['set'].

I'm also thinking it might be reasonable to fall back to the default value in the case when a user sets None, rather than returning None itself:

>>> chip.get('option', 'quiet')
False
>>> chip.set('option', 'quiet', True)
>>> chip.set('option', 'quiet', None)
False # (not None!)

Then that just leaves the question of whether empty list should be a similar sentinel for list-types, or if empty lists should be considered a value distinct from a None sentinel for resetting their status.

siliconcompiler/schema/schema_cfg.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
siliconcompiler/schema/schema_obj.py Show resolved Hide resolved
@aolofsson
Copy link
Member

Not sure I fully grasp the set vs sentinel vs clobber tradeoffs here, especially as it relates to bool. Setting a parameter that we have specified as type bool to None per your example seems confusing to me. Same thing goes for lists. Is the idea to intercept a special "None" value entered through set and take some action. How do we distinguish from someone who actually wants to set a value to None?

print(type(None))

<class 'NoneType'>

Copy link
Contributor

@WRansohoff WRansohoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice simplification! As a general comment, should we consider using SiliconCompilerError in any of the places where TypeError / ValueError is used?

Pro: Easy to catch in build scripts.

Con: Could accidentally swallow errors, and the Schema object is sort of logically separate from runtime SC errors.

siliconcompiler/schema/schema_obj.py Outdated Show resolved Hide resolved
Comment on lines +294 to +295
if value == 'true': return True
if value == 'false': return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if value in ('true', 'True')? Ditto for false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of keeping as-is to maintain parity with our current API, which only supports lower-case "true"/"false" for strings. However, I'd be open to changing this if others think it's a good idea.

@nmoroze
Copy link
Contributor Author

nmoroze commented Jan 25, 2023

@WRansohoff good points RE: exceptions. I think tweaking how they work in a future PR could be reasonable. Some of my thinking about how things should work is captured here: #1085.

I think ultimately we should have set()/get()/add() throw TypeError/ValueError where relevant, since those are standardized - but the way we're currently catching them and re-throwing SiliconCompilerErrors in the core.py implementations of set()/get()/add() is definitely unideal, since that doesn't even make the standardized errors user-facing, and it can swallow the source of other errors.

@nmoroze
Copy link
Contributor Author

nmoroze commented Jan 25, 2023

@gadfort I think I've addressed all the feedback! At a high level:

  • Keep cfg['set'], add clear() method for unsetting values (was between unset() and clear(), I liked clear() because of the analogy to list.clear() in Python)
  • Improved error messages. Pushed down the keypaths into the recursive checking functions to display them, added variables/helper functions for deduplicating error messages. I realized this is more clear than catching and re-raising the exception at a higher level.

Copy link
Contributor

@gadfort gadfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just down to the debug vs. warning. If it's debug else where maybe we should leave it and make a note to look at them later.

# TODO: this message should be pushed down into Schema.set()
# once we have a static logger.
if clobber:
self.logger.debug(f'Failed to set value for {keypath}: '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a warning or error instead of debug?

self.logger.debug(f'Failed to set value for {keypath}: '
'parameter is locked')
else:
self.logger.debug(f'Failed to set value for {keypath}: '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

self.logger.debug(f'Clearing {keypath}')

if not self.schema.clear(*keypath):
self.logger.debug(f'Failed to clear value for {keypath}: parameter is locked')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

if not self.schema.add(*args, field=field):
# TODO: this message should be pushed down into Schema.add()
# once we have a static logger.
self.logger.debug(f'Failed to add value for {keypath}: '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Seems to be a new Yosys setup issue that it flags, possibly because the
test setup doesn't align well with tool/task split changes.
@gadfort gadfort merged commit 01aefd5 into main Jan 25, 2023
@gadfort gadfort deleted the noah/search branch January 25, 2023 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make clobber interact appropriately with default values
4 participants