Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sweeper PR #214

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Sweeper PR #214

wants to merge 7 commits into from

Conversation

SabhyaC26
Copy link
Contributor

Fixes #143 #181

Changes proposed in this pull request:

  • new sweeper class for performing hyperparameter sweeps in tango

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.

@SabhyaC26 SabhyaC26 requested a review from epwalsh March 1, 2022 18:07
tango/sweeps.py Outdated
Comment on lines 40 to 44
main_config = self.override_hyperparameters(combination)
# TODO: need to figure where & how to store results / way to track runs
with run_experiment(main_config, include_package=[self.components]) as run_dir:
# TODO: fill in something here?
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_experiment is just for testing. We probably want to run our sweep experiments in subprocesses so that we can execute them in parallel. You could do something like this:

Suggested change
main_config = self.override_hyperparameters(combination)
# TODO: need to figure where & how to store results / way to track runs
with run_experiment(main_config, include_package=[self.components]) as run_dir:
# TODO: fill in something here?
pass
subprocess.call(["tango", "run", self.main_config_path, "--overrides", json.dumps(overrides)])

Of course this doesn't make anything run in parallel, but would be a good first step.

tango/sweeps.py Outdated


class Sweeper(Registrable):
def __init__(self, main_config_path: str, sweeps_config_path: str, components: str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Sweeper will also need to be aware of the target_step that spits out metric we're trying to optimize for, and the workspace.

Suggested change
def __init__(self, main_config_path: str, sweeps_config_path: str, components: str):
def __init__(self, main_config_path: str, sweeps_config_path: str, components: str, target_step: str, workspace: Workspace):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to start by implementing #142 in a separate PR

# data class that loads the parameters
# TODO: unsure about how to specify a default here?
@dataclass(frozen=True)
class SweepConfig(Params):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class SweepConfig(Params):
class SweepConfig(FromParams):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See

class TangoGlobalSettings(FromParams):
for example

@@ -0,0 +1,393 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this from the PR? You can tell git to ignore this file by adding it to your global git ignore

tango/sweeps.py Outdated
Comment on lines 8 to 16
main_config_path = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/config.jsonnet"
)
sweeps_config_path = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/sweeps-config.jsonnet"
)
components = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/basic_arithmetic.py"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
main_config_path = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/config.jsonnet"
)
sweeps_config_path = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/sweeps-config.jsonnet"
)
components = (
"/Users/sabhyac/Desktop/sabhya/tango/test_fixtures/sweeps/basic_test/basic_arithmetic.py"
)

self,
main_config_path: str,
sweeps_config_path: str,
components: str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Sweeper class needs to be aware of the Workspace being used in order to get the results of each target step, and to pass the workspace URL when it runs each experiment (with the -w option).

self.components = components

# returns all the combinations of hyperparameters in the form of a list of lists
def get_combinations(self) -> list:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compat with older Python versions:

Suggested change
def get_combinations(self) -> list:
def get_combinations(self) -> List:

self.components = components

# returns all the combinations of hyperparameters in the form of a list of lists
def get_combinations(self) -> list:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be better if this method was a generator function so that implementations could dynamically pick the next combination based on the history so far.

Comment on lines +52 to +53
# function to override all the hyperparameters in the current experiment_config
def override_hyperparameters(self, experiment_tuple: tuple) -> dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstrings should be written like this:

Suggested change
# function to override all the hyperparameters in the current experiment_config
def override_hyperparameters(self, experiment_tuple: tuple) -> dict:
def override_hyperparameters(self, experiment_tuple: tuple) -> dict:
"""
Generates a overrides dictionary for the current experiment config.
"""

sweeps_config_path: str,
components: str,
):
super(Registrable, self).__init__()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In recent versions of Python you can just do this:

Suggested change
super(Registrable, self).__init__()
super().__init__()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hyperparameter sweeps
2 participants