Current Limitations
When browsing a Python project that uses Hamilton, there is no way to tell which files are "Hamilton modules".
This has several implications:
Users don't know which modules can be imported and passed to a Driver
Users might unknowingly add functions to a module, rendering it invalid for Hamilton
Project and IDE tooling for Hamilton has no standardized, centralized way to identify Hamilton modules
Users and tools can't know which combinations of modules can be passed together to a Driver
I touched on a similar topic in Issue #747 in the context of the CLI.
Benefits
I proposed the notion of Project (to map to Hamilton UI "project"; maybe "workspace" is better) to allow users to specify "Hamilton modules".
Features it could unlock:
LSP: multi-module features
code navigation. You're currently editing hello.py, but the LSP builds the dataflow with both hello.py and world.py and knows about their nodes.
visualization. Allow viewing multiple modules in the VSCode extension instead of only the current file
CLI / pre-commit / CI: apply to all
validate all modules. A pre-commit hook can attempt to build all "single" and "composed" dataflows
generate all visualizations. Use the CLI to generate visualizations of all modules on command or commit
Hamilton UI
sync catalog without execution. The UI could better separate "historical dataflows" that were executed from "available dataflows" representing the state of the current code
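As a rough illustration of the "validate all modules" idea, here is a minimal sketch of how a CLI or pre-commit check might load each declared file and list the functions that would become nodes. The helper names `load_module_from_path` and `candidate_node_names` are hypothetical, and the rule used here (public functions defined in the module itself are node candidates, underscore-prefixed ones are helpers) is a simplifying assumption rather than Hamilton's actual graph-building logic.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(path: Path) -> ModuleType:
    """Import a Python file by path without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def candidate_node_names(module: ModuleType) -> list[str]:
    """List public functions defined in `module` (assumed node candidates)."""
    return sorted(
        name
        for name, fn in inspect.getmembers(module, inspect.isfunction)
        # Skip private helpers and functions imported from other modules.
        if not name.startswith("_") and fn.__module__ == module.__name__
    )


# Demo on a throwaway file standing in for a declared module such as hello.py.
with tempfile.TemporaryDirectory() as tmp:
    hello_path = Path(tmp) / "hello.py"
    hello_path.write_text(
        "def greeting() -> str:\n"
        "    return 'hello'\n"
        "\n"
        "def _helper() -> None:\n"
        "    pass\n"
    )
    found = candidate_node_names(load_module_from_path(hello_path))
```

A real check would go one step further and try to build a Driver from each declared combination of modules; that step is left out here because it depends on Hamilton itself.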
API design
Hamilton is designed around 2 layers: dataflow definition and dataflow execution. This API relates to dataflow definition, which requires knowing:
required: Python modules (file paths; one or more)
optional: Driver config (dict)
Given Hamilton is Python-centric, it should adopt pyproject.toml as a standard. The TOML format is also well-supported by other languages for parsing (e.g., TypeScript in VSCode extension, future Rust dev tools). The format supports the relevant types to specify the Python modules and config.
Example TOML, showing the different ways a dataflow definition can be specified:
# shortform notation
[tool.hamilton]
dataflows = [
    { name = "greetings", modules = ["world.py"] },
    { modules = ["hello.py"] },  # `name` is inferred when `len(modules) == 1`
]

# longform notation
# mutually exclusive with shortform because they both use `tool.hamilton.dataflows`
[[tool.hamilton.dataflows]]  # this adds to the list `hamilton.dataflows`
modules = ["single.py"]  # `name` is inferred when `len(modules) == 1`

[[tool.hamilton.dataflows]]
name = "composed"
modules = ["a.py", "b.py"]  # list `hamilton.dataflows[i].modules[...]`

[[tool.hamilton.dataflows]]
name = "inline_config"
modules = ["a.py"]
config = { env = "dev", owner = "me" }  # mapping `hamilton.dataflows[i].config{...}`

[[tool.hamilton.dataflows]]
name = "multiline_config"
modules = ["a.py"]
config.env = "dev"  # key-value pair `hamilton.dataflows[i].config{env: "dev"}`
config.owner = "me"
config.key1 = true
config.key2 = false
config.key3 = 12345
API extensibility
Currently, we only define tool.hamilton.dataflows, but more configuration keys can be added under tool.hamilton later.