Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add URI support to tarball path and xclbin path #1432

Merged
merged 58 commits into from
Mar 2, 2023
Merged

Conversation

sifive-benjamin-morse
Copy link
Contributor

@sifive-benjamin-morse sifive-benjamin-morse commented Feb 15, 2023

  • This allows config_hwdb.yaml to use URI's for the driver_tar and xclbin keys.
  • This also fixes an issue in PR Changed "firesim infrasetup" to deploy using a tarball.  #1299 regarding server_hardware_config being a string
  • CI runs which previously used wget to download an .xclbin (such as run-linux-poweroff-vitis.py) can now directly specify the URL as a URI

Related PRs / Issues

UI / API Impact

Allows much easier collaboration with new developers as pre-compiled resources can be put on the web simplifying first time runs.

Verilog / AGFI Compatibility

Should be no change.

Contributor Checklist

  • Is this PR's title suitable for inclusion in the changelog and have you added a changelog:<topic> label?
  • Did you add Scaladoc/docstring/doxygen to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous prints/debugging code?
  • Did you state the UI / API impact?
  • Did you specify the Verilog / AGFI compatibility impact?
  • If applicable, did you regenerate and publicly share default AGFIs?
  • If applicable, did you apply the ci:fpga-deploy label?
  • If applicable, did you apply the Please Backport label?

Reviewer Checklist (only modified by reviewer)

Note: to run CI on PRs from forks, comment @Mergifyio copy main and manage the change from the new PR.

  • Is the title suitable for inclusion in the changelog and does the PR have a changelog:<topic> label?
  • Did you mark the proper release milestone?
  • Did you check whether all relevant Contributor checkboxes have been checked?

@sifive-benjamin-morse sifive-benjamin-morse added changelog:added Put PR title in 'Added' section of changelog ci:fpga-deploy labels Feb 15, 2023
conda-reqs.yaml Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/runtime_config.py Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/firesim_topology_elements.py Outdated Show resolved Hide resolved
.github/scripts/run-linux-poweroff-vitis.py Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/runtime_config.py Outdated Show resolved Hide resolved
@sifive-benjamin-morse sifive-benjamin-morse changed the base branch from main to bmorse/main February 21, 2023 23:38
timsnyder-siv and others added 21 commits February 21, 2023 21:57
use fsspec to enable xclbin's to be one of any URI protocol
supported by the library or an installed add-on
it is the easy and safe thing to do as a first implementation. can revisit FileCache later
refactor `localize_xclbin` from the Vitis runner to `util.io.downloadURI()` and add some pytests
Co-authored-by: Filip Stamenkovic <92741622+filipstamenkovic-sifive@users.noreply.github.com>
awstools.py now depends on it via util.io
Co-authored-by: Abraham Gonzalez <abe.j.gonza@gmail.com>
* throw better exceptions when URI don't exist
* remove forced check upgront for file:// URI to exist as this will be annoying when any neighbor hwdb entry could prevent runs
* fix checking if server_hardware_config is a string or not
*

 Please enter the commit message for your changes. Lines starting
Copy link
Collaborator

@timsnyder-siv timsnyder-siv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing this toward the finish line. I only saw a couple of things I had questions or concerns about.

deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Show resolved Hide resolved
deploy/runtools/run_farm_deploy_managers.py Outdated Show resolved Hide resolved
deploy/tests/test_utils.py Outdated Show resolved Hide resolved
deploy/tests/test_utils.py Outdated Show resolved Hide resolved
deploy/tests/test_utils.py Outdated Show resolved Hide resolved
scripts/machine-launch-script.sh Show resolved Hide resolved
Copy link
Contributor

@russell-horvath russell-horvath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small thing in comment above*

@abejgonzalez
Copy link
Contributor

Uncaught 2 FPGA instance shutdown(s) detected for CI run: 4296740217. Verify CI state before submitting PR.

Benjamin Morse and others added 3 commits March 1, 2023 01:33
Copy link
Collaborator

@timsnyder-siv timsnyder-siv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for getting this really implemented @sifive-benjamin-morse

for attempt in range(retries):
assert tries > 0, "tries argument must be larger than 0"
for attempt in range(tries):
rootLogger.debug(f"Download attempt {attempt+1} of {tries}: '{uri}' to '{lpath}'")
try:
fs, rpath = url_to_fs(uri)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want the url_to_fs() call in the try block? It seems like exceptions arising here may be indicative of a larger issue that shouldn't be retried?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For background, these two runs:

fail for two different reasons, but on the same line:

  • aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host firesim-ci-vitis-xclbins.s3.***.amazonaws.com:443 ssl:default [Name or service not known]
  • aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host firesim-ci-vitis-xclbins.s3.***.amazonaws.com:443 ssl:default [Temporary failure in name resolution]

I added url_to_fs() inside the block because I wasn't sure if it did the DNS resolution or not. I will move it out. Thanks for reviewing this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog:added Put PR title in 'Added' section of changelog ci:fpga-deploy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants