Add remote tensorflow profiling to training jobs. #519

Closed
mkovalski opened this issue Jun 30, 2021 · 0 comments · Fixed by #651, #704 or #828
Labels: api: aiplatform (Issues related to the AI Platform API), type: feature request ('Nice-to-have' improvement, new feature or different behavior or design)

Comments

@mkovalski
Contributor

To support on-demand profiling of TensorFlow, add handlers for profiling requests when running training jobs.

  • The user updates their code to import the training util for profiling.
  • The user submits a training job to Vertex AI.
  • In Vertex AI TensorBoard, the user profiles the remote training job with the "Capture Profile" functionality.
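
As a rough illustration of the first step, here is a minimal sketch of what the training-side hook might look like. The issue does not name the training util, so this is an assumption: it falls back on TensorFlow's own `tf.profiler.experimental.server.start`, which opens a port that TensorBoard's "Capture Profile" can connect to. The helper names `pick_free_port` and `start_profiler_server` are hypothetical.

```python
import socket
from typing import Optional


def pick_free_port() -> int:
    """Ask the OS for an unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def start_profiler_server(port: Optional[int] = None) -> Optional[int]:
    """Start TensorFlow's profiler server inside the training process.

    Returns the port in use, or None when TensorFlow is not installed.
    This stands in for the training util the issue proposes; it is not
    the SDK's actual implementation.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if port is None:
        port = pick_free_port()
    # Real TensorFlow API: exposes a gRPC endpoint for on-demand profiling.
    tf.profiler.experimental.server.start(port)
    return port
```

A training script would call `start_profiler_server()` once, before the training loop starts. How the port is then exposed to Vertex AI TensorBoard is outside the scope of this sketch.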
@product-auto-label product-auto-label bot added the api: aiplatform Issues related to the AI Platform API. label Jun 30, 2021
@ooriana ooriana added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jul 1, 2021
gcf-merge-on-green bot pushed a commit that referenced this issue Sep 2, 2021
…erent event types (#651)

Refactors the TensorBoard uploader so that it can later use additional senders that do not process typical event files. This is initial work before adding the TensorBoard profiler event sender, and it should have no impact on current functionality.

Fixes #519
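
The refactoring described in the commit message above can be pictured roughly as follows. This is a sketch under assumptions, not the code from #651: the class names (`Sender`, `EventFileSender`, `ProfileSender`, `Uploader`) and the in-memory data are hypothetical, and the only idea taken from the commit message is that the uploader delegates to interchangeable senders, some of which do not read typical event files.

```python
from typing import Dict, List, Protocol


class Sender(Protocol):
    """Anything the uploader can delegate to (hypothetical interface)."""

    def send(self, run_name: str) -> int:
        """Upload pending data for a run; return the number of items sent."""
        ...


class EventFileSender:
    """Stand-in for a sender that handles ordinary event-file data."""

    def __init__(self, events: Dict[str, List[str]]) -> None:
        self._events = events

    def send(self, run_name: str) -> int:
        return len(self._events.get(run_name, []))


class ProfileSender:
    """Stand-in for a sender whose data does not come from event files."""

    def __init__(self, profiles: Dict[str, List[str]]) -> None:
        self._profiles = profiles

    def send(self, run_name: str) -> int:
        return len(self._profiles.get(run_name, []))


class Uploader:
    """Dispatches each run to every registered sender."""

    def __init__(self, senders: List[Sender]) -> None:
        self._senders = list(senders)

    def upload(self, run_name: str) -> int:
        # Adding a new data type only requires registering another sender.
        return sum(sender.send(run_name) for sender in self._senders)
```

With this shape, a profiler sender can be added later by registering one more object, without touching the event-file path.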