New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GCP step logging #2533
base: develop
Are you sure you want to change the base?
Fix GCP step logging #2533
Conversation
Important Auto Review SkippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the To trigger a single review, invoke the WalkthroughThe update introduces an enhancement to the logging mechanism within the Changes
Related issues
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
@coderabbitai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adtygan there's a linting error that needs fixing |
@strickvl , I ran the below commands before I made the PR
After that I ran
For For I am attaching the log file for |
https://github.com/zenml-io/zenml/actions/runs/8307099335/job/22735937951?pr=2533#step:5:27 this is the specific thing you need to fix first of all. |
2197305
to
cd34ce4
Compare
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
It seems like maybe you're using an old version of the |
cd34ce4
to
4e02b80
Compare
I figured out that I was working on a fork of the repo so my |
There are quite a few updated files in here that shouldn't need to be updated. Not sure what happened on your side or with the fork, but since the changes are small, it might be worth resetting this branch somehow so that only the changes you made are reflected in the changed files? |
I only made a change to the |
You could try that, but the point is that the other previously-modified
files will still have been previously committed. have you made sure to
installed / upgraded the latest version of the dev tools? Probably if you
do `pip install -U ruff` that would be enough, since that's the package
that's modifying / formatting files...
…On Mon, 25 Mar 2024 at 10:30, Aditya Ganesh Kumar ***@***.***> wrote:
I only made a change to the logging/step_logging.py file. But when I run
the format.sh script it modifies a lot of other files. Can I run the
script on only the login file and push a commit? I think that should fix
the issue.
—
Reply to this email directly, view it on GitHub
<#2533 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZRNJQYLQ2ZEIVVOF6DSWTYZ7VDFAVCNFSM6AAAAABEZCOQNWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJXGU3DSNZRHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I upgraded |
Yeah the updates to the important formatting packages will only come when you update |
6d5c9a0
to
40bded9
Compare
I had already run |
May I please know if there is any action required from my end to solve this issue? |
@bcdurak is looking into it / at it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @adtygan, sorry about the late response regarding this issue.
I have checked the original problem and the implementation in the PR again and there are a few points that I wanted to mention here:
- Currently, the logic around how ZenML stores the log messages works as follows:
- Whenever we start the execution of a step, we create the proper context manager (
StepLogsStorageContext
) based on the properties of this step and a unique identifier. This process is conducted only once per step execution. - During the execution, ZenML queues up the incoming log messages in a buffer. If the buffer reaches a certain size or a certain amount of time has passed, we empty it by saving the messages in the designated location using the active artifact store.
- Whenever we start the execution of a step, we create the proper context manager (
The problem arises due to the immutability of the designated file in GCS as you mentioned above. While this PR fixes the problem to some degree, unfortunately, this does not propose a full solution.
-
If I am not missing anything, in the current version of the implementation, it is redundant to read any existing log messages (and add them to the buffer) as it is impossible to for ZenML create a context manager which points to an already existing log file. As I have mentioned, this process is done once per step execution by using a unique identifier.
-
The current solution works because it removes the first threshold of the buffer by setting the
max_messages
to a high enough value. However, you would run into the same issue if the buffer is cleared not due to the size but due to the time limit. For instance, if you add atime.sleep(20)
in between two log messages, you will run into the same issue. -
Finally, the current changes impact the functionality of the entire implementation and deem the
max_messages
parameter useless unless it is explicitly defined. I would perhaps propose a solution where a patch is applied only if the current artifact store is a GCP artifact store.
Thank you once again for working on this issue. The logs storage is an important topic and we are actively looking for different ways to improve our implementation, so your contribution is much appreciated :)
Thank you for your review @bcdurak . I agree with the points you have mentioned above. If I am allowed to propose a GCP specific solution, I have 2 approaches in mind.
Which of the 2 routes do you suggest? Thanks |
linked to #2366 (closed in favor of this one) |
Hey @adtygan, I think the second approach would be better here if possible. The first one is still viable but I would only apply it if the active artifact store is a GCS artifact store. As I mentioned before, logging has been an integral part of ZenML and we will be working on improving the user experience when it comes to handling these log messages. @safoinme is currently taking a look at our implementation. He can also share any insights and recommendations he comes across. |
Describe changes
The issue arises because GCS artifacts are immutable Stack Overflow thread. To fix the issue, I suggest a fix based on @strickvl 's idea mentioned below
Alex Strick van Linschoten's Idea
Consider using the logging.StreamHandler facility to temporarily write logs to the remote file (GCS, S3, etc.). Here's an example:
This approach could fit nicely in the StepLogsStorageContext class.
My Solution
Set buffer size to
sys.maxsize
and copy the existing contents of the log file to the buffer. This prevents the triggering of the part of code that appends buffer contents to log file once buffer is full and flushes buffer. (reference)Code to test the change:
Pre-requisites
Please ensure you have done the following:
develop
and the open PR is targetingdevelop
. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes
Summary by CodeRabbit
StepLogs
to support a maximum message limit and improved handling of log file reading errors.