Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Added environment specific labels to client library when running in Cloud Run Jobs #877

Merged
merged 10 commits into from Apr 17, 2024

Conversation

gkevinzheng
Copy link
Contributor

Refactored code that adds trace_id label in App Engine to also add Cloud Run Jobs labels in the Cloud Run Jobs environment.

Fixes #867

@gkevinzheng gkevinzheng requested review from a team as code owners April 3, 2024 20:22
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: logging Issues related to the googleapis/python-logging API. labels Apr 3, 2024

def set_item(key, val):
if val:
additional_labels[key] = val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems a bit unoptimized for something that has to be run on each log

Can we remove this inner function?

I assume the env vars aren't expected to change during runtime, right? If we can rely on that, maybe we can build the additional_labels dict once at init time, and attach it to each cloud_run_jobs log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it, we can build part of the additional_labels dict once at init time, but something like setting the trace_id label will have to be done per log record, as that label depends on what the value of _trace is in each LogRecord.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env vars aren't expected to change during runtime, but I've also refactored the GAE trace_id label code into this change as well. Does that change during runtime?

if resource.type == _GAE_RESOURCE_TYPE and record._trace is not None:
# add GAE-specific label
labels = {_GAE_TRACE_ID_LABEL: record._trace, **(labels or {})}
labels = {**add_environment_labels(resource, record), **(labels or {})} or None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double-check: That this is only for the CloudLoggingHandler, not the StructuredLogHandler. That's intended, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the Stackdriver daemon adds the proper labels.

def add_environment_labels_test_flow(self, resource_type, expected_labels):
resource = Resource(type=resource_type, labels={})
actual_labels = add_environment_labels(resource, self.record)
self.assertDictEqual(actual_labels, expected_labels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is just me, but I don't like this kind of indirection for assertions. I think this is one place where DRY doesn't apply. It feels to easy to accidently make all tests into no-ops and not notice, because the assert is separate from the test code

I prefer to have asserts in each test function (and not behind any conditionals if possible).

_monitored_resources._CLOUD_RUN_JOBS_TASK_INDEX_LABEL: test_task_index,
_monitored_resources._CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL: test_task_attempt,
},
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also a test for other resource types?

Suggestion: This seems like it could be a good place for pytest.mark.paramaterize to really scale up different test cases:

@pytest.mark.paramaterize("env,record_fields,resource,expected_labels", [
    {}, {"_trace":"trace"}, "gae_app", {"appengine.googleapis.com/trace_id": "trace"},
   {"CLOUD_RUN_EXECUTION": "test_name"}, {}, "cloud_run_job", {"run.googleapis.com/execution_name": "test_name"}
  ...
])
def test_add_environment_labels(env, resource, expected_labels):
  ...

"""Builds a dictionary of labels to be inserted into a LogRecord of the given resource type.
This function should only build a dict of items that are consistent across multiple logging statements
of the same resource type, such as environment variables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe add a comment about cache?

_CLOUD_RUN_EXECUTION_ID,
)
set_item(_CLOUD_RUN_JOBS_TASK_INDEX_LABEL, _CLOUD_RUN_TASK_INDEX)
set_item(_CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL, _CLOUD_RUN_TASK_ATTEMPT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this works, but I find the indirection a bit harder to read than it probably needs to be. Maybe something like:

var_map =  ((_CLOUD_RUN_EXECUTION_ID, _CLOUD_RUN_JOBS_EXECUTION_NAME_LABEL), (_CLOUD_RUN_TASK_INDEX, _CLOUD_RUN_JOBS_TASK_INDEX_LABEL), (_CLOUD_RUN_TASK_ATTEMPT, _CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL))

for envvar, label_name in var_map:
  value = os.environ.get(envvar, None):
  if value:
    labels[label_name] = value
  

@gkevinzheng gkevinzheng added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 16, 2024
@gkevinzheng gkevinzheng enabled auto-merge (squash) April 16, 2024 20:31
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 16, 2024
@gkevinzheng gkevinzheng added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 17, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 17, 2024
@gkevinzheng gkevinzheng added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 17, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 17, 2024
@gkevinzheng gkevinzheng merged commit 9c5e8f0 into main Apr 17, 2024
17 checks passed
@gkevinzheng gkevinzheng deleted the cloud-run-job-labels branch April 17, 2024 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: logging Issues related to the googleapis/python-logging API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

missing label run.googleapis.com/execution_name instantiation on cloudrun job environment
3 participants