
Support for logging formatter class #132

Closed
keegan2149 opened this issue Dec 13, 2020 · 15 comments
Labels
api: logging Issues related to the googleapis/python-logging API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@keegan2149

Thanks for stopping by to let us know something could be better!

PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

Is your feature request related to a problem? Please describe.
N/A
Describe the solution you'd like
Support for python logging formatter class
Describe alternatives you've considered
I would like to add the function name and line number to the log entries. The only alternative is to add the formatting to each log entry manually.
Additional context
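The alternative described above — stamping the function name and line number into every message with a standard `logging.Formatter` — can be sketched with the stdlib alone (the `StringIO` stream is used here only to capture the output for demonstration):

```python
import io
import logging

# Attach a formatter whose pattern includes funcName and lineno, so every
# record carries its source location in the rendered message text.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s - %(funcName)s - %(lineno)d - %(message)s")
)

logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def do_work():
    logger.info("processing started")

do_work()
# The captured line looks like: INFO - do_work - 18 - processing started
print(stream.getvalue().strip())
```

The downside, as noted, is that this formatting lives in the message string rather than in structured LogEntry fields.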

@product-auto-label product-auto-label bot added the api: logging Issues related to the googleapis/python-logging API. label Dec 13, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Dec 14, 2020
@0xSage 0xSage added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. priority: p3 Desirable enhancement or fix. May not be included in next release. and removed triage me I really want to be triaged. labels Dec 14, 2020
@daniel-sanche
Contributor

daniel-sanche commented Dec 14, 2020

I'll look into the best way to implement this, likely next quarter. That said, we will likely add automatic source_entry detection to the handlers first, which seems like a more direct solution to your use case.

@willjhenry

willjhenry commented Dec 18, 2020

Does this issue apply to setting a formatter string, for example:

import logging

from google.cloud import logging as cloudlogging

lg_client = cloudlogging.Client()
lg_handler = lg_client.get_default_handler()
lg_handler.setLevel(logging_level)  # logging_level defined elsewhere, e.g. logging.INFO
formatter = logging.Formatter('%(levelname)s - %(funcName)s - %(lineno)d - %(message)s')
lg_handler.setFormatter(formatter)

This works when I run it locally. The log message is successfully sent to my google cloud logging logs, with the correct format. However, when I run the code live (in a cloud function) no formatting is applied.

@willjhenry

Never mind, I was looking at the logs generated by the default integration of cloud functions and the python logging module. The logs generated by this client library were formatted fine. (Now I need to figure out how to disable the default logging so I can just use this client library and not have duplicate logs)

@daniel-sanche daniel-sanche added this to the v2.5.0 milestone Jan 27, 2021
@MadJlzz

MadJlzz commented Mar 19, 2021

I am not sure if this issue reflects what I am thinking but it would be great to be able to get the formatter class or the default handler independently from the Client().

As far as I understood, creating a Client() creates a connection (gRPC?) that I think is an overhead.

In fact, the documentation states that for some services (like Cloud Functions or AppEngine), log entries are automatically parsed by Cloud Logging from stdout and stderr if they follow this structure.

It might save some resources for applications to avoid sending requests for logging, as LogEntries are already parsed by Cloud Logging.

Because I am deploying to other cloud providers as well, I would then be able to format my logs differently depending on whether I am using GCP or not.

Does that make sense?
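The stdout-parsing behaviour described above can be sketched without any client at all: on platforms that parse stdout, emitting one JSON object per line with recognized keys such as `severity` and `message` is enough (a minimal sketch; the `component` field and the `log_structured` helper are illustrative, not part of the library):

```python
import json
import sys

def log_structured(severity, message, **fields):
    """Emit one JSON line to stdout, in the shape Cloud Logging parses
    on serverless platforms (severity/message are recognized keys)."""
    entry = {"severity": severity, "message": message, **fields}
    print(json.dumps(entry), file=sys.stdout)
    return entry

record = log_structured("ERROR", "uh oh, something really bad happened...",
                        component="billing")
```

Since this only writes to stdout, there is no network call in the request path; the backend does the parsing.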

@daniel-sanche
Contributor

@MadJlzz good timing, that's exactly what I've been working on this week: #228

Starting with the next release, Cloud Functions and Cloud Run should print JSON logs directly to standard output while other environments will continue to use gRPC/HTTP clients to send logs. (Of course this applies when using the automatic logging integration, you will still be able to instantiate different handlers directly if desired)

Because I am deploying to other cloud providers as well, I would then be able to format my logs differently depending on whether I am using GCP or not.

Can you explain this part a bit more? Do you want your logs on other cloud providers to show up in GCP Cloud Logging? Or are you just interested in using the same JSON format without sending to GCP?

@MadJlzz

MadJlzz commented Mar 19, 2021

Perfect timing indeed! The fact that we won't need to write our own formatter is good news 👍

Won't App Engine be affected as well? It also uses the same pattern of reading stdout and stderr, no?

I don't know how the library will be exposed but I foresee something like this:

import logging

from google.cloud.logging import CloudLoggingHandler

# set up logging by overriding default handlers of Python logging module.
logging.basicConfig(level=logging.DEBUG)

# add the handler to the root logger
logging.getLogger('').addHandler(CloudLoggingHandler())

Can you explain this part a bit more? Do you want your logs on other cloud providers to show up in GCP Cloud Logging? Or are you just interested in using the same JSON format without sending to GCP?

It's more for being able to adapt the format depending on the platform we deploy!

  1. Reading JSON logs when developing locally is really painful, so we would like to use a human-readable format.

  2. On-premise users usually run their own log aggregators, which may filter out log entries that are not properly formatted...

  3. Other cloud providers might use a different format to automatically aggregate logs from their services.

The upper code block would then look like this:

import logging

from google.cloud.logging import CloudLoggingHandler

# set up logging by overriding default handlers of Python logging module.
logging.basicConfig(level=logging.DEBUG)

root_logger = logging.getLogger('')

# our context module knows if we are on GCP, AWS, Azure, on-premise, or even local.
# it automatically attaches the proper handler to the given logger.
context.addLoggingHandler(root_logger)

# and then in each .py file
logger = logging.getLogger(__name__)

# log record properly formatted, thanks to the handlers ;)
logger.error('uh oh, something really bad happened...')

@daniel-sanche
Contributor

Won't App Engine be affected as well? It also uses the same pattern of reading stdout and stderr, no?

App Engine will stay as-is for now. We want to avoid making major breaking changes that could impact existing workflows, and App Engine doesn't exhibit some of the problems other serverless environments have.

I don't know how the library will be exposed but I foresee something like this:

This is how the library is currently set-up:

import google.cloud.logging
import logging

client = google.cloud.logging.Client()
client.setup_logging()
logging.warning("Hello world")

Reading JSON logs when developing locally is really painful, so we would like to use a human-readable format.

You shouldn't be seeing JSON logs when developing locally. The library should detect which GCP environment you're on, and only output JSON on GKE, GCF, or Cloud Run.
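A rough sketch of that kind of environment detection, for readers curious how it can work. The env-var names here (`K_SERVICE` on Cloud Run, `FUNCTION_TARGET` on Cloud Functions, `KUBERNETES_SERVICE_HOST` on Kubernetes) are common platform signals, but this is an illustration, not the library's actual detection logic:

```python
import os

# Env vars commonly set by the platforms that support structured JSON logging.
_STRUCTURED_LOG_ENV_VARS = (
    "K_SERVICE",                 # Cloud Run
    "FUNCTION_TARGET",           # Cloud Functions
    "KUBERNETES_SERVICE_HOST",   # Kubernetes / GKE
)

def on_structured_logging_platform() -> bool:
    """Return True if any of the known platform env vars is set."""
    return any(os.environ.get(var) for var in _STRUCTURED_LOG_ENV_VARS)
```

Locally none of these variables is usually set, so such a check falls through to the human-readable/network-client path.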

@MadJlzz

MadJlzz commented Mar 23, 2021

App Engine will stay as-is for now. We want to avoid making major breaking changes that could impact existing workflows, and App Engine doesn't exhibit some of the problems other serverless environments have.

Ok, I understand.

Because of:

You shouldn't be seeing JSON logs when developing locally. The library should detect which GCP environment you're on, and only output JSON on GKE, GCF, or Cloud Run.

Initializing a google.cloud.logging.Client() will not require any credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS) if I am working locally, then?

@keegan2149
Author

So the only choices, without having to install two or more modules, are JSON logs and unformatted logs? It seemed like a small change. I'm running three different libraries: one for my logs, one for Google, and a third for my analytics vendor. I think a lot of customers end up going the same way. It's your project, I guess. 🤷🏾‍♂️

@daniel-sanche
Contributor

daniel-sanche commented Mar 23, 2021

@MadJlzz

Initializing a google.cloud.logging.Client() will not require any credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS) if I am working locally, then?

If you're running your code locally, you likely will need GOOGLE_APPLICATION_CREDENTIALS. The library will fall back to using gRPC to send logs to Cloud Logging, which requires authentication.

Are you saying you only want the library active when on a GCP environment? You could always check a flag in your code. If client.setup_logging() isn't run, your code will fall back to standard Python logging behaviour (although of course then you won't see logs in Cloud Logging)

import os
import logging

import google.cloud.logging

if os.environ.get('GCP_FLAG'):
    client = google.cloud.logging.Client()
    client.setup_logging()
logging.warning("Hello world")

If I'm misunderstanding what you're trying to do, let me know and I can try to help

@daniel-sanche
Contributor

daniel-sanche commented Mar 23, 2021

@keegan2149

The changes you are asking for are being worked on, but it takes time. You should see file/line information start showing up in LogEntries in the next release. Full support for custom filters may show up at the same time, but may be the release after.

If you can tell me what GCP platform (or local/other) you're running your code on, I can try to write up a code snippet to accomplish the same thing now as well

@keegan2149
Author

@daniel-sanche I'm still worried about performance. The code from my analytics vendor is slowing down my application. I don't want to make multiple network requests if I don't have to. I'm using app engine and GKE.

@daniel-sanche
Contributor

The logs are batched and flushed in groups, so there may be fewer network calls than you're expecting.

Still, since you're using environments that support JSON structured logging, you can force the library to use the ContainerEngineHandler. This will simply print logs to stdout in JSON format, which is parsed on the GCP backend. No added in-process latency:

import logging

from google.cloud.logging_v2.handlers import ContainerEngineHandler, setup_logging

handler = ContainerEngineHandler()
setup_logging(handler)

logging.info("hello world")

@MadJlzz

MadJlzz commented Mar 23, 2021

@daniel-sanche basically, your example is what I am doing now.

I have been tricked because my app was using a really old version of google-cloud-logging that was confusing me. Particularly this.

The change you did for CF and Cloud Run to let the GCP backend process the logs is good enough for me.

Thanks!

@daniel-sanche
Contributor

As of the v2.4.0 release, StructuredLogHandler and CloudLoggingHandler should capture source location information automatically. You can also manually set fields using the extra argument:

logging.warning("Hello world", extra={"source_location":{"lineno":1}})

This should work automatically in all environments except App Engine and GKE, which will be updated to use the new handlers by default in the upcoming v3.0.0 release.
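The `extra` argument above is plain stdlib behaviour: every key in the dict becomes an attribute on the `LogRecord`, which a handler can then read. A toy handler makes the mechanism visible (`CaptureHandler` is hypothetical, for demonstration only; it stands in for what the Cloud handlers do with these attributes):

```python
import logging

captured = {}

class CaptureHandler(logging.Handler):
    """Toy handler that reads the attribute the `extra` dict attached
    to the LogRecord -- the same stdlib mechanism the Cloud handlers
    rely on to pick up fields like source_location."""
    def emit(self, record):
        captured["source_location"] = getattr(record, "source_location", None)

logger = logging.getLogger("extra-demo")
logger.addHandler(CaptureHandler())
logger.propagate = False  # keep the record out of the root handlers

logger.warning("Hello world", extra={"source_location": {"lineno": 1}})
print(captured["source_location"])  # → {'lineno': 1}
```

Anything a handler doesn't recognize is simply ignored, so passing `extra` is safe even when a plain `StreamHandler` is attached instead.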
