Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support http_request field #120

Merged
merged 27 commits into from Dec 16, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
55383da
added http request to transports
daniel-sanche Dec 9, 2020
3c67803
got it working on gae flex
daniel-sanche Dec 9, 2020
4be2e19
refactored _helpers to support http_requests
daniel-sanche Dec 9, 2020
61115d1
added comments
daniel-sanche Dec 9, 2020
f7aa0e0
fixed helper tests
daniel-sanche Dec 9, 2020
1459f7e
changed order of outputs
daniel-sanche Dec 9, 2020
96cf896
improved test_django
daniel-sanche Dec 9, 2020
a8d54c8
added django http tests
daniel-sanche Dec 9, 2020
841e68a
added tests to flask
daniel-sanche Dec 9, 2020
a3bd27c
fixed appengine tests
daniel-sanche Dec 9, 2020
e8733f4
fixed transport tests
daniel-sanche Dec 10, 2020
2f73049
ran blacken
daniel-sanche Dec 10, 2020
e6ccd30
fixed lint issue
daniel-sanche Dec 10, 2020
b653b22
fixed trace path
daniel-sanche Dec 10, 2020
6ca954e
added protocol to http request
daniel-sanche Dec 11, 2020
16e7888
added supported frameworks
daniel-sanche Dec 11, 2020
36a71ab
added tests for protocol
daniel-sanche Dec 11, 2020
6b549bd
removed unneded assertions
daniel-sanche Dec 11, 2020
e91548c
renamed test function
daniel-sanche Dec 11, 2020
5f9d932
added tests for sparse http requests
daniel-sanche Dec 11, 2020
69c0d39
added comment
daniel-sanche Dec 11, 2020
5bb9e51
blackened
daniel-sanche Dec 11, 2020
6d1974d
use proto instead of dict for http_request
daniel-sanche Dec 11, 2020
f0f5334
added test for when libraries aren't imported
daniel-sanche Dec 12, 2020
fe9104c
use kwargs in transports instead of listing all LogEntry fields
daniel-sanche Dec 14, 2020
95ceb66
blackened
daniel-sanche Dec 14, 2020
399cbf7
fixed docs suggestions
daniel-sanche Dec 15, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
88 changes: 60 additions & 28 deletions google/cloud/logging_v2/handlers/_helpers.py
Expand Up @@ -23,9 +23,14 @@
flask = None

from google.cloud.logging_v2.handlers.middleware.request import _get_django_request
from google.logging.type.http_request_pb2 import HttpRequest

_DJANGO_TRACE_HEADER = "HTTP_X_CLOUD_TRACE_CONTEXT"
_DJANGO_USERAGENT_HEADER = "HTTP_USER_AGENT"
_DJANGO_REMOTE_ADDR_HEADER = "REMOTE_ADDR"
_DJANGO_REFERER_HEADER = "HTTP_REFERER"
_FLASK_TRACE_HEADER = "X_CLOUD_TRACE_CONTEXT"
_PROTOCOL_HEADER = "SERVER_PROTOCOL"


def format_stackdriver_json(record, message):
Expand All @@ -46,59 +51,86 @@ def format_stackdriver_json(record, message):
return json.dumps(payload)


def get_trace_id_from_flask():
"""Get trace_id from flask request headers.
def get_request_data_from_flask():
"""Get http_request and trace data from flask request headers.

Returns:
str: TraceID in HTTP request headers.
Tuple[Optional[google.logging.type.http_request_pb2.HttpRequest], Optional[str]]:
Data related to the current http request and the trace_id for the
request. Both fields will be None if a flask request isn't found.
"""
if flask is None or not flask.request:
return None
return None, None

# build http_request
http_request = HttpRequest(
request_method=flask.request.method,
request_url=flask.request.url,
request_size=flask.request.content_length,
user_agent=flask.request.user_agent.string,
remote_ip=flask.request.remote_addr,
referer=flask.request.referrer,
protocol=flask.request.environ.get(_PROTOCOL_HEADER),
)

# find trace id
trace_id = None
header = flask.request.headers.get(_FLASK_TRACE_HEADER)
if header:
trace_id = header.split("/", 1)[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future edge case to take care of:

Given user input is X-Cloud-Trace-Context: TRACE_ID/SPAN_ID;o=TRACE_TRUE where any of the values can be nil or 0. This logic to split by / and grab the first value might give you span id instead.

Seems super niche edge case, but I actually got a similar ticket for this in Go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm not sure I understand. Can you have a span without a trace? Do you have a ticket or doc I can look over?

Either way, I'm not a huge fan of this parsing logic. I'm hoping to fix it up when I add support for spans soon

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be possible to have a trace as null but span as non-null. Curious how this was a case in Go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The edge cases I'm referring to here is just that the X-Cloud-Trace-Context field is provided by the user.

So the user could manually write anything here. e.g. '/hello/world' and we'd grep the first token as trace_id. Adding better regex later on could help.

This was the original Go issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where we tell users they can manually write this header: https://cloud.google.com/trace/docs/setup#force-trace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm good to keep in mind, yeah it might be helpful to throw an exception or something if the header doesn't match the expected format

I'll probably fix that in a later PR when I add support for spans though. That's when proper parsing will be more important. For now, this is the logic that was already in place


if header is None:
return None

trace_id = header.split("/", 1)[0]

return trace_id
return http_request, trace_id


def get_trace_id_from_django():
"""Get trace_id from django request headers.
def get_request_data_from_django():
"""Get http_request and trace data from django request headers.

Returns:
str: TraceID in HTTP request headers.
Tuple[Optional[google.logging.type.http_request_pb2.HttpRequest], Optional[str]]:
Data related to the current http request and the trace_id for the
request. Both fields will be None if a django request isn't found.
"""
request = _get_django_request()

if request is None:
return None
return None, None
# build http_request
http_request = HttpRequest(
request_method=request.method,
request_url=request.build_absolute_uri(),
request_size=len(request.body),
user_agent=request.META.get(_DJANGO_USERAGENT_HEADER),
remote_ip=request.META.get(_DJANGO_REMOTE_ADDR_HEADER),
referer=request.META.get(_DJANGO_REFERER_HEADER),
protocol=request.META.get(_PROTOCOL_HEADER),
)

# find trace id
trace_id = None
header = request.META.get(_DJANGO_TRACE_HEADER)
if header is None:
return None

trace_id = header.split("/", 1)[0]
if header:
trace_id = header.split("/", 1)[0]

return trace_id
return http_request, trace_id


def get_trace_id():
"""Helper to get trace_id from web application request header.
def get_request_data():
"""Helper to get http_request and trace data from supported web
frameworks (currently supported: Flask and Django).

Returns:
str: TraceID in HTTP request headers.
Tuple[Optional[google.logging.type.http_request_pb2.HttpRequest], Optional[str]]:
Data related to the current http request and the trace_id for the
request. Both fields will be None if a supported web request isn't found.
"""
checkers = (
get_trace_id_from_django,
get_trace_id_from_flask,
get_request_data_from_django,
get_request_data_from_flask,
)

for checker in checkers:
trace_id = checker()
if trace_id is not None:
return trace_id
http_request, trace_id = checker()
if http_request is not None:
return http_request, trace_id

return None
return None, None
19 changes: 11 additions & 8 deletions google/cloud/logging_v2/handlers/app_engine.py
Expand Up @@ -21,7 +21,7 @@
import logging
import os

from google.cloud.logging_v2.handlers._helpers import get_trace_id
from google.cloud.logging_v2.handlers._helpers import get_request_data
from google.cloud.logging_v2.handlers.transports import BackgroundThreadTransport
from google.cloud.logging_v2.resource import Resource

Expand Down Expand Up @@ -96,7 +96,7 @@ def get_gae_labels(self):
"""
gae_labels = {}

trace_id = get_trace_id()
_, trace_id = get_request_data()
if trace_id is not None:
gae_labels[_TRACE_ID_LABEL] = trace_id

Expand All @@ -114,11 +114,14 @@ def emit(self, record):
"""
message = super(AppEngineHandler, self).format(record)
gae_labels = self.get_gae_labels()
trace_id = (
"projects/%s/traces/%s" % (self.project_id, gae_labels[_TRACE_ID_LABEL])
if _TRACE_ID_LABEL in gae_labels
else None
)
http_request, trace_id = get_request_data()
if trace_id is not None:
trace_id = f"projects/{self.project_id}/traces/{trace_id}"
self.transport.send(
record, message, resource=self.resource, labels=gae_labels, trace=trace_id
record,
message,
resource=self.resource,
labels=gae_labels,
trace=trace_id,
http_request=http_request,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you're using http_pb2_request does that work properly with the LogEntry?

http_request = resource.get("httpRequest")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it should. This is the type used by the underlying LogEntry proto, so I assumed it would be preferred:

http_request = proto.Field(

But I'm still getting used to working with gapic code. Do you prefer to just use dictionaries for these kinds of fields?

)
36 changes: 6 additions & 30 deletions google/cloud/logging_v2/handlers/transports/background_thread.py
Expand Up @@ -222,31 +222,21 @@ def _main_thread_terminated(self):
file=sys.stderr,
)

def enqueue(
self, record, message, *, resource=None, labels=None, trace=None, span_id=None
):
def enqueue(self, record, message, **kwargs):
"""Queues a log entry to be written by the background thread.

Args:
record (logging.LogRecord): Python log record that the handler was called with.
message (str): The message from the ``LogRecord`` after being
formatted by the associated log formatters.
resource (Optional[google.cloud.logging_v2.resource.Resource]):
Monitored resource of the entry
labels (Optional[dict]): Mapping of labels for the entry.
trace (Optional[str]): TraceID to apply to the logging entry.
span_id (Optional[str]): Span_id within the trace for the log entry.
Specify the trace parameter if span_id is set.
kwargs: Additional optional arguments for the logger
"""
queue_entry = {
"info": {"message": message, "python_logger": record.name},
"severity": _helpers._normalize_severity(record.levelno),
"resource": resource,
"labels": labels,
"trace": trace,
"span_id": span_id,
"timestamp": datetime.datetime.utcfromtimestamp(record.created),
}
queue_entry.update(kwargs)
self._queue.put_nowait(queue_entry)

def flush(self):
Expand Down Expand Up @@ -291,30 +281,16 @@ def __init__(
)
self.worker.start()

def send(
self, record, message, resource=None, labels=None, trace=None, span_id=None
):
def send(self, record, message, **kwargs):
"""Overrides Transport.send().

Args:
record (logging.LogRecord): Python log record that the handler was called with.
message (str): The message from the ``LogRecord`` after being
formatted by the associated log formatters.
resource (Optional[google.cloud.logging_v2.resource.Resource]):
Monitored resource of the entry.
labels (Optional[dict]): Mapping of labels for the entry.
trace (Optional[str]): TraceID to apply to the logging entry.
span_id (Optional[str]): span_id within the trace for the log entry.
Specify the trace parameter if span_id is set.
kwargs: Additional optional arguments for the logger
"""
self.worker.enqueue(
record,
message,
resource=resource,
labels=labels,
trace=trace,
span_id=span_id,
)
self.worker.enqueue(record, message, **kwargs)

def flush(self):
"""Submit any pending log records."""
Expand Down
8 changes: 2 additions & 6 deletions google/cloud/logging_v2/handlers/transports/base.py
Expand Up @@ -22,18 +22,14 @@ class Transport(object):
client and name object, and must override :meth:`send`.
"""

def send(
self, record, message, *, resource=None, labels=None, trace=None, span_id=None
):
def send(self, record, message, **kwargs):
"""Transport send to be implemented by subclasses.

Args:
record (logging.LogRecord): Python log record that the handler was called with.
message (str): The message from the ``LogRecord`` after being
formatted by the associated log formatters.
resource (Optional[google.cloud.logging_v2.resource.Resource]):
Monitored resource of the entry.
labels (Optional[dict]): Mapping of labels for the entry.
kwargs: Additional optional arguments for the logger
"""
raise NotImplementedError

Expand Down
15 changes: 3 additions & 12 deletions google/cloud/logging_v2/handlers/transports/sync.py
Expand Up @@ -30,26 +30,17 @@ class SyncTransport(Transport):
def __init__(self, client, name):
self.logger = client.logger(name)

def send(
self, record, message, *, resource=None, labels=None, trace=None, span_id=None
):
def send(self, record, message, **kwargs):
"""Overrides transport.send().

Args:
record (logging.LogRecord):
Python log record that the handler was called with.
message (str): The message from the ``LogRecord`` after being
formatted by the associated log formatters.
resource (Optional[~logging_v2.resource.Resource]):
Monitored resource of the entry.
labels (Optional[dict]): Mapping of labels for the entry.
kwargs: Additional optional arguments for the logger
"""
info = {"message": message, "python_logger": record.name}
self.logger.log_struct(
info,
severity=_helpers._normalize_severity(record.levelno),
resource=resource,
labels=labels,
trace=trace,
span_id=span_id,
info, severity=_helpers._normalize_severity(record.levelno), **kwargs,
)