This repository has been archived by the owner on Apr 3, 2019. It is now read-only.

First pass metrics logging #372

Closed
kparlante opened this issue Dec 4, 2013 · 11 comments

@kparlante

@dannycoates: Related to #349 and #351. This is the first set of events to log for end-to-end testing.

Strawman proposal, written as JSON:

"event": {
    "eventName": "account_create_started",
    "timestamp": 1373575200000,
    "startOffset": 4777,
    "service": "FTU",
    "flow": "CreateAccount",
    "device": "FxOS 18",
    "locale": "en"
  }
  • Logged at the point the account is successfully created:
"event": {
    "eventName": "account_created",
    "timestamp": 1373575200000,
    "startOffset": 4777,
    "service": "FTU",
    "flow": "CreateAccount",
    "device": "FxOS 18",
    "locale": "en"
  }
"event": {
    "eventName": "device_attach_started",
    "timestamp": 1373575200000,
    "startOffset": 4777,
    "service": "FTU",
    "flow": "CreateAccount",
    "device": "FxOS 18",
    "locale": "en"
  }
  • Logged at the point the device is successfully attached; flow could be either "CreateAccount" or "AttachDevice":
"event": {
    "eventName": "device_attached",
    "timestamp": 1373575200000,
    "startOffset": 4777,
    "service": "FTU",
    "flow": "CreateAccount",
    "device": "FxOS 18",
    "locale": "en"
  }
"event": {
    "eventName": "certificate_signed",
    "timestamp": 1386191112,
    "startOffset": 4777,
    "service": "FTU",
    "flow": "DeviceAttached",
    "device": "FxOS 18",
    "locale": "en"
  }
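
For anyone who wants to generate these records in a test harness, here is a minimal sketch in Python (used only for illustration; it is not auth-server code). The field names come from the strawman above. The `make_event` helper, its defaults, and the choice of millisecond timestamps are assumptions.

```python
import json
import time


def make_event(event_name, flow, service="FTU", device="FxOS 18",
               locale="en", start_offset=0, now_ms=None):
    """Build one strawman metrics event as a plain dict.

    Hypothetical helper: only the field names come from the proposal.
    """
    if now_ms is None:
        # Assumption: timestamps are milliseconds since the epoch.
        now_ms = int(time.time() * 1000)
    return {
        "event": {
            "eventName": event_name,
            "timestamp": now_ms,
            "startOffset": start_offset,
            "service": service,
            "flow": flow,
            "device": device,
            "locale": locale,
        }
    }


# Serialize one event as a single line of valid JSON.
print(json.dumps(make_event("account_created", "CreateAccount",
                            start_offset=4777, now_ms=1373575200000)))
```

Going through `json.dumps` guarantees that string values such as the event name and locale are quoted.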
@pdehaan
Contributor

pdehaan commented Dec 10, 2013

@dannycoates did you want to take this (and give it a milestone)?

@ghost ghost assigned dannycoates Dec 10, 2013
@ckarlof ckarlof modified the milestones: Feb 14, Q1-2014 (Mar 31) Feb 4, 2014
@ckarlof
Contributor

ckarlof commented Feb 4, 2014

@kparlante is blocked on this.

@dannycoates
Contributor

I'd like to solve this in a way that auth-server logic won't ever have to do anything special for whatever type of analysis we want to do in the future. We added log.security for security events which I've already begun to hate because keeping these lines of code correct is tedious and error-prone since they don't affect the actual flow of control. They're basically like comments. I'd rather not keep adding special logging for each thing that's interested in what the auth-server is doing.

I'd like the auth-server to log one line at the end of each request with enough information that any "following" process can get the data it needs to do its thing. Metrics is special because it needs additional data that the auth-server itself doesn't care about (or even have). I'm cool with adding an optional field to the POST JSON as an opaque pass-through field that will be logged in the request summary.

I'm pretty sure that each case a follower would be interested in is covered by the response we send to the client and hence the proposed summary log line. If not, we're missing error codes.
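
A rough sketch of the one-line-per-request idea, in Python for illustration only. The `summarize_request` name, the summary keys (`op`, `path`, `status`, `errno`, `t_ms`), the example path, and the `metrics` pass-through key are all assumptions, not the auth-server's actual format.

```python
import json
import time


def summarize_request(method, path, status, errno, started_at, body):
    """Return a single JSON log line describing a finished request.

    Hypothetical shape: every key name here is an illustrative assumption.
    """
    summary = {
        "op": "request.summary",
        "method": method,
        "path": path,
        "status": status,
        "errno": errno,  # error code sent to the client, 0 on success
        "t_ms": int((time.time() - started_at) * 1000),
    }
    # Opaque pass-through: copied verbatim and never inspected by the server.
    if isinstance(body, dict) and "metrics" in body:
        summary["metrics"] = body["metrics"]
    return json.dumps(summary, sort_keys=True)


started = time.time()
body = {"email": "a@example.com", "metrics": {"flow": "CreateAccount"}}
print(summarize_request("POST", "/account/create", 200, 0, started, body))
```

The point of the design is that the handler logic never calls a metrics-specific logger. A follower reads the summary line and picks out what it needs.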

I'd like to try something like this first because, if it works, it should be easier for everyone. If it doesn't, then I'll suck it up and add special logging :)

@dannycoates
Contributor

#541

@ckarlof
Contributor

ckarlof commented Feb 6, 2014

Go for it!

@kparlante
Author

@dannycoates: FWIW, this makes sense to me. Error-prone metrics gathering that is out of sync with the actual flow of control has bedeviled us in the past. As long as the logging happens in the error cases (as you mentioned), this approach sounds more reliable.

@kparlante
Author

@dannycoates: A few notes on the fields:

  1. "eventName": we can align the "eventName" exactly to the endpoint. Presumably whether or not the endpoint "succeeded" will also be logged (versus the separate "device_attach_started" and "device_attached" events described above).
  2. "device" can just be a pass-through of the user agent.
  3. "service" can be passed on for endpoints where we have it.
  4. "locale" can be a pass-through of the locale.
  5. "flow": if the auth-server can infer this, or pass along information so that a heka filter can infer it, that would be great. (If not, we can pass it through from the client eventually, as you described above.)
  6. We can ignore "startOffset" for now; again, it is pass-through information.

The fields are listed above in order of importance with respect to the user stories that are prioritized most highly.
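
The mapping in the list above could look roughly like this on the follower side. This is a Python sketch under stated assumptions: the summary keys (`path`, `status`, `userAgent`, `locale`, `metrics`) and the `to_metrics_event` helper are hypothetical, and treating HTTP 200 as "succeeded" is a simplification.

```python
def to_metrics_event(summary):
    """Derive a metrics event from one request-summary record.

    Hypothetical follower-side helper; the summary keys are assumptions.
    """
    passthrough = summary.get("metrics") or {}
    event = {
        # 1. eventName is aligned exactly to the endpoint.
        "eventName": summary["path"],
        # Assumption: success is read off the HTTP status.
        "succeeded": summary.get("status") == 200,
        # 2. device is a pass-through of the user agent.
        "device": summary.get("userAgent"),
        # 4. locale is a pass-through of the locale.
        "locale": summary.get("locale"),
    }
    # 3, 5, 6: copied only when the client (or server) supplied them.
    for key in ("service", "flow", "startOffset"):
        if key in passthrough:
            event[key] = passthrough[key]
    return event


print(to_metrics_event({
    "path": "/account/create",
    "status": 200,
    "userAgent": "Mozilla/5.0 (Mobile; rv:18.0) Gecko/18.0 Firefox/18.0",
    "locale": "en",
    "metrics": {"flow": "CreateAccount"},
}))
```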

@dannycoates
Contributor

To clarify one point that isn't obvious from my description: the summary line will contain richer data than is sent to the client. For example, if we track the failed auth attempt count per email, it will appear in the summary but not in the error sent to the client.

The auth-server will probably log details that we don't want to keep in a log file directly (or at least archive) but that are useful to perform aggregation on. I imagine heka could aggregate and anonymize the data for suitable long-term storage. My point is that we probably shouldn't keep raw auth-server log files long-term.

@kparlante
Author

Yes, I'm assuming that heka will aggregate, anonymize, and otherwise massage data into usefulness before it lands in elasticsearch. (And I'm working with trink to make that happen for the PM/UX related metrics data -- devops is working on getting us a dev environment with production data). That is where we can massage the user agent into device categories, for example. Agreed that we shouldn't be querying raw auth-server log files, or keeping them around long-term.
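
To make the aggregate-and-anonymize step concrete, here is a small Python sketch. It is not heka code. The record keys, the keyed-hash scheme, and the crude user-agent rule are all assumptions chosen for illustration.

```python
import hashlib
import hmac
from collections import Counter


def anonymize(email, secret):
    """Replace an email with a keyed hash so raw addresses are not stored."""
    digest = hmac.new(secret, email.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def device_category(user_agent):
    """Very crude user-agent bucketing; the categories are hypothetical."""
    if not user_agent:
        return "unknown"
    return "mobile" if "mobile" in user_agent.lower() else "desktop"


def aggregate(records, secret):
    """Count requests per (endpoint, device category) and distinct users."""
    counts = Counter()
    users = set()
    for rec in records:
        counts[(rec["path"], device_category(rec.get("userAgent")))] += 1
        if rec.get("email"):
            users.add(anonymize(rec["email"], secret))
    return counts, len(users)


records = [
    {"path": "/account/create", "userAgent": "Mozilla/5.0 (Mobile; rv:18.0)",
     "email": "A@example.com"},
    {"path": "/account/create", "userAgent": "Mozilla/5.0 (Mobile; rv:18.0)",
     "email": "a@example.com"},
    {"path": "/account/login", "userAgent": None, "email": "b@example.com"},
]
counts, distinct_users = aggregate(records, b"example-secret")
print(dict(counts), distinct_users)
```

Only the counts and hashed identifiers would be kept long-term, and the raw lines could then be discarded.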

@seanmonstar
Contributor

Just to toss in another option: I solved this in Persona using intel. We just logged whatever we wanted, then configured handlers to listen for certain messages, format them, and send them to the proper destination (log file, kpiggybank, statsd, etc.).
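
The handler approach can be illustrated with Python's standard `logging` module, which offers a comparable handler and filter model. This is not intel's API. The `metrics.` message prefix, the `kpi` format, and the in-memory handlers are stand-ins for real destinations.

```python
import logging


class PrefixFilter(logging.Filter):
    """Pass only records whose message starts with the given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix

    def filter(self, record):
        return record.getMessage().startswith(self.prefix)


class ListHandler(logging.Handler):
    """Collect formatted lines in memory; stands in for a real destination."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


log = logging.getLogger("auth")
log.setLevel(logging.INFO)
log.propagate = False

# Handler that listens only for metrics messages and formats them its own way.
metrics = ListHandler()
metrics.addFilter(PrefixFilter("metrics."))
metrics.setFormatter(logging.Formatter("kpi %(message)s"))
log.addHandler(metrics)

# Handler that receives everything, like a plain log file.
everything = ListHandler()
log.addHandler(everything)

log.info("metrics.account_created")
log.info("db.connected")
print(metrics.lines)
print(everything.lines)
```

The application code logs freely, and routing decisions live entirely in the handler configuration.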

@dannycoates
Contributor

Closed by #565

rfk pushed a commit that referenced this issue Oct 24, 2018
chore(awsbox): remove unused awsbox