Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Ragna API logging format customizable with sensible default #377

Open
peachkeel opened this issue Mar 26, 2024 · 14 comments
Open

Make Ragna API logging format customizable with sensible default #377

peachkeel opened this issue Mar 26, 2024 · 14 comments
Labels
type: enhancement 💅 New feature or request
Milestone

Comments

@peachkeel
Copy link
Contributor

peachkeel commented Mar 26, 2024

Feature description

Oftentimes, I'll start up Ragna like so to create a log:

ragna api 2>&1 | tee ragna.log

I can then grep the log to find errors or exceptions:

grep -i -P -B1 "ERROR:" ragna.log

The results might look something like this:

INFO:     127.0.0.1:56789 - "POST /chats/88d89303-3360-41b4-b405-3531ea884ff0/answer HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application

It would be better, though, if they looked something like this:

[Tue Mar 26 08:28:57 2024] INFO:     127.0.0.1:56789 - "POST /chats/88d89303-3360-41b4-b405-3531ea884ff0/answer HTTP/1.1" 500 Internal Server Error
[Tue Mar 26 08:28:58 2024] ERROR:    Exception in ASGI application

Value and/or benefit

Frequently, I'm looking for HTTP 429 Too Many Requests or similar errors having to do with rate limiting. Knowing when they occurred can be beneficial in rescheduling API calls, especially if a daily limit has been breached.

Yes, one could maybe look at the time the log file was last modified; however, that's a pretty fragile solution because any intentional or unintentional update to the log would result in the loss of that information.

Anything else?

cc @Tengal-Teemo

@peachkeel peachkeel added the type: enhancement 💅 New feature or request label Mar 26, 2024
@Tengal-Teemo
Copy link
Contributor

@pmeier @peachkeel It shall be done

@Tengal-Teemo
Copy link
Contributor

This feature has been somewhat implemented in this branch on @peachkeel ragna fork. It is currently trying to mimic Common Logging Format.
Example Log:
('http', '127.0.0.1', 31476) - INFO [28/Mar/2024:16:39:54 +1300] - Uvicorn running on http://127.0.0.1:31476 (Press CTRL+C to quit)

@pmeier
Copy link
Member

pmeier commented Mar 28, 2024

The important lines are

log_config = uvicorn.config.LOGGING_CONFIG
log_config["formatters"]["access"]["fmt"] = "%(args)s - %(levelname)s [%(asctime)s] %(message)s"
log_config["formatters"]["default"]["fmt"] = "%(args)s - %(levelname)s [%(asctime)s] %(message)s"
log_config["formatters"]["access"]["datefmt"] = "%d/%b/%Y:%H:%M:%S %z"
log_config["formatters"]["default"]["datefmt"] = "%d/%b/%Y:%H:%M:%S %z"
uvicorn.run(
    api_app(
        config=config, ignore_unavailable_components=ignore_unavailable_components
    ),
    host=config.api.hostname,
    port=config.api.port,
    log_config=log_config
)

A couple of questions:

  1. Do we need the %(args)s in there? Isn't that information encoded in the startup message as well as the configuration file used? That seems like quite some noise to log with every message
  2. What is the difference between the "access" and "default" formatters? "access" makes sense to me. Is "default" for all other messages?
  3. I'm not familiar with the common logging format, but it seems the example message above is missing a few parts of the spec. Should we maybe just add the timestamp as originally requested rather than calling this "common logging format"?
  4. I see the benefit of adding the timestamp by default. However, should we make this configurable? Maybe even make the format configurable?

@Tengal-Teemo
Copy link
Contributor

Tengal-Teemo commented Mar 28, 2024

  1. You are correct, args should be removed. It looks terrible and doesn't really add much, the only reason it was there was because I couldn't find the ip anywhere else in the record when I was trying to recreate CLF.
  2. I am not sure what is the difference between the access and default level formatters, I added both because I was matching what several people did online to edit the uvicorn logs.
  3. It's technically only missing bytes, as I believe request and status are replaced by message.

For now I will remove args and just add the timestamp like planned, but I'll see if I can find a way to add the host that doesn't look terrible.

@Tengal-Teemo
Copy link
Contributor

@pmeier if you check the most recent commit, I've implemented the CLF standard more or less properly. I now understand the difference between access and default, one is for the api calls and one is for anything else.
image

@Tengal-Teemo
Copy link
Contributor

The only difference is that we don't really have an ident parameter, so I've left it as "-", and I've added the logging level and replaced the bytes with the http status

@Tengal-Teemo
Copy link
Contributor

@pmeier Are you satisfied with this style of logging? I'm happy to make whatever changes to it.

@pmeier
Copy link
Member

pmeier commented Apr 5, 2024

I know you and @peachkeel are closely connected, but the original feature request was for adding timestamps. And I'm totally aboard with that.

Regarding the CLF, I'm not sure. I'm certainly not very experienced with observability yet, but so far I haven't seen any application use it. That is not to say that it isn't used.

Meaning, I would be more comfortable if we only went with just the timestamps for now and maybe tackle a user defined timestamp and logging format as proposed in 4. of #377 (comment) later. Thoughts?

@peachkeel
Copy link
Contributor Author

Hi, @pmeier. I'm a dinosaur, and I'm the one who suggested (offline) that maybe @Tengal-Teemo go with CLF since it has been recognized as a de facto standard:

https://www.rfc-editor.org/rfc/rfc6872.html

Such standardization brings benefits, especially in terms of reusing existing analytics tooling.

Of course, we don't have the bytes field as found in CLF, but that could be left empty with a "-" demarking it.

Anyway, I would support full customization with CLF as the default.

@pmeier
Copy link
Member

pmeier commented Apr 7, 2024

I'm a dinosaur

That could go at least two routes:

  1. "This is what we used back in my days. Not sure what is happening today"
  2. "Trust me, I've seen enough systems to confirm that this something that is generally used and understood".

From the phrasing of the rest of the comment, I assume you mean something along the lines of 2., but I want to make sure here.

Anyway, I would support full customization with CLF as the default.

Happy to. Let's pivot in that direction.

@pmeier pmeier changed the title Timestamps in Ragna logging output going to stdout Make Ragna API logging format customizable with sensible default Apr 7, 2024
@pmeier
Copy link
Member

pmeier commented Apr 7, 2024

One thing that I want to avoid, is to come up with a customization scheme that does not support common cases. One thing that I have seen a lot in the recent years is structured logging. I just did a quick search and found this blog that seems to suggest that it is not possible to do without actually injecting middleware into the application. @Tengal-Teemo could you make sure that this is the case?

If yes, we should be fine with just exposing the fmt and datefmt from #377 (comment) to the user. Otherwise, this might get more complicated.

@Tengal-Teemo
Copy link
Contributor

Tengal-Teemo commented Apr 8, 2024

@pmeier After looking into it, I believe that it depends what you mean by middleware. If you mean the BaseHTTPMiddleware they use, then no. Technically, you can just create a JsonFormatter as the blog does and attach that to the logging root before you call uvicorn_run(). The extra stuff they add can simply be obtained from the 'args' field of the record, and you could just check the record.name to see if you're looking at a record generated by uvicorn.access or not. However to do something like this yes you do need to override the default formatter used by uvicorn.

@Tengal-Teemo
Copy link
Contributor

If we do want to expose the logs to the user, it might be better to expose a log_config.yaml file that allows you to specify the fmt and datefmt, like below which I amended from this thread. However, this isn't the most workable framework for things like structured logging, as you're pretty much required to work within the formatters used by uvicorn if you use this method.

version: 1
disable_existing_loggers: False
formatters:
  default:
    "()": uvicorn.logging.DefaultFormatter
    format: 'localhost:31477 - %(name)s [%(asctime)s] %(levelprefix)s %(message)s'
    datefmt: '%d/%b/%Y:%H:%M:%S %z'
  access:
    "()": uvicorn.logging.AccessFormatter
    format: '%(client_addr)s - %(name)s [%(asctime)s] %(levelprefix)s %(request_line)s %(status_code)s'
    datefmt: '%d/%b/%Y:%H:%M:%S %z'
handlers:
  default:
    formatter: default
    class: logging.StreamHandler
    stream: ext://sys.stderr
  access:
    formatter: access
    class: logging.StreamHandler
    stream: ext://sys.stdout
loggers:
  uvicorn.error:
    level: INFO
    handlers:
      - default
    propagate: no
  uvicorn.access:
    level: INFO
    handlers:
      - access
    propagate: no

@pmeier pmeier added this to the 0.3.0 milestone Apr 24, 2024
@pmeier
Copy link
Member

pmeier commented Apr 29, 2024

Ok, let's just roll with CLF for now and worry about this later. I prefer to make this configurable by the user, but having a more reasonable by default is higher prio now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement 💅 New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants