Feature: Realtime Query Logging without Database #1433

Open
starsoccer opened this issue Apr 5, 2024 · 27 comments
Labels: 🔨 enhancement (New feature or request)
Milestone: future

Comments

starsoccer commented Apr 5, 2024

Currently it's possible to push metrics from blocky into Grafana via Prometheus. And while this works fine, it's currently not possible to also have the query log in Grafana without making use of another database such as MySQL or Postgres.

Edit: It would be nice if there was a way to instead push the query log to InfluxDB, or alternatively, expose it somehow so it can be grabbed by Telegraf and pushed into Influx. It would also be nice if all metrics could be pushed into InfluxDB natively or via Telegraf so there isn't a need to run Prometheus and/or Telegraf and/or Grafana.

kwitsch added the 🔨 enhancement (New feature or request) label Apr 7, 2024
kwitsch added this to the future milestone Apr 7, 2024
kwitsch (Collaborator) commented Apr 7, 2024

Sounds reasonable.

I even started implementing it a long time ago but didn't finish it since I stopped using InfluxDB... 🤔

starsoccer (Author) commented

Awesome, happy to test a branch or help however I can.

I run InfluxDB (1.8) for a few other things, so I'd prefer to avoid having to run another DB just for query logging. It would be great to not need to run Prometheus either, but one step at a time 😉

ezekieldas commented

I can test with influxdb2.

kwitsch (Collaborator) commented Apr 19, 2024

Ok, I looked into it again and it seems that for each InfluxDB version a different client is required. O.o

I'm a bit taken aback by this.

The cost vs. benefit of implementing 3 clients for 1 database type is quite unreasonable...

starsoccer (Author) commented

> Ok, I looked into it again and it seems that for each InfluxDB version a different client is required. O.o
>
> I'm a bit taken aback by this.
>
> The cost vs. benefit of implementing 3 clients for 1 database type is quite unreasonable...

That is annoying (re: the different clients). I don't know a huge amount about Prometheus vs Telegraf, but I know blocky currently supports Prometheus; is there no way to expose this information via Prometheus? And on the flip side, what about doing so via Telegraf?

0xERR0R (Owner) commented Apr 19, 2024

Maybe we should externalize the query log into a separate service via gRPC? We could define a streaming contract and provide implementations for Postgres/MariaDB/CSV. People could create their own implementations; since gRPC is platform agnostic, it could even be done in a different language (for example Java).
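Purely as a sketch of what such a contract could boil down to on the Go side: all names below are hypothetical and hand-written for illustration, not an existing blocky API (a real gRPC contract would instead be generated from a .proto file).

// Hypothetical query-log streaming contract, roughly what a generated
// gRPC stream could look like once wrapped in a Go interface.
package querylogcontract

import "time"

// QueryLogEntry mirrors the kind of fields blocky already writes to its query log.
type QueryLogEntry struct {
	Timestamp    time.Time
	ClientIP     string
	ClientNames  []string
	Question     string
	ResponseType string
	Answer       string
	DurationMs   int64
}

// QueryLogSink is what an external logging service would implement; blocky
// would stream entries to it and the sink decides where to store them
// (Postgres, MariaDB, CSV, InfluxDB, ...).
type QueryLogSink interface {
	Write(entry QueryLogEntry) error
	Close() error
}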

ezekieldas commented Apr 19, 2024

Edit: I totally overlooked that the request here is for query logging. Leaving this here regardless in case someone searches for metrics.


This would require use of telegraf, but not necessarily on the same host as blocky. This basically amounts to a scraper (via telegraf) with an output to influx.

https://docs.influxdata.com/influxdb/cloud/write-data/developer-tools/scrape-prometheus-metrics/

https://github.com/influxdata/telegraf/blob/release-1.30/plugins/inputs/prometheus/README.md

# cat prom_blocky.conf
[[inputs.prometheus]]
  urls = ["http://blocky.somewhere.io:9110"]
# sudo -u telegraf telegraf --test --config prom_blocky.conf

2024-04-19T19:29:43Z I! Loading config: prom_blocky.conf
2024-04-19T19:29:43Z I! Starting Telegraf 1.30.1 brought to you by InfluxData the makers of InfluxDB
2024-04-19T19:29:43Z I! Available plugins: 233 inputs, 9 aggregators, 31 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-04-19T19:29:43Z I! Loaded inputs: prometheus
2024-04-19T19:29:43Z I! Loaded aggregators:
2024-04-19T19:29:43Z I! Loaded processors:
2024-04-19T19:29:43Z I! Loaded secretstores:
2024-04-19T19:29:43Z W! Outputs are not used in testing mode!
2024-04-19T19:29:43Z I! Tags enabled: host=sfo001
2024-04-19T19:29:43Z I! [inputs.prometheus] Using the label selector:  and field selector:
> blocky_blacklist_cache,group=ads,host=florax,url=http://blocky.somewhere.io:9110/metrics gauge=183177 1713554984000000000
> blocky_blocking_enabled,host=florax,url=http://blocky.somewhere.io:9110/metrics gauge=1 1713554984000000000
> blocky_build_info,build_time=20240106-205224,host=florax,url=http://blocky.somewhere.io:9110/metrics,version=v0.23 gauge=1 1713554984000000000
[ ... ]

starsoccer (Author) commented

> Maybe we should externalize the query log into a separate service via gRPC? We could define a streaming contract and provide implementations for Postgres/MariaDB/CSV. People could create their own implementations; since gRPC is platform agnostic, it could even be done in a different language (for example Java).

I think doing something like that may be a bit overkill, honestly. From my POV I'd be happy with just a log file that contains all the requests in close to real time. The issue for me is really that I'd like to avoid running a database, but using CSV for output means the data will be stale by at most a day.

starsoccer (Author) commented

> This would require use of telegraf, but not necessarily on the same host as blocky. This basically amounts to a scraper (via telegraf) with an output to influx. [...]

(quoting the full Telegraf config and test output from the comment above)

I don't want to derail this issue too much, but yes, you're correct; this doesn't actually solve the issue of getting query logs into Influx.

Initially I actually did the same thing you mention just to see what it looked like and whether it would work (which it does), but I don't have enough Grafana knowledge to adjust the blocky dashboard to get it working with Telegraf instead of Prometheus.

kwitsch (Collaborator) commented Apr 19, 2024

> Maybe we should externalize the query log into a separate service via gRPC? We could define a streaming contract and provide implementations for Postgres/MariaDB/CSV. People could create their own implementations; since gRPC is platform agnostic, it could even be done in a different language (for example Java).

I actually like the idea. Since the query log is one of the main external interfaces, it would enable alternative storage solutions like InfluxDB without polluting the blocky code itself.

ThinkChaos (Collaborator) commented

Re InfluxDB having one client per version:
I think we can just pick the latest that Grafana supports, since the point is to use them together anyways.
Based on this InfluxDB blog post, that seems to be 3.0 since around Jan 22, 2024.

Also I'm not against adding some kind of RPC/WebSocket for query logging, though I think implementing a new backend outside of blocky might be more work than inside, since you're starting from scratch.

kwitsch (Collaborator) commented Apr 20, 2024

Generating a client from proto files (gRPC) is fairly easy and can be done in most modern languages with the help of auto-generated wrappers. 😉

I would have been interested in InfluxDB if an implementation would have covered the older backends as well.
By only supporting the latest version it's not worth it in my opinion.

A similar behavior (entry TTL and Grafana support) could be achieved with Redis, where we already have a client (that would only need a minor extension to store query logs). 🤷‍♂️
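To illustrate the TTL idea, here is a minimal sketch of writing a query-log entry to Redis with an expiry, assuming the go-redis v9 client; the key scheme and entry shape are made up for illustration and are not blocky's actual Redis integration.

// Minimal sketch: one Redis key per query-log entry, expired by Redis itself.
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Hypothetical entry; field names are illustrative only.
	entry := map[string]any{
		"time":     time.Now().Format(time.RFC3339),
		"client":   "192.0.2.10",
		"question": "example.com.",
		"reason":   "RESOLVED",
	}
	payload, _ := json.Marshal(entry)

	// Expire after 7 days so Redis enforces the log retention (the "entry TTL" mentioned above).
	key := fmt.Sprintf("blocky:querylog:%d", time.Now().UnixNano())
	if err := rdb.Set(ctx, key, payload, 7*24*time.Hour).Err(); err != nil {
		panic(err)
	}
}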

starsoccer (Author) commented

> Generating a client from proto files (gRPC) is fairly easy and can be done in most modern languages with the help of auto-generated wrappers. 😉
>
> I would have been interested in InfluxDB if an implementation would have covered the older backends as well.
> By only supporting the latest version it's not worth it in my opinion.
>
> A similar behavior (entry TTL and Grafana support) could be achieved with Redis, where we already have a client (that would only need a minor extension to store query logs). 🤷‍♂️

+1, I don't think supporting just the latest InfluxDB would be ideal. To reiterate my last comment, I think just writing the query log to a file in near real time would be sufficient, and it's the most flexible option since anyone can parse a file. This could even still be a CSV, just written faster than once a day.

The issue currently is really just that there is no way to get a near-realtime query log without running a database.

ThinkChaos (Collaborator) commented

Alright then I think we should:

  1. Rename/refocus this issue to be about adding a way to stream the query log from another process, and maybe over the network.
    I'll let you do that if you agree.
  2. Defer anything InfluxDB related to when/if we get other requests about a specific version

About the query logging:

I'm not sold on gRPC + protobuf: I think HTTP Server-Sent Events (SSE) + JSON would be easier to use for clients since they're even more ubiquitous, don't require a protocol build step, and, most importantly IMO, we already use them and could make this "just" another service after #1427.
With HTTP that service could be merged/multiplexed on a single listener, which I don't think is possible with gRPC based on the grpc.Server docs (though I didn't look hard, TBH). So we could expose it as a new HTTP route that can be hosted on the same HTTP server as the other HTTP endpoints and benefit from the same certs for HTTPS, or the same reverse proxy, etc. And support having it split out once that's fully implemented.
As proof of the low entry bar, you could even use shell for basic analysis/log tailing:
curl --no-buffer https://blocky/whatever | jq (or nc -U /run/blocky/querylog.sock | jq if we add Unix socket support)

The JSON format could be something very similar to what we use for structured logging in querylog.LoggerWriter, or maybe even the same if we change client_names there to be a native array instead of turning it into a string.
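For illustration only, one event on such a stream might look something like the line below; the field names are assumptions loosely based on the CSV columns discussed in this thread, not the actual querylog.LoggerWriter output.

{"time": "2024-04-20T12:34:56Z", "client_ip": "192.0.2.10", "client_names": ["laptop"], "question": "A (example.com.)", "response_reason": "RESOLVED", "answer": "A (203.0.113.7)", "duration_ms": 12}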

starsoccer changed the title from "Feature: Query Logging to InfluxDB" to "Feature: Realtime Query Logging without Database" Apr 20, 2024
starsoccer (Author) commented

I've updated the original post to now be about having a realtime query log without a database. Apologies for originally tying the request too tightly to Influx.

While having a way to do this over the network would be useful, I still think the lowest-hanging option here is really just a near-real-time CSV file.

Another alternative could be just writing to a remote syslog if we want something that can do it over the network.

kwitsch (Collaborator) commented Apr 20, 2024

> Another alternative could be just writing to a remote syslog if we want something that can do it over the network.

Wouldn't that already be possible if you enable query logging to console and pipe the binary's output to a remote syslog target? 🤔

ThinkChaos (Collaborator) commented

No worries, I think it's better for everyone to bring up multiple possibilities :)

Yeah, I think a Unix socket would be nicer to avoid going through actual storage, especially if you're not trying to save the data.

I think the existing CSV option is actually already near real time: the "one per day" in the docs means the file name is suffixed with the date.
I didn't catch that earlier but I think it's already mostly what you want, though the "one per day" might require logic on the "client" side to detect when the file switches...
That might at least be good enough to give it a try now.

kwitsch (Collaborator) commented Apr 20, 2024

> I'm not sold on gRPC + protobuf: I think HTTP Server-Sent Events (SSE) + JSON would be easier to use for clients since they're even more ubiquitous, don't require a protocol build step, and, most importantly IMO, we already use them and could make this "just" another service after #1427.

I'm not familiar with SSE, to be honest. 🫣
It's OK for me if it's easier to use and implement. 👍

starsoccer (Author) commented

> > Another alternative could be just writing to a remote syslog if we want something that can do it over the network.
>
> Wouldn't that already be possible if you enable query logging to console and pipe the binary's output to a remote syslog target? 🤔

Hmm, not sure. How would I do that exactly in a docker container?

starsoccer (Author) commented Apr 20, 2024

> No worries, I think it's better for everyone to bring up multiple possibilities :)
>
> Yeah, I think a Unix socket would be nicer to avoid going through actual storage, especially if you're not trying to save the data.
>
> I think the existing CSV option is actually already near real time: the "one per day" in the docs means the file name is suffixed with the date.
> I didn't catch that earlier but I think it's already mostly what you want, though the "one per day" might require logic on the "client" side to detect when the file switches...
> That might at least be good enough to give it a try now.

I have no issue with a Unix socket, but I assume that whatever reads that socket would need to be in the same docker container?

Oh yeah, the CSV file confused me. Sounds like it could work then. The question is just how to handle the file name changing. I'll look into that with Telegraf; otherwise maybe we just need an option that simply appends to the same file rather than rotating.

Edit: It does seem that Telegraf has a way to read files with a glob, so it may be possible to just say "read any files in a folder". But I'll have to test and see what happens.

ThinkChaos (Collaborator) commented

> I'm not familiar with SSE, to be honest. 🫣

It's basically just an HTTP request that the server never closes, writing data in chunks. It's an alternative to WebSockets for when you only need events going one way. The nice thing is it's compatible with lots of clients out of the box because the protocol is basic HTTP.
And implementing it on the server side is also pretty easy: write the event data with ResponseWriter.Write, call Flush, and repeat.
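A minimal sketch of that "write, flush, repeat" loop in Go follows; the route, the entries channel, and the event payload are all assumptions for illustration, not existing blocky code.

// Minimal SSE handler: one "data:" frame per query-log entry, flushed immediately.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func queryLogSSE(entries <-chan map[string]any) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")

		for {
			select {
			case <-r.Context().Done():
				return // client disconnected
			case entry := <-entries:
				payload, _ := json.Marshal(entry)
				fmt.Fprintf(w, "data: %s\n\n", payload) // SSE framing
				flusher.Flush()
			}
		}
	}
}

func main() {
	// In a real integration, the resolver would push entries into this channel.
	entries := make(chan map[string]any)
	http.Handle("/api/querylog/stream", queryLogSSE(entries))
	_ = http.ListenAndServe(":4000", nil)
}

Such an endpoint can then be tailed with curl --no-buffer, as suggested earlier in the thread.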

> Hmm, not sure. How would I do that exactly in a docker container?

Docker has options for how to collect logs, and I'm sure there's a way to send the container's logs to something else.
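For example, Docker's syslog logging driver can forward everything the container writes to stdout to a remote syslog target; the address below is a placeholder, and this assumes blocky's query log is set to console output so it ends up in the container logs.

docker run -d \
  --name blocky \
  --log-driver syslog \
  --log-opt syslog-address=udp://192.0.2.10:514 \
  --log-opt tag=blocky \
  spx01/blocky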

> I have no issue with a Unix socket, but I assume that whatever reads that socket would need to be in the same docker container?

Since it's a file, you can put it in a volume that both the host and the container have access to!
You can even mount that same host dir into another container. That can be used, for example, to expose a service via a reverse proxy that connects to the service via Unix socket. I like that kind of pattern because file system permissions are easier than firewalls.

0xERR0R (Owner) commented Apr 20, 2024

IMHO, SSE is good for web-browser-to-server communication and gRPC is more universal. Also, from the performance point of view, gRPC should be better (HTTP/2, multiplexing, binary message format).

ThinkChaos (Collaborator) commented

I don't know if the performance really matters for a query-log stream, but SSE works fine with HTTP/2 and even HTTP/3. You can also multiplex other requests over the same connection if your client supports it.
I think the only potential performance difference between the two would be the binary messages, but even then, JSON compresses pretty well.
Anyways, the main reason I think SSE would be nice for such an API is the simplicity for both client and server.

In the sense of what you were proposing, a gRPC API for blocky to basically have plugins, I'm not necessarily against that and think it makes more sense there, since you'll likely need calls both ways.

kwitsch (Collaborator) commented Apr 20, 2024

In theory we could even let the user configure it by switching between JSON & MessagePack (for example). Both can be serialized from structs pretty easily, and this would enable optional binary compression by sacrificing readability. 🤔

kwitsch (Collaborator) commented Apr 20, 2024

To summarize my point of view: I like gRPC because of its clear structure and two-way communication, but I find the idea tempting to let those logs stream through curl (especially for debugging purposes). 😅

ruben-sprengel commented

Sending the logs & query logs to a remote syslog target would be awesome, so I could ship them e.g. to my Grafana Loki instance and have the logs in a Grafana dashboard as well.
Another option could be a config file option to store the logs in a .log file and expose that path to the local host. Then Promtail could scrape this path and ship the logs to Loki (see the sketch below).

I am running blocky on an ARM host, so the Grafana Loki Docker logging driver plugin is currently not supported (only linux/amd64) for shipping logs directly from a container to Grafana Loki :(
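For the Promtail idea above, a minimal scrape config could look roughly like this; the log path and labels are assumptions and presume blocky writes its query log (e.g. the CSV files) into that directory on the host.

scrape_configs:
  - job_name: blocky-querylog
    static_configs:
      - targets:
          - localhost
        labels:
          job: blocky
          __path__: /var/log/blocky/*.log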

starsoccer (Author) commented

So I tried playing a bit with Telegraf and the CSV in order to get Telegraf to send the CSV data into Influx. And while I can get it to push the data in, I'm struggling with actually being able to query the data and get useful metrics out of it. Hopefully someone here who is a bit more familiar with Telegraf can point me in the right direction, and others can also make use of it.

Currently this is what I have:

[[inputs.file]]
  files = ["/blocky-logs/*.log"]
  data_format = "csv"
  # Not sure if the order of the columns is correct or not
  csv_column_names = ["time", "ip", "clientName", "duration", "responseReason", "question", "responseAnswer"]
  # My influx knowledge isn't great either, so I'm not clear on what I'd want as a tag vs what I'd want as a field
  csv_tag_columns = ["ip", "clientName", "question"]
  csv_timestamp_column = "time"
  csv_timestamp_format = "2006-01-02 15:04:05"
  csv_delimiter = "\t"

[[outputs.influxdb]]
  urls = ["INFLUX URL HERE"]
  database = "DATABASE HERE"
  username = "USERNAME HERE"
  password = "PASSWORD HERE"
