New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial commit for setting up a new component: k8slog receiver #24439
base: main
Are you sure you want to change the base?
Conversation
4858200
to
3e05fa1
Compare
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
remove stale |
3e05fa1
to
ea3a841
Compare
951c26c
to
04d4b40
Compare
) | ||
|
||
// Config is the configuration of a k8s input operator | ||
type Config struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of the configuration options required for the initial implementation? Can you remove them for now and add them along with the functionality using them going forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, starting small will make things easier to review and get started and we can always add them in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
receiver/k8slogreceiver/config.go
Outdated
|
||
// Expr represents the rules to filter containers based on a custom expression. | ||
// TODO: define the values passed to expr. | ||
Expr string `mapstructure:"expr"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not introduce expr
filtering to start. The other filter options you've provided are quite extensive. For expression language filtering I'd prefer we use OTTL.
stability: | ||
development: [logs] | ||
codeowners: | ||
active: [h0cheung] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add me here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will do it soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @TylerHelmuth @h0cheung , I will be more than happy to help reviewing/testing and maintaining this component. In this, feel free to add me into the list as well if that's ok already.
) | ||
|
||
// Config is the configuration of a k8s input operator | ||
type Config struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, starting small will make things easier to review and get started and we can always add them in the future.
04d4b40
to
6eb08af
Compare
82afa36
to
5fde154
Compare
@TylerHelmuth @dmitryax I've simplified the configs and added Tyler Helmuth to the code owner. Could you please have a look at this? |
receiver/k8slogreceiver/config.go
Outdated
// Env represents the rules to extract from container environment variables. | ||
Env []FieldExtractConfig `mapstructure:"env"` | ||
|
||
// OtelEnv represents the rules to extract from environment variables of otel itself. | ||
OtelEnv []FieldExtractConfig `mapstructure:"otel_env"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should allow extracting from env variables, that feels like a security risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OtelEnv
has been removed.
However, Env
is needed in some cases. For example, if there are some environment variables in the docker image, there is no other way to get it.
Is this really risky? resourcedetection
supports it, and this component gets k8s node name from environment variables, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an env var to filter feels ok. But this section is for taking fields from the pod and adding them as resource attributes to the log right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting attributes from environment variables is useful, and many components support it such as resoucedetection
Are there any reports of security risks associated with reading environment variables as strings and adding them to data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The risk is exposing apikey/password information, but that risk is lowered if the config only allows static strings. If we do allow this I think we can do all env vars with 1 config option, no need to split out OTel env vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, the env
config is empty, and no variable will be exposed. When a user set this, he should know what he is doing.
It seems that otel_env
is necessary, so I've removed it in the latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmitryax do you have an opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any updates for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets move forward with this allowed as we've discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also find it useful to have the option to collect ENV_VAR information from the Pods.
That would give us important information about OTEL specific info like OTEL_SERVICE_NAME
.
Having service.name
populated in log records collected by the k8slog receiver is quite important since it will allow us to correlate logs with traces using the service.name
value.
132b260
to
8256f32
Compare
f7a8980
to
a912135
Compare
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@TylerHelmuth fyi is it a good time to review? I'll try my hand at it if possible. |
Two modes of discovery are planned to be supported in the future: | ||
|
||
- `daemonset-file`: Deployed as a DaemonSet, the receiver will read logs from files inside pods in the same node. | ||
- `sidecar`: Deployed as a sidecar container, the receiver will read logs from files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not blocking, but it'd be nice to have issues to track those?
|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| `op` | The operation to perform. Options are: <br> - "equals": (default) the value must be equal to the specified value. <br>- "not-equals": the value must not be equal to the specified value. <br> - "exists": the value must exist. <br> - "not-exists": the value must not exist. <br> - "matches": the value must match the specified regular expression. <br> - "not-matches": the value must not match the specified regular expression. | | ||
| `key` | The key of the map. | | ||
| `value` | The value to match. Only used for "equals", "not-equals", "matches", and "not-matches" operations. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this right away?
Note other components also have built their own filters, such as k8sobjects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filter is similar to the k8sattributes
processor. Many users who want to use this component may be familiar with k8sattributes since it was the previous solution for correlating k8s metadata to logs. This may reduce their migration costs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first PR it looks like this can land. You could cut down a bit on the filters to bring them back later with more testing, but it's ok to mature them in as well.
954c61e
to
60fc7e4
Compare
60fc7e4
to
b85c330
Compare
| `extract` | | The rules to extract metadata from pods and containers. TODO default values. | | ||
| TODO: add fields for reading files similar to filelogreceiver | | ||
|
||
When `discovery.mode` is not `sidecar`, there are additional configuration options: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to mention what functionalities users lose by not being able set these options.
|
||
The first one is caculated by the graph driver, and the second one is from the mount points. | ||
|
||
Also, the host root is mounted under /host-root on the otel container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the otel container need any special permissions in order to be able to share mounts with the incoming target Pods on the fly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It connect to the runtime API.
// NodeFromEnv represents the environment variable which contains the node name. | ||
NodeFromEnv string `mapstructure:"node_from_env"` | ||
|
||
// HostRoot represents the path which is used to mount the host's root filesystem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we need the whole host's root directory to be mounted. Can't we just focus on /var/log/pods
by default for the stdout
mode? I guess many users would have concerns if their whole host directory is mounted by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But the files under /var/log/pods
may be symbol links to paths defined by runtimes, which can also be configured by ops. If they don't mount the whole host's root directory, they must have knowledge about will the files actually live. This may be difficult for users, especially with k8s clusters that mix nodes with different runtimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that be better if we chose a reasonable list of directories like /var/log/pods/*
, /var/lib/docker/*
etc in order to specifically define the possible directories the known runtimes use?
We can even make this list configurable so as advanced users could set their own extra directories if the want.
Mounting the whole host's root directory looks problematic to me and maybe shouldn't be the default.
In GKE Autopilot for example this would not be supported: grafana/loki#9100.
So my suggestion would be to start with reasonable defaults, while also giving the option for more advanced users to extend these defaults.
``` | ||
|
||
The `Source` will get pod list and metadata from k8s api (maybe there will also be an implementation using kubelet). | ||
Then, it will get the logPath and env from docker/cri-containerd api. And they will be associated with the pod metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just by-pass the extra API call to the runtime and construct the log path based on a default pattern which is more or less known? i.e. /var/log/pods/'<namespace>_<pod_name>_<pod_uid>'/<container_name>/*.log
, where namespace
, pod_name
, pod_uid
and container_name
are already known from the k8s API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also works. Datadog is using this way. But:
- The path is different across k8s versions. Maybe it will also change in future. see https://github.com/DataDog/datadog-agent/blob/0d3dcb9fdb96e8a9d1810bb041d2dae6250372f8/pkg/logs/launchers/container/tailerfactory/file.go#L239
- Some runtime API will directly return the actual path, which reduces resolving symbol links.
- If we bypass extra API, it will be complex to get environment variables defined by downward api.
Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
remove stale |
Description:
k8slog receiver provides full functionality to collect logs in k8s containers. It is an all-in-one solution, which supports to:
If we want to collect logs from k8s, we can use this component.
Link to tracking Issue:
#23339
Testing:
Tests for unmarshaling configuration have been added.
Documentation:
Docs for configurations and overall design has been added.