Requesting a Splunk-specific layout-class for log4j2 #200

Open
UnitedMarsupials opened this issue Sep 24, 2021 · 9 comments

@UnitedMarsupials

UnitedMarsupials commented Sep 24, 2021

By default -- without any layout specified -- the SplunkHttp-appender will simply log the textual message itself, along with a few other standard fields.

To get more details -- and even, optionally, insert additional custom fields -- one would use the JsonLayout. For example:

		<SplunkHttp name="SplunkDetailed"
		    url="https://splunk-hec-app1:8088"
		    index="idx"
		    host="${hostname}"
		    sourcetype="log4j2"
		    source="QA:${nickname}:${alias}"
		    batch_size_count="17"
		    batch_interval="3"
		    disableCertificateValidation="true"
		    token="7d8xxx25b">
			<JsonLayout
			    compact="true"
			    eventEol="true"
			    propertiesAsList="true"
			    includeStacktrace="true"
			    locationInfo="true"
			    objectMessageAsJsonObject="true">
				<KeyValuePair key="applicationInstance" value="QA"/>
			</JsonLayout>
		</SplunkHttp>

This works OK, but results in useless duplication. For example, here is one event logged using the above configuration:

{
  "severity": "WARN",
  "logger": "our.deeply.buried.class.MessageHandler",
  "time": "1632494387.518",
  "thread": "pool-15-thread-2",
  "message": {
    "thread": "pool-15-thread-2",
    "level": "WARN",
    "loggerName": "our.deeply.buried.class.MessageHandler",
    "message": "Our verbose message",
    "endOfBatch": false,
    "loggerFqcn": "org.slf4j.impl.Log4jLoggerAdapter",
    "instant": {
      "epochSecond": 1632494387,
      "nanoOfSecond": 518362000
    },
    "applicationInstance": "QA",
    "contextMap": [],
    "threadId": 154,
    "threadPriority": 5,
    "source": {
      "class": "our.deeply.buried.class.MessageHandler",
      "method": "createImportPreconditionsMessage",
      "file": "MessageHandler.java",
      "line": 91
    }
  }
}

As you can see, there are problems:

  1. message.level is the same thing as severity.
  2. message.thread is a copy of thread.
  3. logger is a copy of message.loggerName.
  4. message.instant duplicates time.
  5. endOfBatch is quite useless, but cannot be suppressed.
  6. message.contextMap is always included even when empty.
  7. The useful additional information is all "hidden" in the message dictionary -- including the actual message.message text. It'd be nicer, if the fields were at the top level: threadPriority instead of message.threadPriority.

One cannot blame the stock JsonLayout class for this redundancy, because it does not know that the calling appender (SplunkHttp) is adding the same information too. Currently, the only way to make the events appear more sensible is to implement special manipulations on the Splunk-server side, based on the sourcetype.

Hence a request for a custom SplunkLayout to format the messages as would make sense for Splunk in particular.
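To make the requested shape concrete, here is a minimal sketch of the flattening described in the list above, written as plain Java over maps rather than as a real log4j2 layout. The class name `SplunkEventFlattener`, the method `flatten`, and the set of keys treated as redundant are all assumptions made for illustration; nothing like this exists in the library today.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of splunk-library-javalogging or log4j2. */
final class SplunkEventFlattener {

    // Nested keys that only repeat a top-level field in the sample event
    // (level, thread, loggerName, instant), plus endOfBatch.
    private static final Set<String> REDUNDANT =
            Set.of("level", "thread", "loggerName", "instant", "endOfBatch");

    /** Returns a new map with the useful nested "message" fields lifted to the top level. */
    static Map<String, Object> flatten(Map<String, Object> event) {
        Map<String, Object> flat = new LinkedHashMap<>();
        // Keep every top-level field except the nested "message" dictionary.
        for (Map.Entry<String, Object> e : event.entrySet()) {
            if (!"message".equals(e.getKey())) {
                flat.put(e.getKey(), e.getValue());
            }
        }
        Object nested = event.get("message");
        if (!(nested instanceof Map)) {
            // Plain-text message: nothing to flatten.
            if (nested != null) {
                flat.put("message", nested);
            }
            return flat;
        }
        for (Map.Entry<?, ?> e : ((Map<?, ?>) nested).entrySet()) {
            String key = String.valueOf(e.getKey());
            Object value = e.getValue();
            if (REDUNDANT.contains(key)) {
                continue; // duplicate of a top-level field
            }
            if (value instanceof Collection && ((Collection<?>) value).isEmpty()) {
                continue; // e.g. an empty contextMap
            }
            // Never overwrite a field the appender already set at the top level.
            flat.putIfAbsent(key, value);
        }
        return flat;
    }
}
```

Applied to the sample event, this would leave severity, logger, time and thread at the top level, lift message.message to message and message.threadPriority to threadPriority, and drop the duplicated and empty fields.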

@bparmar-splunk
Contributor

Hi @UnitedMarsupials,
Thanks for reaching out!

Based on your suggestion, if the resolution of this issue can be at Splunk Server side, then it is not in our scope.
If you are facing any issue within the scope of using this library, then we would like to help you with the resolution.

Please let us know in case of further queries.
We are closing this ticket for now.

Thank you!

@UnitedMarsupials
Author

Based on your suggestion, if the resolution of this issue can be at Splunk Server side, then it is not in our scope.

Unfortunately, this turned out to be untrue. At least, according to the Splunk-administrators at our organization, such massaging of the incoming JSON is not possible.

Please, reopen this ticket and implement the Splunk-specific layout-class as requested -- to both avoid data-duplication and allow for easy addition of any custom-data to each event... Thank you!

@bparmar-splunk
Contributor

Hi @UnitedMarsupials,
Thank you for your reply.

If this issue cannot be resolved on the Splunk server side, then it seems the duplicate entries are created during serialization.
Will you please provide your inputs on the changes which avoids creating duplicate values ?

@UnitedMarsupials
Author

Will you please provide your inputs on the changes which avoids creating duplicate values?

Sorry, I don't understand the question: are you asking me for a patch, or an example of the current problem?
The latter is provided in my original post -- when using the JsonLayout.

For a patch, I'd need to write the new class myself -- from scratch, for there is nothing Splunk-specific in place right now. I might get around to it some day, but I don't have any such code ready...

@mbukowski-splunk

@UnitedMarsupials I think we don't have a full understanding of the problem here, and we are not sure how your initial post relates to the logging library or what kind of fix you expect. A well-described scenario or a shared environment would help.

@UnitedMarsupials
Author

UnitedMarsupials commented Aug 25, 2022

@UnitedMarsupials I think we don't have a full understanding of the problem here

I thought my original submission would make the problem obvious, but I'll try to explain again... I understand that this project seeks to cover all logging API implementations -- maybe you can summon someone with deeper knowledge of log4j2 in particular to help understand what I'm talking about?

When creating the text of each event-message, log4j2 uses the layout class specified in the configuration file. These layouts are many -- and users can implement their own.

It is the choice of layout that determines the format used by each event -- it can be a CSV-row, a single long line, an XML- or a YAML-blob, an SQL insert-statement -- anything. The layout is used to form the message, which the configured appender then sends to a particular destination (be it a file, or a remote SQL-, syslogd-, Splunk- or any other server).

In my original submission, I provided a live example of the log4j2 configuration we use here to send events to a Splunk-server -- using the JsonLayout bundled with the log4j2 distribution (org.apache.logging.log4j.core.layout.JsonLayout). I also provided an example of the data-duplication caused by our using this layout.

Our configuration makes your code -- the appender called SplunkHttp -- take the JSON text formed by the Apache-provided JsonLayout, and stuff it (as a field called message) into a bigger JSON text, which is then actually sent to the Splunk-server.
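The nesting described above can be mimicked in a few lines of plain Java. This is only an illustration of the effect, not the appender's actual code: the class `WrappingDemo` and its `wrap` method are made up for this sketch, and the string handling assumes inputs that need no JSON escaping.

```java
/** Illustration only: mimics the nesting effect, not the library's real implementation. */
final class WrappingDemo {

    /**
     * Embeds the layout's output as the "message" field of an outer JSON object
     * that already carries its own severity and thread fields.
     * Assumes severity and thread contain no characters that need JSON escaping,
     * and that layoutOutput is itself valid JSON.
     */
    static String wrap(String severity, String thread, String layoutOutput) {
        return "{\"severity\":\"" + severity + "\","
             + "\"thread\":\"" + thread + "\","
             + "\"message\":" + layoutOutput + "}";
    }
}
```

Calling `wrap("WARN", "pool-15-thread-2", layoutJson)`, where `layoutJson` is a JSON object that already contains `level` and `thread`, yields an outer object in which both values appear twice: once at the top level and once under `message`.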

My request in this ticket is for a new class com.splunk.log4j.SplunkLayout (or whatever), custom-tailored for events meant for Splunk-servers -- which could be used instead of the Apache-provided JsonLayout in order to avoid (some of) the problems I enumerated 11 months ago.

@bparmar-splunk
Contributor

@UnitedMarsupials,
Thank you for providing the detailed information.
For now, we have modified the parsing logic used when a JSON layout is configured. Please check this commit.
It no longer duplicates the values you encountered earlier. If any layout other than PatternLayout is used, the appender treats the formatted message as the event and does not wrap that message further.

As far as SplunkLayout (a custom class) is concerned, we can create that class, but again we would have to use JSON for formatting event messages along with a few metadata fields.
Please provide your inputs on this.

@pablocoberly

pablocoberly commented Nov 15, 2022

@bparmar-splunk Did this get merged in? It doesn't look like it. I have this exact problem. It would be great to have, as the current behavior sends more data than necessary to Splunk and creates messy, confusing log entries. One thing I did was exclude the MDC:

<includeMDC>false</includeMDC>

This helps a bit, but the nested JSON object (e.g. "message.message") still isn't ideal.

@artem-emelin

artem-emelin commented Nov 22, 2022

I played with Splunk a little, and there is a way to format the event output by implementing a custom com.splunk.logging.EventBodySerializer.
E.g.:

<appender name="SPLUNK" class="com.splunk.logging.HttpEventCollectorLogbackAppender">
            <url>${splunk.http.url}</url>
            <token>${splunk.token}</token>
            <index>${splunk.index-name}</index>
            <eventBodySerializer>com.custom.CustomSplunkEventBodySerializer</eventBodySerializer>
            <layout class="net.logstash.logback.layout.LogstashLayout">
                <includeMdc>true</includeMdc>
                <timestampPattern>yyyy-MM-dd HH:mm:ss.SSS</timestampPattern>
                <timeZone>UTC</timeZone>               
            </layout>
</appender>
