mapping-normalization-example

This repository discusses and demonstrates an approach to normalizing document fields that is based on Elasticsearch mapping and analysis. The example used throughout is the log level field.

Introduction

When collecting logs from a distributed system and indexing them into a central search engine (like Elasticsearch), it is important and useful to deploy a data model (such as ViaQ/elasticsearch-templates). In the context of logging, one of the most important document fields is the log level field. Every log record has it. The challenge is that every system that produces logs can use different log level categories.
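To make the challenge concrete, here is a minimal sketch of what normalizing log levels to a common scale means. The level names, the aliases, and the `normalize_level` helper are illustrative assumptions; they are not taken from ViaQ/elasticsearch-templates or from this repository.

```python
# Hypothetical common scale and the source-specific aliases that map onto it.
# Both sides of this table are assumptions made for illustration only.
COMMON_LEVELS = {
    "debug": {"debug", "d", "fine"},
    "info": {"info", "information", "i"},
    "warning": {"warning", "warn", "w"},
    "error": {"error", "err", "e", "severe"},
}

# Invert the table once so that each alias points at its common level.
ALIAS_TO_LEVEL = {
    alias: level
    for level, aliases in COMMON_LEVELS.items()
    for alias in aliases
}


def normalize_level(raw: str, default: str = "unknown") -> str:
    """Return the common level for a raw value, or `default` if no rule matches."""
    return ALIAS_TO_LEVEL.get(raw.strip().lower(), default)
```

Returning an explicit `unknown` value instead of raising makes missing rules visible in the indexed data, which ties in with the motivation of easily identifying missing rules.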

Assume logs are collected by lightweight log collectors that ship the logs either to Elasticsearch directly ...

 +-----------------+     +-----------------+
 |  Log collector  |     |  Log collector  |
 +-----------------+     +-----------------+
          |                       |
          |                       |
          |                       |
          |  +-----------------+  |
          |  |                 |  |
          +->|  Elasticsearch  |<-+
             |                 |
             +-----------------+

... or they ship logs to one or more log aggregators first, and the logs are then sent to Elasticsearch.

 +-----------------+     +-----------------+
 |  Log collector  |     |  Log collector  |
 +-----------------+     +-----------------+
          |                       |
          |                       |         
          |  +-----------------+  | 
          +->| Logs aggregator |<-+
             +-----------------+                  
                      |
                      V
             +-----------------+
             |                 |
             |  Elasticsearch  |
             |                 |
             +-----------------+

The question is: where should the log level categories be normalized to the common scale defined by the data model? Basically, there are two options:

One option is to handle log level normalization in every log collector or in the log aggregator.
The other option is to handle log level normalization in Elasticsearch during indexing.

The following text investigates the latter option only.
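As a rough sketch of what normalization during indexing could look like, the snippet below builds index settings in which a custom analyzer lowercases the whole field value and rewrites it with a synonym token filter. This is one possible approach under stated assumptions, not necessarily what `level.template.json` in this repository contains. The names `level_synonyms`, `level_normalizer`, `build_level_template`, the field name `level`, and the rules themselves are hypothetical, and the exact mapping layout depends on the Elasticsearch version.

```python
import json


def build_level_template(synonym_rules):
    """Build a hypothetical settings/mappings body for a normalized level field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Synonym token filter; rules use the Solr "a, b => c" format.
                    "level_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    # The keyword tokenizer keeps the whole value as a single token.
                    "level_normalizer": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase", "level_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                # Assumed field name; typeless mappings as in recent versions.
                "level": {"type": "text", "analyzer": "level_normalizer"}
            }
        },
    }


# Illustrative rules only; a real data model would define its own scale.
rules = ["warn, w => warning", "err, e, severe => error"]
template_body = build_level_template(rules)
print(json.dumps(template_body, indent=2))
```

Note that analysis changes only the indexed terms. The original value in `_source` stays as it was sent, so returning the normalized value with a document needs a separate step.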

Motivation

The main motivations for investigating the latter option are:

implement data normalizations as part of the data model
easy identification of missing rules or unexpected data transformation results
if the data model changes or the data transformations need to be updated, it should be easier to deploy new code and recalculate historical data that is already stored in a central place

Goals

provide implementation guidelines and examples
investigate pros and cons with respect to:
- side-effects
- performance impact
- vendor lock-in

Content

Set up Elasticsearch mapping
Index and search sample data
Get documents including normalized values


lukas-vlcek/mapping-normalization-example