Support including -type value in the data points returned by read #65

Open
dmehra opened this issue Jan 29, 2016 · 3 comments

Comments

@dmehra
Contributor

dmehra commented Jan 29, 2016

When reading from ES with a schema that uses document type, we support specifying read elastic -type 'x,y', but the values x and y are not surfaced in the data points that come back from the read. As a result, the user cannot reduce, sort, or otherwise group by type, or display the type.

Let's remove this limitation by including the value of the document type in the points, by default in a field named type. If the data already contains a field named type, its value will be overwritten. To let the user resolve such name conflicts, support an optional read parameter called something like -typeField <name>, and place the value of the document type into that field.

If -type is not specified as a read parameter, but document type is used in the data, we should still pick up its value and put it into typeField (by default, type).

Note: the default name could also be _type. We have one user vote for calling it type.

Usage example:

read elastic -from start -to end -index 'mine' -type 'tag1,tag2,tag3'

should return points that have a field type with the value tag1, tag2, or tag3.

read elastic -from start -to end -index 'yours' -type 'tag4,tag5' -typeField 'tagged'

should return points that have a field tagged with the value tag4 or tag5, and a field type with its original contents, if present on the points.
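
To make the proposed semantics concrete, here is a minimal Python sketch of how a document type could be copied onto each point. It is not the adapter's implementation: the function name hits_to_points and the type_field argument (standing in for the proposed -typeField) are hypothetical, and it assumes search hits shaped like Elasticsearch responses, with the type under _type and the document body under _source.

```python
from typing import Any, Dict, Iterable, List, Optional


def hits_to_points(
    hits: Iterable[Dict[str, Any]],
    type_field: Optional[str] = "type",
) -> List[Dict[str, Any]]:
    """Flatten Elasticsearch-style hits into points, surfacing the document type.

    Hypothetical helper: `type_field` plays the role of the proposed
    -typeField option. Passing None leaves the points untouched.
    """
    points = []
    for hit in hits:
        # Copy the document body so the original hit is not mutated.
        point = dict(hit.get("_source", {}))
        doc_type = hit.get("_type")
        if type_field is not None and doc_type is not None:
            # Overwrites any same-named field already in the document,
            # as described in the proposal above.
            point[type_field] = doc_type
        points.append(point)
    return points


hits = [
    {"_type": "tag4", "_source": {"value": 1, "type": "original"}},
    {"_type": "tag5", "_source": {"value": 2}},
]

default_points = hits_to_points(hits)
tagged_points = hits_to_points(hits, type_field="tagged")
```

With the default field name, the first point's own type value is overwritten by tag4; with type_field="tagged", the original type value is preserved and the document type lands in tagged instead.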

@demmer
Contributor

demmer commented Feb 1, 2016

I think we should make -typeField optional. The type isn't really a field in the document, so in my mind it doesn't make a lot of sense to add it to the points by default. Also, it can't be filtered or used in aggregations like a normal field would. In cases where users want this behavior, my recommendation would be to also include the type as a normal field in the points. I believe other Elasticsearch query systems like Kibana will have the same restriction (though I'm not sure about that).

> If -type is not specified as a read parameter, but document type is used in the data, we should still pick up its value and put it into typeField (by default, type).

I also don't think this kind of magic is a good idea. While the intent to improve ease of use is admirable, hoisting magic fields from one place in the flowgraph to control options in a read earlier in the flowgraph seems like it's bound to cause confusion, especially if the user happens to have a field named type in the data points themselves.

@dmehra
Contributor Author

dmehra commented Feb 1, 2016

> hoisting magic fields from one place in the flowgraph to control options in a read earlier in the flowgraph

That's not what was meant. "if document type is used in the data" here means that your ES schema is using document type, not that the downstream Juttle is making use of it.

> it can't be filtered or used in aggregations like a normal field would

Why not?

> we should make -typeField optional

We can certainly do that. But if the resulting type field in the juttle stream can't be used for aggregations, that would make it nearly useless...

@davidvgalbraith
Collaborator

Putting this one on ice. @dmehra herself just called it "nearly useless" since we can't optimize it and no user has need of it. Let's stick to recommending users put the type field in the actual data if they want that value.

3 participants