
Ability to manage versioning and generation of document IDs in the index separately #665

Open
ermakvlas opened this issue Nov 25, 2022 · 0 comments


I want to use the message key from the topic as the document ID while still using Elasticsearch's internal versioning. This is currently impossible.

Currently the key.ignore property determines how the document ID in the index is derived:

final String id = config.shouldIgnoreKey(record.topic())
    ? String.format("%s+%d+%d", record.topic(), record.kafkaPartition(), record.kafkaOffset())
    : convertKey(record.keySchema(), record.key());

and, at the same time, the versioning type:
if (!config.isDataStream() && !config.shouldIgnoreKey(record.topic())) {
    request.versionType(VersionType.EXTERNAL);
    request.version(record.kafkaOffset());
}

By setting key.ignore=false I opt out of using the offset when generating the ID. At the same time, however, the connector is forced to use external versioning, which always takes the offset as the version number, with no option to change it. This looks illogical; is there a reason for this behavior?
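To make the request concrete, here is a rough sketch of what separating the two decisions could look like. Everything in it is hypothetical: `WritePlan`, `VersionMode`, and the `plan` method do not exist in the connector, the key is assumed to be already converted to a string, and the data stream check from the snippet above is left out for brevity.

```java
import java.util.Optional;

// Hypothetical sketch only: none of these names exist in the connector.
final class WritePlan {
    /** Hypothetical setting that selects the versioning scheme independently of key.ignore. */
    enum VersionMode { INTERNAL, EXTERNAL_OFFSET }

    final String id;
    // Empty means no version is sent, so Elasticsearch uses its internal versioning.
    final Optional<Long> externalVersion;

    private WritePlan(String id, Optional<Long> externalVersion) {
        this.id = id;
        this.externalVersion = externalVersion;
    }

    static WritePlan plan(boolean ignoreKey, VersionMode mode,
                          String topic, int partition, long offset, String key) {
        // The ID decision depends only on key.ignore, as it does today.
        String id = ignoreKey
            ? String.format("%s+%d+%d", topic, partition, offset)
            : key;
        // The versioning decision depends only on the separate (hypothetical) mode.
        Optional<Long> version = mode == VersionMode.EXTERNAL_OFFSET
            ? Optional.of(offset)
            : Optional.empty();
        return new WritePlan(id, version);
    }
}
```

With this split, `plan(false, VersionMode.INTERNAL, ...)` gives the combination I am asking for: the message key as the ID and no offset-based version on the request.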
Forcing the offset to be used for versioning is undesirable when the offsets of the topic the connector reads can be reset. In that situation, new versions of existing documents get a smaller version number (the offset restarts at 0) and the update fails. Updating those documents correctly afterwards requires a complete re-indexing of the data.
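The failure mode can be reproduced with a toy model of the external versioning rule, under which a write is accepted only if its version is strictly greater than the stored one. This is an in-memory illustration, not the Elasticsearch client API; the class name, document ID, and offsets are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of the external-version rule; not the Elasticsearch client API.
final class ExternalVersionDemo {
    private final Map<String, Long> storedVersions = new HashMap<>();

    /** Accepts the write only if the version is strictly greater than the stored one. */
    boolean index(String id, long version) {
        Long current = storedVersions.get(id);
        if (current != null && version <= current) {
            return false; // a real cluster would reject this with a version conflict
        }
        storedVersions.put(id, version);
        return true;
    }

    public static void main(String[] args) {
        ExternalVersionDemo index = new ExternalVersionDemo();
        // Document written before the reset, at offset 1500.
        System.out.println(index.index("order-1", 1500L)); // prints "true"
        // Same key after the topic offsets restart at 0: the update is rejected.
        System.out.println(index.index("order-1", 0L));    // prints "false"
    }
}
```

Every update for that key keeps being rejected until the new offsets climb past 1500, which is why a full re-index is the only practical way out.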
