Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] remove extra system headers on direct gets #4618

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

aricart
Copy link
Member

@aricart aricart commented Oct 3, 2023

[FIX] direct get APIs can contain duplicate Nats-Stream, Nats-Sequence, Nats-Time-Stamp and Nats-Subject headers because it simply appends JSON bytes to existing headers. If the message was onboarded on a republish, this will contain the above system headers. Since headers are not ordered and can contain multiple values for the same header, this can break KV clients and create ambiguity on the stream that yielded the message.

  • Link to issue, e.g. Resolves #NNN
  • Documentation added (if applicable)
  • Tests added
  • Branch rebased on top of current main (git pull --rebase origin main)
  • Changes squashed to a single commit (described here)
  • Build is green in Travis CI
  • You have certified that the contribution is your original work and that you license the work to the project under the Apache 2 license

Resolves #4573

Changes proposed in this pull request:

  • remove extra Nats-Stream, Nats-Sequence, Nats-Time-Stamp and Nats-Subject that may be present on the stream message when using direct get apis

@aricart aricart requested a review from a team as a code owner October 3, 2023 15:05
server/jetstream_test.go Outdated Show resolved Hide resolved
Copy link
Member

@wallyqs wallyqs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@neilalexander neilalexander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the performance impact on direct get performance due to these?

Should we make one function that removes them all at once?

@aricart
Copy link
Member Author

aricart commented Oct 3, 2023

What is the performance impact on direct get performance due to these?

We would have to scan at least once, but a better question is perhaps we should be doing this on the on-board? The issue is republishing is causing a recapture by the stream. If the stream is guaranteed not to have the header, then no cleanup would be necessary.

@aricart aricart force-pushed the fix-4573 branch 2 times, most recently from 2916bb0 to 36b4e2d Compare October 3, 2023 18:26
@aricart
Copy link
Member Author

aricart commented Oct 3, 2023

@derekcollison discussed with @neilalexander an alternative strategy - remove these headers before onboarding into the stream. That keeps direct as fast as possible and prevents serialization of any Nats- header into the stream saving some space, and removing the need to check/erase/reset these headers downstream.

@aricart
Copy link
Member Author

aricart commented Oct 3, 2023

(and yes, I added one function to remove all in one shot)

@aricart
Copy link
Member Author

aricart commented Oct 3, 2023

Made a change to constrain the actual headers that are removed as some are handled further down during the Store() operation.

@derekcollison derekcollison self-requested a review October 3, 2023 22:37
@@ -4399,6 +4400,11 @@ func (mset *stream) processJetStreamMsg(subject, reply string, hdr, msg []byte,
}
}

// if we are NOT doing subject remapping, this is a message to on-board it shouldn't store Nats-* headers
if tsubj == _EMPTY_ {
hdr = removeStreamIdentityHeaders(hdr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct, but big change since we have always stored the message and headers as we received them. Be good to discuss with @wallyqs and @bruth

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More importantly - this doesn't fix the issue for direct get - it simply solves it for the on-boarding. We would still need to have the initial fix on the outbound to ensure that clients won't get duplicate headers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes since they are stored with headers today..

@aricart aricart marked this pull request as draft October 4, 2023 01:51
… it simply appends JSON bytes to existing headers. If the message was onboarded on a republish, this will contain system headers. Since headers are not ordered and can contain multiple values for the same header, this can break KV clients as well as create ambiguity on the stream that yielded the message.

Fix #4573
@aricart aricart marked this pull request as ready for review October 4, 2023 11:21
@aricart
Copy link
Member Author

aricart commented Oct 4, 2023

@derekcollison this is possibly the correct fix, if we think that we want to reduce the surface area, we can keep the on-boarding of the the stream identity headers, but this is the best we can do. The direct API checks if there's a Nats- entry in the headers, and then attempts the removal.

@derekcollison
Copy link
Member

I am conflicted, I think we need to retain what the app sent us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[JS] direct get apis include old Nats- headers creating duplicates
4 participants