Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(integrations): Add scrubbing sensitive data integration #2422

Merged
merged 8 commits into from Feb 25, 2020

Conversation

ma2gedev
Copy link
Contributor

Add feature scrubbing sensitive data like sanitizeKeys option in Raven.js . And extract the feature as external integration.

Relates #2391

How to use

Pass sanitizeKeys through Scrub option parameter. Then mask event values if event data has keys that match the sanitizeKeys.
The code like the following:

Sentry.init({
  dsn: dsn,
  integrations: [
    new Sentry.Integrations.Scrub({
      sanitizeKeys: ['sensitive', /field/],
    }),
  ],
});

check before submitting

Before submitting a pull request, please take a look at our
Contributing guidelines and verify:

  • If you've added code that should be tested, please add tests.
  • Ensure your code lints and the test suite passes (yarn lint) & (yarn test).

Resources

}

/** JSDoc */
export class Scrub implements Integration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use a noun for the integration name.

Suggested change
export class Scrub implements Integration {
export class Scrubber implements Integration {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback! I will change later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed the changes to use noun. All the yarn build, lint and test are worked on my laptop. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, other integrations use verbs (RewriteFrames, Debug, Dedupe, CaptureConsole - rather than FrameRewriter, Deduper, etc), so Scrub actually fits better in that pattern, though you might consider something like ScrubData or SanitizeData, since other integrations include the object of the verb in the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for clearify naming convention. ScrubData is much more good name I think, therefore I renamed the name at latest commit.

@kamilogorek
Copy link
Contributor

Looks good, however, be aware that normalization is already performed when sending an event. This integration will cause it to be done twice, which may have some small negative performance impact.
Also, are you aware that provide scrubbing functionality on the serverside already?
https://docs.sentry.io/data-management/sensitive-data/#server-side-scrubbing

cc @lobsterkatie

@ma2gedev
Copy link
Contributor Author

@kamilogorek Thank you for your response.

Looks good, however, be aware that normalization is already performed when sending an event. This integration will cause it to be done twice, which may have some small negative performance impact.

OMG! I missed it out. I will check the code and make a change.

Also, are you aware that provide scrubbing functionality on the serverside already?

Yes, I've already reviewed these documents (thanks for providing detailed documentation). Our project processes a lot of personal identifiable information therefore we would like to try to avoid sending these information to the Sentry server whenever possible.

@ma2gedev
Copy link
Contributor Author

Looks good, however, be aware that normalization is already performed when sending an event. This integration will cause it to be done twice, which may have some small negative performance impact.

I found the code to normalize events here:

return this._normalizeEvent(evt, normalizeDepth);

And it's called after all integrations were processed.
My first approach used normalize function to avoid circular references during event scrubbing.
But in the next plan, I am planning to ignore circular references when scrubbing events for avoiding multiple normalization processes.

Is this plan appropriate? If there is another good approach, please advice me.

- then change names from scrub to scrubber
- core calls `normalize` before send an event therefore calling
`normalize` may cause performance impact
- use memoization avoiding circular references
- rename from Scrubber to ScrubData
@ma2gedev
Copy link
Contributor Author

ma2gedev commented Feb 20, 2020

Latest commit 4ea777c got rid of normalize function to avoid performance impact. And the code base rebased with latest master.

@kamilogorek
Copy link
Contributor

LGMT! @lobsterkatie your call :)

Copy link
Member

@lobsterkatie lobsterkatie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - merging!

@lobsterkatie lobsterkatie changed the title feat: Add scrubbing sensitive data integration feat(integrations): Add scrubbing sensitive data integration Feb 25, 2020
@lobsterkatie lobsterkatie merged commit 9154a39 into getsentry:master Feb 25, 2020
@lobsterkatie
Copy link
Member

lobsterkatie commented Feb 25, 2020

(FYI - BrowserStack tests can't run on external PRs, but when I ran Travis on an internal branch containing this code, they passed just fine.)

@ma2gedev
Copy link
Contributor Author

Thank you for your cooperation! I'm looking forward the next release. ❤️

@ma2gedev ma2gedev deleted the feature/scrubbing-event branch February 26, 2020 00:16
@HazAT
Copy link
Member

HazAT commented Feb 26, 2020

Hey @ma2gedev

First of all, thanks for the work that you put into this PR/Integration.
While we think code and functionality-wise this integration is fine, upon further reflection we've decided that we will revert this PR and not ship it in the near future. Let me explain why.

We are currently working on a new product feature called Relay, which will effectively be a proxy running between the SDK and our servers, doing normalization, data scrubbing, and many other things. We will run it in front of our servers, but the plan is to also eventually make it available for folks to run client-side, which will allow us to have a consistent data scrubbing system across all SDKs. Because of this, and because the exact details of Relay's data scrubbing implementation haven't yet been finalized, I'm wary of introducing an integration which might do things differently and set precedents for how we approach data scrubbing in the future. If we add your integration now, we'll have to document it, people may come to rely on it, and all of a sudden releasing Relay becomes a breaking change we need to manage when it otherwise wouldn't have to be.

Where to go from here:

  • You can either use your fork of the repo and use the integration just as it is.
  • You could create an independent repo (e.g.: like https://github.com/getsentry/sentry-rrweb) and ship the integration under your name.

Sorry that I didn't write this earlier. I hope you understand that this is mostly just bad timing, as well as some miscommunication on our end, for which I apologize. In the end, though, we need to think about the whole product and make sure we're providing the best experience for all customers.

Cheers,
Daniel

@ma2gedev
Copy link
Contributor Author

Hi @HazAT

Thank you for explaining the background why PR was rejected.
I'm a little sad, but I'm ok and understand that problem is timing and miscommunication.

I'm not sure the new product detail but the feature that providing across all SDKs is interesting. I'm looking forward to new product feature.
Thank you! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants