Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(framework-core): given files old content and new content, compute the valid hunks #86

Merged
merged 4 commits into from Aug 20, 2020

Conversation

TomKristie
Copy link
Contributor

@TomKristie TomKristie commented Aug 19, 2020

Given a set of files with old content and new content, and the set of valid hunks from the remote PR, generate the hunk for each file given from the user and filter out invalid hunks and invalid files.

I created sub directories to try to add some logical grouping to the user-input to hunk/patch handling logic. This way I can distinguish between handling the user's input, and handling GitHub PR data.
It is likely that I will need to do a follow-up pr to move the github and regex logic into a submodule.

Overview of the important methods:
the src/index.ts will invoke the comment method in src/github-handler/comment-handler/index.ts.
comment gets the github pull request scopes by getPullRequestScope, and parses the user's raw changes to patches by getSuggestionPatches, and creates a pr with those patches (TODO).

This PR focuses on getSuggestionPatches:

  1. get the hunks and filter invalid hunks by getValidSuggestionHunks. The two main submethods to filter invalid hunks are filterOutOfScopeFiles and filterOutOfScopeHunks
  2. get the patches - TODO

Towards #59

@TomKristie TomKristie requested review from chingor13, bcoe and a team August 19, 2020 16:02
@TomKristie TomKristie requested a review from a team as a code owner August 19, 2020 16:02
@codecov
Copy link

codecov bot commented Aug 19, 2020

Codecov Report

❗ No coverage uploaded for pull request base (comment-pr@29f5136). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@              Coverage Diff              @@
##             comment-pr      #86   +/-   ##
=============================================
  Coverage              ?   87.88%           
=============================================
  Files                 ?       20           
  Lines                 ?     1874           
  Branches              ?      126           
=============================================
  Hits                  ?     1647           
  Misses                ?      226           
  Partials              ?        1           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29f5136...0111d88. Read the comment docs.

readonly oldStart: number;
readonly oldEnd: number;
readonly newStart: number;
readonly newEnd: number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to include the new content here. It would simplify the algorithm:

GitHub patch -> Hunk[] -> filter based on valid ranges/file size -> convert hunk to comments -> create PR review w/ comments

You could skip having to go back and look up the content.
Maybe this is deferred to a later PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this object when I am getting the hunks. The hunk output from the diff library doesn't produce raw text. It produces a patch text with "+" and "-" prefixing certain lines and also the comment "no new line and end of file" if there is no newline and the end of the file. Therefore I'd need to alter the patch text at this step. So either way I'd need to do a raw text lookup to avoid finicky string manipulation. Therefore I have 2 choices: get and store raw text before I filter the hunks or after. If I do it after the filtration, I save on memory, which could be valuable if some strings are particularly large.

Comment on lines +28 to +29
rawContent: RawContent,
fileName: string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could RawContent also track its fileName? It simplifies this method conceptually that we're converting from one RawContent change to a list of Hunks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the user's input is Map<file name, RawContent> and we'd also like to keep it as a map so that we can look up the ranges of what text to apply afterwards (since the diff library text output is in patch format and modifies the original text). Therefore I could map each raw content to a RawContent + file name object. Yup it's doable, but is that object store format a better decision?

Comment on lines +48 to +51
function getValidSuggestionHunks(
scope: Range[],
suggestedHunks: Hunk[]
): {inScopeHunks: Hunk[]; outOfScopeHunks: Hunk[]} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both of these are already sorted, we should be able to do this in O(n + m) time rather than O(n log m) or O(m log n).

Depending on the expected sizes of the inputs, we might want to switch.

This is non-blocking for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be really interesting to tackle as there'd probably need to be a lot of testing to see what the average case is for users.
#87

@TomKristie TomKristie merged commit bce5ef5 into googleapis:comment-pr Aug 20, 2020
chingor13 added a commit that referenced this pull request Sep 25, 2020
…n pull requests (#105)

* feat(patch text to hunk bounds): support regex for patch texts (#83)

* fix(patch text to hunk bounds): support regex for patch texts

* more comments and more tests

* fix(framework-core): core-library get remote patch ranges (#84)

* fix(framework-core): given files old content and new content, compute the valid hunks (#86)

* fix(framework-core): parse raw changes to ranges

* refactor(framework-core): rename modules, functions, & re-org project structure (#89)

* fix(framework-core): hunk to patch object (#91)

* feat: build failure message from invalid hunks (#90)

* test: add failing stub and test for building the failure message

* fix: implement message building

* fix: use original line numbers in error message

* docs: add docstring

* docs: add note about empty input returning empty string

* feat(framework-core): comment on prs given suggestions (#93)

* feat(framework-core): main interface for create review on a pull request (#114)

* feat(framework-core): main interface for create review on a pull request

* docs: fix typo

* nits and typos...

* gts lint warning fix

* fix(framework-core): combine review comments (#116)

* fix(framework-core): collapsing timeline and inline comments into single review

* test: fixed imports

* added case when there are out of scope suggestions and no valid suggestions

* feat(framework-core): return review number and variable renaming (#117)

* feat(framework-core): return review number and variable renaming

* lint

Co-authored-by: Jeff Ching <chingor@google.com>
Co-authored-by: Justin Beckwith <justin.beckwith@gmail.com>
Co-authored-by: Benjamin E. Coe <bencoe@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants