Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#944 testcoverage #998

Merged
merged 16 commits into from
Jan 5, 2022
Merged

#944 testcoverage #998

merged 16 commits into from
Jan 5, 2022

Conversation

richarda23
Copy link
Contributor

@richarda23 richarda23 commented Nov 21, 2021

These are additional tests for biojava-core o.b.n.core.sequence package for issue #944 . A number of tests are annotated 'Disabled' as they suggest some bugs or unexpected behaviour. There are two sorts of issues - the biological behaviour, and the code structure.

These tests are not fully completed, but I'm raising this PR now so as to get feedback/suggestions on what to do. It might just be that I'm not using the classes properly

  1. Biological behaviour
  • the behaviour of coding sequences or exons defined on minus strand seems unexpected. E.g. TranscriptSequenceTest#getCDNASeqNegativeStrand and GeneSequenceTest#addIntronsUsingExonsNegativeStrand. Perhaps I'm using the classes wrongly but I've tried various combinations of start/end/strand behaviour and can't get meaningful results.
  • Some of the documentation is quite woolly and/or hard-to-follow. E.g for ExonComparator and TranscriptSequence#getProteinCDSSequences.
  1. Code behaviour
  • If a class is instantiated correctly from its constructor, I don't think calls to equals and hashcode should throw NPEs. Methods such as removeXXX() fail as removing an item from a collection calls equals() methods. This problem seems to be caused by not calling super constructor; however doing so would add a CompoundNotFoundException to the method signature
  • AccessionID needs to be set to avoid NPEs in several methods; perhaps this should be added to the constructor?
  • various getXXX () methods (e.g. in GeneSequence) return the underlying collection, that can be modified outside the class. Since GeneSequence maintains 2 collections in sync internally ( a Map and a List) , modifying the returned collection breaks the class invariants. Would suggest returning a immutable lists for these methods (since there are add/remove methods provided by the class).

@josemduarte
Copy link
Contributor

Some comments to the code behaviour:

If a class is instantiated correctly from its constructor, I don't think calls to equals and hashcode should throw NPEs. Methods such as removeXXX() fail as removing an item from a collection calls equals() methods. This problem seems to be caused by not calling super constructor; however doing so would add a CompoundNotFoundException to the method signature
AccessionID needs to be set to avoid NPEs in several methods; perhaps this should be added to the constructor?

Indeed the super() call seems to be missing. In order not to change the interface I'd catch the CompoundNotFoundException and rethrow as some sort of unchecked exception (IllegalArgument perhaps?).

In any case, throwing CompoundNotFoundException in AbstractSequence constructor was never very useful in my opinion, but rather inconvenient. In the future I think it'd be a good thing to remove it from the signature.

AccessionID needs to be set to avoid NPEs in several methods; perhaps this should be added to the constructor?

Sounds like a good solution to me

various getXXX () methods (e.g. in GeneSequence) return the underlying collection, that can be modified outside the class. Since GeneSequence maintains 2 collections in sync internally ( a Map and a List) , modifying the returned collection breaks the class invariants. Would suggest returning a immutable lists for these methods (since there are add/remove methods provided by the class).

Sounds good too. Would that be a breaking change?

@richarda23
Copy link
Contributor Author

richarda23 commented Nov 25, 2021

Some comments to the code behaviour:

If a class is instantiated correctly from its constructor, I don't think calls to equals and hashcode should throw NPEs. Methods such as removeXXX() fail as removing an item from a collection calls equals() methods. This problem seems to be caused by not calling super constructor; however doing so would add a CompoundNotFoundException to the method signature
AccessionID needs to be set to avoid NPEs in several methods; perhaps this should be added to the constructor?

Indeed the super() call seems to be missing. In order not to change the interface I'd catch the CompoundNotFoundException and rethrow as some sort of unchecked exception (IllegalArgument perhaps?).

In any case, throwing CompoundNotFoundException in AbstractSequence constructor was never very useful in my opinion, but rather inconvenient. In the future I think it'd be a good thing to remove it from the signature.

Yes, I agree, it's quite annoying to deal with, maybe there can be validateSequence method that asserts a sequence is compliant with a particular compound set, and then we throw IAEs instread. But I will implement your suggestion.

AccessionID needs to be set to avoid NPEs in several methods; perhaps this should be added to the constructor?

Sounds like a good solution to me

I'll add a new constructor to take an accessionID, and deprecate existing constructor . So it will remain compatible

various getXXX () methods (e.g. in GeneSequence) return the underlying collection, that can be modified outside the class. Since GeneSequence maintains 2 collections in sync internally ( a Map and a List) , modifying the returned collection breaks the class invariants. Would suggest returning a immutable lists for these methods (since there are add/remove methods provided by the class).

Sounds good too. Would that be a breaking change?
Hm, if someone is modifying items to the returned array, they would now get an java.lang.UnsupportedOperationException thrown, but the method signature wouldn't change. Alternatively we could return a copy of the list, so adding/removing items to/from the copy wouldn't affect the internal list. That's maybe a better idea. The 'softest' option would just be to document this behaviour and tell clients not to modify returned collections.

@richarda23
Copy link
Contributor Author

that's the 3 sets of changes now. I'll add in a few more tests now those issues are settled.

@richarda23
Copy link
Contributor Author

I've added tests now to cover >=50% method coverage of used classes in o.b.n.core.sequence package - could we move to merging this PR now?

Re the unexpected behaviour reported in the first post to this PR I'd suggest raising these as separate issues.

Copy link
Contributor

@josemduarte josemduarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @richarda23 . LGTM.

I'd agree about merging this in ASAP and making it part of 6.0.4

@josemduarte josemduarte merged commit a16d533 into biojava:master Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants