Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modernize codebase and remove technical debt #756

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ElliotSwart
Copy link

Introduction

This is a pull request meant to remove significant technical debt and barriers to entry from the TLA+ / TLC project. This was done for two primary reasons:

  • To make TLC easier to extend and develop for (difficulties adding functionality to TLC were what initiated this project)
  • To allow TLC to be building block in a larger system without introducing technical risk

Working remaining

Note that it is not ready to merge instantly for the following reasons:

  • The TLAToolbox needs to be upgraded to OpenJDK 17
  • The CI/CD publishing workflow needs to be adjusted a bit to account for differences in the build process

While this can remain a fork and a dependency of my next project, hopefully the improvements provided here prove worth the effort to merge in.

Limitations of this pull request

  • Ideally each improvement would have been a separate pull request, however the improvements needed to be made iteratively and experimentally.
    A code formatter and re-organizer was run on the codebase as the indentation was inconsistent and classes had been built up over time without consolidation. This means integrating upstream changes will be a more manual process.

  • Java Pathfinder tests could not be run, as even it's latest versions still target Java 8. Furthermore, efforts to port it to a more modern version of java run into significant challenges.

Improved Codebase Standardization for Developers

  • Moved from Ant to Maven to allow easier builds and upgrades
  • Removed manual dependency management and converted it entirely to standard Maven.
  • Integrated code base with VSCode and IntelliJ, while maintaining Eclipse compatibility
  • Reduced test suite execution from 10min+ -> 2min.
  • Removed all known memory leaks. Prior the vast majority of allocated non-TLCState/FPSet memory was retained until process exit.
  • Allow tests to be run and debugged using IDE test runners
  • Place build outputs in standard target directory
  • Distinguish tests using JUnit categories, not just filepaths
  • Allow incremental compilation for Maven
  • Allow using diagnostic TLCState without starting the debugger, to facilitate easier testing.
  • Improved ClassLoading of TLA+ modules to be less brittle.

Additional Documentation

  • The base README.md now serves as a getting started page for developers, as well as a reference for all standard commands. It also contains an index into the other documentation.
  • docs/CodebaseOverview.md has been added with an architectural walkthrough, coding standards and a semi-comprehensive discussion of codebase idiosyncrasies

Improved Invariants for Debugging

  • Ensures the entire AST is final before Tool.java is invoked
  • Heap dumps can be usefully inspected to detect introduction of memory leaks (due to removal of memory leaks)

Bulk removal of static state

  • Dramatically decreased reliance on static mutable state allowing serial reusable tool execution. This was the prime reason for the test suite execution improvement
  • Two tests that are not serially reusable are marked as such. Further analysis is likely to result in further improvement.

Standardize Data Structures

Replaced home built data structures with standard equivalents

Code was rewritten to use standard collection classes rather than depending on non-standard implementations, reducing the learning curve and code size, and increasing reliability.

  • tla2sany.utilities.Vector -> ArrayList
  • tla2sany.utilities.Stack -> java.util.ArrayDeque
  • tlc2.util.Vect -> ArrayList
  • tlc2.util.Set -> java.util.Set
  • tlc2.util.MemObjectQueue -> java.util.ArrayDeque
  • tlc2.util.ObjectStack -> java.util.ArrayDeque
  • tlc2.util.BitVector -> java.util.BitSet
  • tlc2.util.List -> java.util.LinkedList

Converted to extend standard data-structure

Some data structures with special properties were reimplemented by extending standard collection classes.

  • tlc2.tool.StateVec extends ArrayList
  • tlc.tool.liveness.TBGraph extends ArrayList
  • tlc.tool.liveness.TBPar extends ArrayList
  • tlc.tool.liveness.TBParVec extends ArrayList

Converted to implement standard interfaces

Some existing data structure extended to implement standard interfaces.

  • tlc2.util.SetOfStates implements Set
  • tlc2.tool.ContextEnumerator implements Iterator
  • tlc.util.LongObjTable is Generic and so can be strongly typed

Improved General Codebase Quality

Metrics

SonarQube:

  • Bugs (Really just warnings): 466 -> 302
  • Security Hotspots: 81 -> 64
  • Code Smells: 14k -> 7.5k
  • 5% reduction in code size / complexity

Compiler:
Warnings: 1000 -> 4

Details

This is a partial list of general quality improvements. Many of them are not reflected in metrics.
Suggestions from static analysis tools (SonarQube, IntelliJ) that were feasible were implemented.

  • Removed unnecessary deprecated feature usage: Remaining deprecated features detailed in docs/CodebaseOverview.md.
  • Upgraded to more modern java language features when appropriate
    • Textblocks
    • Enhanced for loops
    • Enhanced switch statements
    • Pattern matching if statements
    • Autocloseable interface added when appropriate and used with resource-try block whenever possible
    • Replaced synchronized data structures with more modern unsynchronized ones where appropriate
  • Replaced string concatenation with StringBuilders
  • Fixed misc bugs not originally exercised by test suite
  • Removed dead code
  • Replaced statements with simplified and more semantic statements
  • Removed redundant modifiers from methods
  • Added final modifier whenever appropriate.
  • Strongly type generics
  • Removed unused imports
  • Simplified code by removing unnecessary casts
  • Removed duplicate exceptions
  • Consolidated switch blocks
  • Improved code formatting and organization
  • Removed duplicate semi-colons
  • Fixed unnecessary unboxing
  • Some JavaDoc fixes
  • Simplified logic
  • Used more semantic test assertions

Recommendations

  • see Codebase Overview: docs/CodebaseOverview.md

- Moved from Ant to Maven to allow easier builds and upgrades
- Removed manual dependency management and converted it entirely to standard Maven.
- Integrated code base with VSCode and IntelliJ, while maintaining Eclipse compatibility
- Reduced test suite execution from 10min+ -> 2min.
- Removed all known memory leaks. Prior the vast majority of allocated non-TLCState/FPSet memory was retained until process exit.
- Allow tests to be run and debugged using IDE test runners
- Place build outputs in standard target directory
- Distinguish tests using JUnit categories, not just filepaths
- Allow incremental compilation for Maven
- Allow using diagnostic TLCState without starting the debugger, to facilitate easier testing.
- Improved ClassLoading of TLA+ modules to be less brittle.

- The base README.md now serves as a getting started page for developers, as well as a reference for all standard commands. It also contains an index into the other documentation.
- docs/CodebaseOverview.md has been added with an architectural walkthrough, coding standards and a semi-comprehensive discussion of codebase idiosyncrasies

- Ensures the entire AST is final before Tool.java is invoked
- Heap dumps can be usefully inspected to detect introduction of memory leaks (due to removal of memory leaks)

- Dramatically decreased reliance on static mutable state allowing serial reusable tool execution.  This was the prime reason for the test suite execution improvement
- Two tests that are not serially reusable are marked as such. Further analysis is likely to result in further improvement.

Code was rewritten to use standard collection classes rather than depending on non-standard implementations, reducing the learning curve and code size, and increasing reliability.

- tla2sany.utilities.Vector -> ArrayList
- tla2sany.utilities.Stack -> java.util.ArrayDeque
- tlc2.util.Vect -> ArrayList
- tlc2.util.Set -> java.util.Set
- tlc2.util.MemObjectQueue -> java.util.ArrayDeque
- tlc2.util.ObjectStack -> java.util.ArrayDeque
- tlc2.util.BitVector -> java.util.BitSet
- tlc2.util.List -> java.util.LinkedList

Some data structures with special properties were reimplemented by extending standard collection classes.

- tlc2.tool.StateVec extends ArrayList
- tlc.tool.liveness.TBGraph extends ArrayList
- tlc.tool.liveness.TBPar extends ArrayList
- tlc.tool.liveness.TBParVec extends ArrayList

Some existing data structure extended to implement standard interfaces.

- tlc2.util.SetOfStates implements Set<TLCState>
- tlc2.tool.ContextEnumerator implements Iterator<Context>
- tlc.util.LongObjTable is Generic and so can be strongly typed

SonarQube:
- Bugs (Really just warnings): 466 -> 302
- Security Hotspots: 81 -> 64
- Code Smells: 14k -> 7.5k
- 5% reduction in code size / complexity

Compiler:
Warnings: 1000 -> 4

This is a partial list of general quality improvements. Many of them are not reflected in metrics.
Suggestions from static analysis tools (SonarQube, IntelliJ) that were feasible were implemented.
- Removed unnecessary deprecated feature usage:  Remaining deprecated features detailed in docs/CodebaseOverview.md.
- Upgraded to more modern java language features when appropriate
    - Textblocks
    - Enhanced for loops
    - Enhanced switch statements
    - Pattern matching if statements
    - Autocloseable interface added when appropriate and used with resource-try block whenever possible
    - Replaced synchronized data structures with more modern unsynchronized ones where appropriate
- Replaced string concatenation with StringBuilders
- Fixed misc bugs not originally exercised by test suite
- Removed dead code
- Replaced statements with simplified and more semantic statements
- Removed redundant modifiers from methods
- Added final modifier whenever appropriate.
- Strongly type generics
- Removed unused imports
- Simplified code by removing unnecessary casts
- Removed duplicate exceptions
- Consolidated switch blocks
- Improved code formatting and organization
- Removed duplicate semi-colons
- Fixed unnecessary unboxing
- Some JavaDoc fixes
- Simplified logic
- Used more semantic test assertions
@ElliotSwart ElliotSwart changed the title Refactor TLA+ Modernize codebase and removal technical debt Aug 29, 2022
@ElliotSwart
Copy link
Author

@lemmy my apologies for the giant request. Wanted to make sure the full test suite would pass / there were sufficient improvements to justify the additional integration work required.

@ElliotSwart ElliotSwart changed the title Modernize codebase and removal technical debt Modernize codebase and remove technical debt Aug 29, 2022
@Calvin-L
Copy link
Collaborator

Hello, and thank you for the herculean effort you've put into this! There is no question that TLA+'s aging code is a hurdle for new contributors, and that there are many ways in which we can improve it. These changes would be a big step in the right direction.

That said, I don't think we can accept this PR as-is. :( There are simply too many changes with too many different goals. If we are ever to review all of it, the changes must be broken up into more manageable units.

Rather than respond to the myriad of specific bullet points above, I'm going to suggest a general path forward:

  1. We need to agree on what's important. Historically, our projects wishlist has not included any code quality improvements, and if you are going to help us make some, I want the work to be well-justified. Ideally we would codify our priorities in that wishlist, and you would be addressing things we have already spent time thinking about. Deciding on what's important is Lame and Boring because it does not involve any actual programming, and it will probably take a long time and involve many rounds of discussion, but if we don't do it I am worried that we will waste time making and reviewing risky changes that ultimately aren't impactful.
  2. Based on what we come up with in step 1, the important changes here can be pulled out and applied in small PRs. I hope that this will be easy, since you have already put together this reference for what the code might look like. :)

Here are some tentative thoughts about point 1---including both changes we might want to prioritize and changes we might want to explicitly refrain from. @lemmy likely has his own opinions about these as well.

Improved test coverage - YES

You wrote: "Fixed misc bugs not originally exercised by test suite".

I would like to know what those bugs were, and get your tests and fixes merged quickly. Bugfixes have always been a high priority for TLA+.

As you likely discovered, the TLA+ test suite has very poor coverage. I think we should make make improved test coverage an explicit priority. This is a very low-risk activity that virtually any developer can help with.

Internal documentation - YES

I am incredibly happy to see that you wrote a document about the state of the code, including idiosyncracies you found. Regardless of the fate of the other changes here, I think this should be merged soon. Our contribution documents do not mention non-programming ways to contribute the project, like writing documentation, and I think they should.

Build system improvements - YES

I am broadly in favor of this. Keeping libraries in the TLA+ source tree seems much more difficult to maintain than declaring dependencies in a build file, and right now the nonstandard Ant/Maven combination makes it harder to set up the project in IDEs like IntelliJ which do not understand our nonstandard layout.

Upgrading to JDK 17 - NO

Markus and I both think this is a bad idea at the moment. It's hard to predict the effect of this on downstream users of TLA+; potentially all of them would have to make some change to their systems. While I'm excited about the new language features like pattern-matching if, I think we are safer sticking with JDK 11 as long as we can.

Code formatting improvements - NO (er, not all at once)

TLA+'s weird indentation is certainly distracting, but I don't feel good about a monolithic project-wide reformatting commit. I would rather see this addressed incrementally... Perhaps we should make an effort to correct the indentation of any lines we change as part of ordinary features and bugfixes. I don't think the tabs and spaces in old and stable code are doing any harm.

Removal of global mutable variables - YES, very slowly

TLC has always been intended to run as an isolated process, not as a library. This is a deliberate choice and a strength: your operating system provides safe memory isolation in case TLC runs out of memory and crashes. In that light, it's not obvious that the "memory leaks" you found and fixed really needed fixing.

This is also a somewhat dangerous endeavor; Markus says:

Simon Zambrowski spent more than a month trying to remove all static initialization in ~2009. At some point, Leslie decided not to waste more time on it because they could not find and refactor all places to correctly re-initialize TLC; for some specs, TLC became unsound and incomplete. We've been cautioning everybody who has tried to tackle this problem since.

That said, the improvement you got to test suite execution time is very real and, in my opinion, provides a very strong argument in favor of trying again. I run the tests as part of every change and I would benefit from execution time reduction. Improving their speed could also make new contributors more likely to run the tests frequently as part of their development workflow.

I do not think we should try fixing this all-at-once, but I do think we should make an effort to incrementally remove instances of this pattern where it is obviously safe to do so, with careful testing and review.

Replacing home-built data structures - YES, slowly

This is something others have attempted as well (e.g. #328), but I think previous attempts (and yours) are far to aggressive and carry a big risk of introducing subtle bugs. See e.g. this insightful comment about a difference between tla2sany.utilities.Vector and java.util.ArrayList.

The correct way to go about this, in my opinion, is something along the lines of:

  1. Pick one data structure to replace (e.g. tla2sany.utilities.Vector).
  2. Write many tests for the existing implementation. Use the existing code to discover weird behaviors (like the contains() behavior above), and cover as many of them as you can find.
  3. Make the existing implementation extend its java.util equivalent, and change the implementation to delegate to super, ensuring that all the tests still pass.
  4. Ensure that the changes do not impact performance.
  5. Optionally, replace usages of the custom type (e.g. tla2sany.utilities.Vector) with uses of the now-more-general type (e.g. java.util.List).

That said, I am in favor of this, for more reasons than even the ones you gave. The java standard library is less likely to contain weird corner cases than our own code, and future JVM optimizations are more likely to improve the performance of the standard collections than the performance of our custom ones.

Removing use of of soon-to-be-deprecated libraries -- YES, low-priority

As part of your migration to JDK 17, I bet you fixed e.g. a bunch of calls to new Integer(...), which is officially deprecated in recent JDKs. I approve of this kind of change and similar future-proofing, but it is something we can afford to do incrementally over years, not an urgent thing we must do tomorrow.

@ElliotSwart
Copy link
Author

Totally understand. I will use this fork for my efforts (which require TLC to behave more as a library, and less as a process), and of course feel free to merge whatever pieces of the documentation and pom file changes you deem relevant and/or use this as reference. I also ran the code formatter on the last day of the refactor, so I can provide an unformatted branch to diff against (to identify removed static state, etc).

The process of essentially removing the static state required extensive codebase modifications in of itself. As for the "memory leaks", removing them really just helps when debugging a heap dump, and of course running the tests on the same jvm. Overall the process was like peeling an onion with many layers. There are many hidden assumptions, race conditions, circular dependencies that all interact with each other. It may be possible to do incrementally, though I would imagine the engineering effort required to do this may be prohibitive. Perhaps referencing the unformatted version of this branch may shave some time off the effort.

Sadly most of the benefits really are "all or nothing". Without removing virtually all of the static state and allowing the garbage collector to collect prior TLC information, running the tests in the same fork isn't possible. Frankly, for 95% of the effort, I was not sure if it was possible to actually get it to work, which is why I implemented every code-quality fix at my disposal to try to aid the effort (and the repeated static analysis -> fix cycle did prove essential in tracking down the issues). It went nearly instantly from nothing working serially on the same jvm to almost everything working once every piece of state had been eliminated or controlled for. Maven's surefire test runner is noticeably slower than ant's, especially when many forks are involved (I believe it has a much more involved wrapper function), so while the maven standardization is essentially independent of the code refactoring (except for the JUnit categorization), you may find it not worth the effort.

I will see if I can find some of the misc bugs, though since they are not tested by the test suite I suspect many would fall under: "behavior does not match naming or description of the function". From a strict change management perspective I'm not sure they would count as "bugs" as such. Fixing them generally causes cascading failures. To give some examples:

  • EMPTY state / action variables generally get modified during TLC runs, and aspects of that behavior are relied upon
  • There is a reliance on the exact array copy mechanics of the grow methods in the usage of some proprietary data structures.
  • close functions that didn't close the resources / close functions used improperly (which it sounds like isn't a concern)

I guess my only question is about the concern of making specifications unsound. Certainly I observed that unsoundness in the test suite until (seemingly) every static variable was accounted for. Is there any corpus of TLA+ specifications that would ideally be run against the program as another "long" test suite? In your normal change management process are there additional verification steps you would run / usage feedback you would solicit?

@Calvin-L
Copy link
Collaborator

In that case, would you be willing to help us pull out the relevant parts of this PR? :) I will do some thinking and propose some updates to our projects wishlist.

I can provide an unformatted branch to diff against

That would be a great start---although I'm worried it will still be a lot to sift through.

behavior does not match naming or description of the function
EMPTY state / action variables generally get modified
reliance on the exact array copy mechanics
close functions that didn't close the resources

I see. We should probably still review any weird behaviors you found. They might be benign if the the calling code is carefully structured---but with careful analysis we might find that we can "reverse-engineer" inputs that turn these into user-facing errors.

Is there any corpus of TLA+ specifications that would ideally be run against the program as another "long" test suite?

Markus will know more about this than me. Because running TLC multiple times in the same JVM has not been a priority, I am guessing we don't have any tests to check that functionality, and we would have to write something new.

@ElliotSwart
Copy link
Author

ElliotSwart commented Aug 31, 2022

Hi Calvin,

Yeah, tbh after having done it I think removing the state using a configuration management (and not test driven) process would be, if not impossible, likely prohibitive.

if you think knowing about those weird behaviors would help you, totally happy to point them out!

Multiple users modify the empty state rather than copying it. IsEmpty checks the reference to empty, not that the state is empty.

public static TLCState Empty = null;

This condition modifies the Empty action.

Both StateVec and LongVec have a non-standard remove that shifts the location of the last element. They rely on each other removing elements in the same way. It is more efficient, so I kept the behavior, just labeled clearly.

Vect concat creates a new reference to a vector, hiding later modification to the vector

public final Vect<E> concat(Vect<E> elems) {

A reliance on clone is covered up here:

final Vect<Action> initAndNext = getInitStateSpec().concat(actionVec);

Also missing here:

Vect<Action> init = this.getInitStateSpec();

While I didn't track down the exact issues hidden by ensure capacity and grow implementations, additional .clone() methods were needed to fix this issue

public final void ensureCapacity(int minCapacity) {

SecurityManager is terminally deprecated, and soon users with new JVMs will not be able to run the test suite. This might be ok.

@ElliotSwart
Copy link
Author

ElliotSwart commented Aug 31, 2022

I'll see if I can find more examples. I systematically found unclosed resources with static analysis. Most of them just didn't close in a finally (or with a resource try like I converted them to), but some were missing close statements. Here is a partial list of files with improperly closed resources:
tlatools/org.lamport.tlatools/src/model/InJarFilenameToStream.java
tlatools/org.lamport.tlatools/src/tla2tex/TeX.java
tlatools/org.lamport.tlatools/src/tlc2/REPL.java
tlatools/org.lamport.tlatools/src/tlc2/tool/TLCTrace.java
tlatools/org.lamport.tlatools/src/tlc2/tool/Worker.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/DiskFPSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/MemFPSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/MemFPSet1.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/dfid/MemFPIntSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/liveness/AbstractDiskGraph.java
tlatools/org.lamport.tlatools/src/tlc2/tool/liveness/LiveWorker.java
tlatools/org.lamport.tlatools/src/tlc2/tool/queue/DiskByteArrayQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/BufferedRandomAccessFile.java
tlatools/org.lamport.tlatools/src/tlc2/util/DiskObjectStack.java
tlatools/org.lamport.tlatools/src/tlc2/util/LongVec.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemIntQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemObjectQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemObjectStack.java
tlatools/org.lamport.tlatools/src/tlc2/util/SynchronousDiskIntStack.java
tlatools/org.lamport.tlatools/src/util/InternTable.java
tlatools/org.lamport.tlatools/src/util/MonolithSpecExtractor.java
tlatools/org.lamport.tlatools/src/util/SimpleFilenameToStream.java

JMX resources are not unregistered in close methods, just partially closed

@ElliotSwart
Copy link
Author

GetInstallationBaseDir can also find the test class dir of the same name, which can be relavent with testing distribution

final URL url = cl.getResource("tla2sany");

@ElliotSwart
Copy link
Author

This test value should be 1, but due to irregularities in the BitVector implementation must be set to two.

assertEquals(2, bvs[0].trueCnt()); // fingerprint is unknown after fpset crash

@ElliotSwart
Copy link
Author

tlatools/org.lamport.tlatools/test/pcal/PcalPaxosTest.java has empty test values to check. Not sure how it passes on main branch (assume it's likely excluded)

@ElliotSwart
Copy link
Author

There is no logging from PCal tests because tool mode is reset, but not set again to ToolIO.System as is standard for other tests

@ElliotSwart
Copy link
Author

ElliotSwart commented Aug 31, 2022

The following tests were not named to conform to the standard test discovery pattern, and so may not have been run (these are the renamed names, but remove "Test" from the end and you should be able to spot them easily:

tlatools/org.lamport.tlatools/test/tlc2/tool/AssertExpressionStackTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/DepthFirstTerminateTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/DistributedTraceTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopForcedPartialTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopForcedPartialTest_TTraceTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopWeakFairTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/SymmetryModelCheckerTestLongaTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/SymmetryModelCheckerTestLongTest.java

@ElliotSwart
Copy link
Author

The termination timers on the AbstractChecker and Simulators may cause null reference exceptions when cancelled, as they are not always set

@ElliotSwart
Copy link
Author

This is an infinite loop that doesn't update next

while (next != null && next.value != null) {

@ElliotSwart
Copy link
Author

All of these files have numerous string comparisons using "==" rather than .equals. No idea how this was working previously (though it clearly was)

tlatools/org.lamport.tlatools/src/pcal/PcalSymTab.java
tlatools/org.lamport.tlatools/src/pcal/PcalTranslate.java
tlatools/org.lamport.tlatools/src/tlc2/TraceExplorationSpec.java
tlatools/org.lamport.tlatools/src/tlc2/tool/Simulator.java

@ElliotSwart
Copy link
Author

I would say that covers the bulk of the weird behavior. There was also a refactor of the test directory system to allow more standard locations for test-models, TESpecs etc, but while brittle, the code as written should work on your implementation

@lemmy
Copy link
Member

lemmy commented Aug 31, 2022

Markus will know more about this than me. Because running TLC multiple times in the same JVM has not been a priority, I am guessing we don't have any tests to check that functionality, and we would have to write something new.

TLC was designed for process-level isolation from the start. A decade later, library-level isolation was briefly considered during the early design stages of the Toolbox. However, they found that process isolation has the nice property of preventing Toolbox crashes when TLC dies (e.g. due to OOM). This means that there is no set of tests that checks if TLC behaves correctly under library isolation. You could/should start validating if TLC (with library isolation) behaves the same for all specs at https://github.com/tlaplus/examples. However, I'm not sure that the benefits outweigh the risks of upstreaming library isolation.

@ElliotSwart
Copy link
Author

ElliotSwart commented Aug 31, 2022

Definitely agree that process isolation is super helpful in 99% of cases, and it was not my recommendation that you modify the toolbox, etc to remove that assurance. However it is helpful for testing, code clarity, and debugging using the heap (which helped identify many of the issues above). This release in no way breaks the cli or the models that have already been used. It simply adds the option for if/when it would be helpful.

From a "code quality" perspective, the main challenge is many classes are hybrids of static and non-static properties, divided by when they were programmed, not a cohesive standard. This breaks a lot of things that would be expected to work due to breaking standard object oriented coding contracts. Being able to work as a library also acts as a proxy for how data flows through your application. Its non-standard properties will likely make onboarding new devs onto the project very difficult. Once of the major improvements of this effort was removing the "silly" coupling and unnecessary brittleness. This allows the more structural idiosyncrasies to be documented, understood and worked with.

With that being said, please let me know if there are any particular features of the pull request you would like to try to get in. While I may not know how to do it myself, I'd be happy to strategize.

As far as next steps, the other thing that I didn't realize (but should have asked about) is the fact that you both don't trust the test suite sufficiently to allow a refactor. As I work on my fork, that does of course raise some concerns about the accuracy of the output. I'm contemplating working to consolidate the following tests into a more standardized, regression test like format, acting on the jars. @lemmy already has done a lot of work in this direction with the top two repos. I figure potentially running on most of the publicly available corpus would help provide (at least me) sufficient assurance that behavior changes would not occur. As it would be done on the cli level, it might also be helpful for you to further validate new releases. Let me know if you are aware of any major pitfalls in that direction.

https://github.com/tlaplus/Examples
https://github.com/lemmy/tlc-perf
https://github.com/pingcap/tla-plus
https://github.com/Apress/practical-tla-plus
https://github.com/informalsystems/apalache/tree/main/test/tla
https://github.com/Azure/azure-cosmos-tla
https://github.com/crytic/whipstaff
https://github.com/duerrfk/skp
https://medium.com/@polyglot_factotum/modelling-a-message-passing-bug-with-tla-baaf090a688d
https://surfingcomplexity.blog/2017/10/16/the-tortoise-and-the-hare-in-tla/
https://www.hillelwayne.com/post/tla-messages/
https://www.hillelwayne.com/post/augmenting-agile/
https://www.hillelwayne.com/post/adversaries/
http://roscidus.com/blog/blog/2019/01/01/using-tla-plus-to-understand-xen-vchan/
https://medium.com/espark-engineering-blog/formal-methods-in-practice-8f20d72bce4f
https://andy.hammerhartes.de/finding-bugs-in-systems-through-formalization.html
https://www.g9labs.com/2022/03/12/what-s-the-fuss-about-formal-specifications-part-2/
https://lamport.azurewebsites.net/tla/PConProof.tla

As for validating the multiple tests, same jvm, I have found by running 500 or so tests in the same JVM, many issues do get revealed. I'm not entirely sure what level of testing beyond that would be required. There are still unfortunately about 30 tests that cannot be run in the same jvm as others, which definitely suggests that while in most cases it should be fine, there still may be outliers.

@lemmy
Copy link
Member

lemmy commented Sep 1, 2022

Every correction of and addition to the test suite is welcomed. For example, I've been absorbing specs into our test suite if the spec's license allows it.

Calvin-L added a commit that referenced this pull request Sep 10, 2022
In Java, == compares pointers.  Two `String` instances can have the
same characters but live at different locations in memory.  Therefore,
strings should always be compared using `a.equals(b)`, not `a == b`.

There are exceptions to the rule: if two strings are "interned" (using
`str.intern()`) then they can be safely compared using ==.  All string
literals written in the program text are guaranteed to be interned, but
strings from user input are not, even if they are equal to a string
literal.

Since there are no calls to `.intern()` in the TLA+ source code that I
can find, we ought to fix these dangerous comparisons.  However, it is
worth noting that these have not caused any (reported) issues so far,
and may actually be safe if the strings being compared always point to
program string literals.

There may be instances of this pattern that I have not caught.  I
identified the locations to fix using IntelliJ's built-in string
comparison inspection.  There are limitations to this analysis!
If two strings are cast to `Object` before the comparison, the analyis
will not detect a problem.

This issue was originally reported and fixed by @ElliotSwart in pull
request #756.
Calvin-L added a commit that referenced this pull request Sep 11, 2022
In Java, == compares pointers.  Two `String` instances can have the
same characters but live at different locations in memory.  Therefore,
strings should always be compared using `a.equals(b)`, not `a == b`.

There are exceptions to the rule: if two strings are "interned" (using
`str.intern()`) then they can be safely compared using ==.  All string
literals written in the program text are guaranteed to be interned, but
strings from user input are not, even if they are equal to a string
literal.

Since there are no calls to `.intern()` in the TLA+ source code that I
can find, we ought to fix these dangerous comparisons.  However, it is
worth noting that these have not caused any (reported) issues so far,
and may actually be safe if the strings being compared always point to
program string literals.

There may be instances of this pattern that I have not caught.  I
identified the locations to fix using IntelliJ's built-in string
comparison inspection.  There are limitations to this analysis!
If two strings are cast to `Object` before the comparison, the analyis
will not detect a problem.

This issue was originally reported and fixed by @ElliotSwart in pull
request #756.
Calvin-L added a commit that referenced this pull request Oct 23, 2023
This commit makes one iota of progress toward the suggestions in #756
by replacing uses of `.clone()` with copy constructors, adding new copy
constructors as necessary, and removing `implements Cloneable` from all
types that have it.  In my mind, these changes are prerequisites for
removing the tools' custom data structures (e.g. `util.Set` and
`tlc2.util.Vect`).

The `clone()` method has widely been considered bad practice for many
years [1].  Problems include:
 - Lack of type safety
 - A vague contract
 - "Magical" capabilities that can't be replicated in pure Java

Copy constructors are considered a better pattern.

[1]: https://www.artima.com/articles/josh-bloch-on-design
Calvin-L added a commit that referenced this pull request Oct 23, 2023
This commit makes one iota of progress toward the suggestions in #756
by replacing uses of `.clone()` with copy constructors, adding new copy
constructors as necessary, and removing `implements Cloneable` from all
types that have it.  In my mind, these changes are prerequisites for
removing the tools' custom data structures (e.g. `util.Set` and
`tlc2.util.Vect`).

The `clone()` method has widely been considered bad practice for many
years [1].  Problems include:
 - Lack of type safety
 - A vague contract
 - "Magical" capabilities that can't be replicated in pure Java

Copy constructors are considered a better pattern.

[1]: https://www.artima.com/articles/josh-bloch-on-design

Signed-off-by: Calvin Loncaric <calvin.loncaric@oracle.com>
Calvin-L added a commit that referenced this pull request Oct 23, 2023
This commit makes one iota of progress toward the suggestions in #756
by replacing uses of `.clone()` with copy constructors, adding new copy
constructors as necessary, and removing `implements Cloneable` from all
types that have it.  In my mind, these changes are prerequisites for
removing the tools' custom data structures (e.g. `util.Set` and
`tlc2.util.Vect`).

The `clone()` method has widely been considered bad practice for many
years [1].  Problems include:
 - Lack of type safety
 - A vague contract
 - "Magical" capabilities that can't be replicated in pure Java

Copy constructors are considered a better pattern.

[1]: https://www.artima.com/articles/josh-bloch-on-design

Signed-off-by: Calvin Loncaric <calvin.loncaric@oracle.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants