Modernize codebase and remove technical debt #756

ElliotSwart · 2022-08-29T02:59:37Z

Introduction

This is a pull request meant to remove significant technical debt and barriers to entry from the TLA+ / TLC project. This was done for two primary reasons:

To make TLC easier to extend and develop for (difficulties adding functionality to TLC were what initiated this project)
To allow TLC to be building block in a larger system without introducing technical risk

Working remaining

Note that it is not ready to merge instantly for the following reasons:

The TLAToolbox needs to be upgraded to OpenJDK 17
The CI/CD publishing workflow needs to be adjusted a bit to account for differences in the build process

While this can remain a fork and a dependency of my next project, hopefully the improvements provided here prove worth the effort to merge in.

Limitations of this pull request

Ideally each improvement would have been a separate pull request, however the improvements needed to be made iteratively and experimentally.
A code formatter and re-organizer was run on the codebase as the indentation was inconsistent and classes had been built up over time without consolidation. This means integrating upstream changes will be a more manual process.
Java Pathfinder tests could not be run, as even it's latest versions still target Java 8. Furthermore, efforts to port it to a more modern version of java run into significant challenges.

Improved Codebase Standardization for Developers

Moved from Ant to Maven to allow easier builds and upgrades
Removed manual dependency management and converted it entirely to standard Maven.
Integrated code base with VSCode and IntelliJ, while maintaining Eclipse compatibility
Reduced test suite execution from 10min+ -> 2min.
Removed all known memory leaks. Prior the vast majority of allocated non-TLCState/FPSet memory was retained until process exit.
Allow tests to be run and debugged using IDE test runners
Place build outputs in standard target directory
Distinguish tests using JUnit categories, not just filepaths
Allow incremental compilation for Maven
Allow using diagnostic TLCState without starting the debugger, to facilitate easier testing.
Improved ClassLoading of TLA+ modules to be less brittle.

Additional Documentation

The base README.md now serves as a getting started page for developers, as well as a reference for all standard commands. It also contains an index into the other documentation.
docs/CodebaseOverview.md has been added with an architectural walkthrough, coding standards and a semi-comprehensive discussion of codebase idiosyncrasies

Improved Invariants for Debugging

Ensures the entire AST is final before Tool.java is invoked
Heap dumps can be usefully inspected to detect introduction of memory leaks (due to removal of memory leaks)

Bulk removal of static state

Dramatically decreased reliance on static mutable state allowing serial reusable tool execution. This was the prime reason for the test suite execution improvement
Two tests that are not serially reusable are marked as such. Further analysis is likely to result in further improvement.

Standardize Data Structures

Replaced home built data structures with standard equivalents

Code was rewritten to use standard collection classes rather than depending on non-standard implementations, reducing the learning curve and code size, and increasing reliability.

tla2sany.utilities.Vector -> ArrayList
tla2sany.utilities.Stack -> java.util.ArrayDeque
tlc2.util.Vect -> ArrayList
tlc2.util.Set -> java.util.Set
tlc2.util.MemObjectQueue -> java.util.ArrayDeque
tlc2.util.ObjectStack -> java.util.ArrayDeque
tlc2.util.BitVector -> java.util.BitSet
tlc2.util.List -> java.util.LinkedList

Converted to extend standard data-structure

Some data structures with special properties were reimplemented by extending standard collection classes.

tlc2.tool.StateVec extends ArrayList
tlc.tool.liveness.TBGraph extends ArrayList
tlc.tool.liveness.TBPar extends ArrayList
tlc.tool.liveness.TBParVec extends ArrayList

Converted to implement standard interfaces

Some existing data structure extended to implement standard interfaces.

tlc2.util.SetOfStates implements Set
tlc2.tool.ContextEnumerator implements Iterator
tlc.util.LongObjTable is Generic and so can be strongly typed

Improved General Codebase Quality

Metrics

SonarQube:

Bugs (Really just warnings): 466 -> 302
Security Hotspots: 81 -> 64
Code Smells: 14k -> 7.5k
5% reduction in code size / complexity

Compiler:
Warnings: 1000 -> 4

Details

This is a partial list of general quality improvements. Many of them are not reflected in metrics.
Suggestions from static analysis tools (SonarQube, IntelliJ) that were feasible were implemented.

Removed unnecessary deprecated feature usage: Remaining deprecated features detailed in docs/CodebaseOverview.md.
Upgraded to more modern java language features when appropriate
- Textblocks
- Enhanced for loops
- Enhanced switch statements
- Pattern matching if statements
- Autocloseable interface added when appropriate and used with resource-try block whenever possible
- Replaced synchronized data structures with more modern unsynchronized ones where appropriate
Replaced string concatenation with StringBuilders
Fixed misc bugs not originally exercised by test suite
Removed dead code
Replaced statements with simplified and more semantic statements
Removed redundant modifiers from methods
Added final modifier whenever appropriate.
Strongly type generics
Removed unused imports
Simplified code by removing unnecessary casts
Removed duplicate exceptions
Consolidated switch blocks
Improved code formatting and organization
Removed duplicate semi-colons
Fixed unnecessary unboxing
Some JavaDoc fixes
Simplified logic
Used more semantic test assertions

Recommendations

see Codebase Overview: docs/CodebaseOverview.md

- Moved from Ant to Maven to allow easier builds and upgrades - Removed manual dependency management and converted it entirely to standard Maven. - Integrated code base with VSCode and IntelliJ, while maintaining Eclipse compatibility - Reduced test suite execution from 10min+ -> 2min. - Removed all known memory leaks. Prior the vast majority of allocated non-TLCState/FPSet memory was retained until process exit. - Allow tests to be run and debugged using IDE test runners - Place build outputs in standard target directory - Distinguish tests using JUnit categories, not just filepaths - Allow incremental compilation for Maven - Allow using diagnostic TLCState without starting the debugger, to facilitate easier testing. - Improved ClassLoading of TLA+ modules to be less brittle. - The base README.md now serves as a getting started page for developers, as well as a reference for all standard commands. It also contains an index into the other documentation. - docs/CodebaseOverview.md has been added with an architectural walkthrough, coding standards and a semi-comprehensive discussion of codebase idiosyncrasies - Ensures the entire AST is final before Tool.java is invoked - Heap dumps can be usefully inspected to detect introduction of memory leaks (due to removal of memory leaks) - Dramatically decreased reliance on static mutable state allowing serial reusable tool execution. This was the prime reason for the test suite execution improvement - Two tests that are not serially reusable are marked as such. Further analysis is likely to result in further improvement. Code was rewritten to use standard collection classes rather than depending on non-standard implementations, reducing the learning curve and code size, and increasing reliability. - tla2sany.utilities.Vector -> ArrayList - tla2sany.utilities.Stack -> java.util.ArrayDeque - tlc2.util.Vect -> ArrayList - tlc2.util.Set -> java.util.Set - tlc2.util.MemObjectQueue -> java.util.ArrayDeque - tlc2.util.ObjectStack -> java.util.ArrayDeque - tlc2.util.BitVector -> java.util.BitSet - tlc2.util.List -> java.util.LinkedList Some data structures with special properties were reimplemented by extending standard collection classes. - tlc2.tool.StateVec extends ArrayList - tlc.tool.liveness.TBGraph extends ArrayList - tlc.tool.liveness.TBPar extends ArrayList - tlc.tool.liveness.TBParVec extends ArrayList Some existing data structure extended to implement standard interfaces. - tlc2.util.SetOfStates implements Set<TLCState> - tlc2.tool.ContextEnumerator implements Iterator<Context> - tlc.util.LongObjTable is Generic and so can be strongly typed SonarQube: - Bugs (Really just warnings): 466 -> 302 - Security Hotspots: 81 -> 64 - Code Smells: 14k -> 7.5k - 5% reduction in code size / complexity Compiler: Warnings: 1000 -> 4 This is a partial list of general quality improvements. Many of them are not reflected in metrics. Suggestions from static analysis tools (SonarQube, IntelliJ) that were feasible were implemented. - Removed unnecessary deprecated feature usage: Remaining deprecated features detailed in docs/CodebaseOverview.md. - Upgraded to more modern java language features when appropriate - Textblocks - Enhanced for loops - Enhanced switch statements - Pattern matching if statements - Autocloseable interface added when appropriate and used with resource-try block whenever possible - Replaced synchronized data structures with more modern unsynchronized ones where appropriate - Replaced string concatenation with StringBuilders - Fixed misc bugs not originally exercised by test suite - Removed dead code - Replaced statements with simplified and more semantic statements - Removed redundant modifiers from methods - Added final modifier whenever appropriate. - Strongly type generics - Removed unused imports - Simplified code by removing unnecessary casts - Removed duplicate exceptions - Consolidated switch blocks - Improved code formatting and organization - Removed duplicate semi-colons - Fixed unnecessary unboxing - Some JavaDoc fixes - Simplified logic - Used more semantic test assertions

ElliotSwart · 2022-08-29T03:09:07Z

@lemmy my apologies for the giant request. Wanted to make sure the full test suite would pass / there were sufficient improvements to justify the additional integration work required.

Calvin-L · 2022-08-30T01:11:13Z

Hello, and thank you for the herculean effort you've put into this! There is no question that TLA+'s aging code is a hurdle for new contributors, and that there are many ways in which we can improve it. These changes would be a big step in the right direction.

That said, I don't think we can accept this PR as-is. :( There are simply too many changes with too many different goals. If we are ever to review all of it, the changes must be broken up into more manageable units.

Rather than respond to the myriad of specific bullet points above, I'm going to suggest a general path forward:

We need to agree on what's important. Historically, our projects wishlist has not included any code quality improvements, and if you are going to help us make some, I want the work to be well-justified. Ideally we would codify our priorities in that wishlist, and you would be addressing things we have already spent time thinking about. Deciding on what's important is Lame and Boring because it does not involve any actual programming, and it will probably take a long time and involve many rounds of discussion, but if we don't do it I am worried that we will waste time making and reviewing risky changes that ultimately aren't impactful.
Based on what we come up with in step 1, the important changes here can be pulled out and applied in small PRs. I hope that this will be easy, since you have already put together this reference for what the code might look like. :)

Here are some tentative thoughts about point 1---including both changes we might want to prioritize and changes we might want to explicitly refrain from. @lemmy likely has his own opinions about these as well.

Improved test coverage - YES

You wrote: "Fixed misc bugs not originally exercised by test suite".

I would like to know what those bugs were, and get your tests and fixes merged quickly. Bugfixes have always been a high priority for TLA+.

As you likely discovered, the TLA+ test suite has very poor coverage. I think we should make make improved test coverage an explicit priority. This is a very low-risk activity that virtually any developer can help with.

Internal documentation - YES

I am incredibly happy to see that you wrote a document about the state of the code, including idiosyncracies you found. Regardless of the fate of the other changes here, I think this should be merged soon. Our contribution documents do not mention non-programming ways to contribute the project, like writing documentation, and I think they should.

Build system improvements - YES

I am broadly in favor of this. Keeping libraries in the TLA+ source tree seems much more difficult to maintain than declaring dependencies in a build file, and right now the nonstandard Ant/Maven combination makes it harder to set up the project in IDEs like IntelliJ which do not understand our nonstandard layout.

Upgrading to JDK 17 - NO

Markus and I both think this is a bad idea at the moment. It's hard to predict the effect of this on downstream users of TLA+; potentially all of them would have to make some change to their systems. While I'm excited about the new language features like pattern-matching if, I think we are safer sticking with JDK 11 as long as we can.

Code formatting improvements - NO (er, not all at once)

TLA+'s weird indentation is certainly distracting, but I don't feel good about a monolithic project-wide reformatting commit. I would rather see this addressed incrementally... Perhaps we should make an effort to correct the indentation of any lines we change as part of ordinary features and bugfixes. I don't think the tabs and spaces in old and stable code are doing any harm.

Removal of global mutable variables - YES, very slowly

TLC has always been intended to run as an isolated process, not as a library. This is a deliberate choice and a strength: your operating system provides safe memory isolation in case TLC runs out of memory and crashes. In that light, it's not obvious that the "memory leaks" you found and fixed really needed fixing.

This is also a somewhat dangerous endeavor; Markus says:

Simon Zambrowski spent more than a month trying to remove all static initialization in ~2009. At some point, Leslie decided not to waste more time on it because they could not find and refactor all places to correctly re-initialize TLC; for some specs, TLC became unsound and incomplete. We've been cautioning everybody who has tried to tackle this problem since.

That said, the improvement you got to test suite execution time is very real and, in my opinion, provides a very strong argument in favor of trying again. I run the tests as part of every change and I would benefit from execution time reduction. Improving their speed could also make new contributors more likely to run the tests frequently as part of their development workflow.

I do not think we should try fixing this all-at-once, but I do think we should make an effort to incrementally remove instances of this pattern where it is obviously safe to do so, with careful testing and review.

Replacing home-built data structures - YES, slowly

This is something others have attempted as well (e.g. #328), but I think previous attempts (and yours) are far to aggressive and carry a big risk of introducing subtle bugs. See e.g. this insightful comment about a difference between tla2sany.utilities.Vector and java.util.ArrayList.

The correct way to go about this, in my opinion, is something along the lines of:

Pick one data structure to replace (e.g. tla2sany.utilities.Vector).
Write many tests for the existing implementation. Use the existing code to discover weird behaviors (like the contains() behavior above), and cover as many of them as you can find.
Make the existing implementation extend its java.util equivalent, and change the implementation to delegate to super, ensuring that all the tests still pass.
Ensure that the changes do not impact performance.
Optionally, replace usages of the custom type (e.g. tla2sany.utilities.Vector) with uses of the now-more-general type (e.g. java.util.List).

That said, I am in favor of this, for more reasons than even the ones you gave. The java standard library is less likely to contain weird corner cases than our own code, and future JVM optimizations are more likely to improve the performance of the standard collections than the performance of our custom ones.

Removing use of of soon-to-be-deprecated libraries -- YES, low-priority

As part of your migration to JDK 17, I bet you fixed e.g. a bunch of calls to new Integer(...), which is officially deprecated in recent JDKs. I approve of this kind of change and similar future-proofing, but it is something we can afford to do incrementally over years, not an urgent thing we must do tomorrow.

ElliotSwart · 2022-08-30T06:58:30Z

Totally understand. I will use this fork for my efforts (which require TLC to behave more as a library, and less as a process), and of course feel free to merge whatever pieces of the documentation and pom file changes you deem relevant and/or use this as reference. I also ran the code formatter on the last day of the refactor, so I can provide an unformatted branch to diff against (to identify removed static state, etc).

The process of essentially removing the static state required extensive codebase modifications in of itself. As for the "memory leaks", removing them really just helps when debugging a heap dump, and of course running the tests on the same jvm. Overall the process was like peeling an onion with many layers. There are many hidden assumptions, race conditions, circular dependencies that all interact with each other. It may be possible to do incrementally, though I would imagine the engineering effort required to do this may be prohibitive. Perhaps referencing the unformatted version of this branch may shave some time off the effort.

Sadly most of the benefits really are "all or nothing". Without removing virtually all of the static state and allowing the garbage collector to collect prior TLC information, running the tests in the same fork isn't possible. Frankly, for 95% of the effort, I was not sure if it was possible to actually get it to work, which is why I implemented every code-quality fix at my disposal to try to aid the effort (and the repeated static analysis -> fix cycle did prove essential in tracking down the issues). It went nearly instantly from nothing working serially on the same jvm to almost everything working once every piece of state had been eliminated or controlled for. Maven's surefire test runner is noticeably slower than ant's, especially when many forks are involved (I believe it has a much more involved wrapper function), so while the maven standardization is essentially independent of the code refactoring (except for the JUnit categorization), you may find it not worth the effort.

I will see if I can find some of the misc bugs, though since they are not tested by the test suite I suspect many would fall under: "behavior does not match naming or description of the function". From a strict change management perspective I'm not sure they would count as "bugs" as such. Fixing them generally causes cascading failures. To give some examples:

EMPTY state / action variables generally get modified during TLC runs, and aspects of that behavior are relied upon
There is a reliance on the exact array copy mechanics of the grow methods in the usage of some proprietary data structures.
close functions that didn't close the resources / close functions used improperly (which it sounds like isn't a concern)

I guess my only question is about the concern of making specifications unsound. Certainly I observed that unsoundness in the test suite until (seemingly) every static variable was accounted for. Is there any corpus of TLA+ specifications that would ideally be run against the program as another "long" test suite? In your normal change management process are there additional verification steps you would run / usage feedback you would solicit?

Calvin-L · 2022-08-31T19:30:32Z

In that case, would you be willing to help us pull out the relevant parts of this PR? :) I will do some thinking and propose some updates to our projects wishlist.

I can provide an unformatted branch to diff against

That would be a great start---although I'm worried it will still be a lot to sift through.

behavior does not match naming or description of the function
EMPTY state / action variables generally get modified
reliance on the exact array copy mechanics
close functions that didn't close the resources

I see. We should probably still review any weird behaviors you found. They might be benign if the the calling code is carefully structured---but with careful analysis we might find that we can "reverse-engineer" inputs that turn these into user-facing errors.

Is there any corpus of TLA+ specifications that would ideally be run against the program as another "long" test suite?

Markus will know more about this than me. Because running TLC multiple times in the same JVM has not been a priority, I am guessing we don't have any tests to check that functionality, and we would have to write something new.

ElliotSwart · 2022-08-31T20:44:14Z

Hi Calvin,

Yeah, tbh after having done it I think removing the state using a configuration management (and not test driven) process would be, if not impossible, likely prohibitive.

if you think knowing about those weird behaviors would help you, totally happy to point them out!

Multiple users modify the empty state rather than copying it. IsEmpty checks the reference to empty, not that the state is empty.

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/TLCState.java

Line 39 in ee22217

public static TLCState Empty = null;

This condition modifies the Empty action.

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/impl/Tool.java

Line 426 in eff4e9b

acts.setAction(elem);

Both StateVec and LongVec have a non-standard remove that shifts the location of the last element. They rely on each other removing elements in the same way. It is more efficient, so I kept the behavior, just labeled clearly.

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/DFIDWorker.java

Line 111 in d977051

succStates.removeElement(index);

Vect concat creates a new reference to a vector, hiding later modification to the vector

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/util/Vect.java

Line 71 in ee22217

public final Vect<E> concat(Vect<E> elems) {

A reliance on clone is covered up here:

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/impl/Tool.java

Line 216 in ee22217

final Vect<Action> initAndNext = getInitStateSpec().concat(actionVec);

Also missing here:

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/impl/Tool.java

Line 412 in ee22217

Vect<Action> init = this.getInitStateSpec();

While I didn't track down the exact issues hidden by ensure capacity and grow implementations, additional .clone() methods were needed to fix this issue

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/util/Vect.java

Line 106 in ee22217

public final void ensureCapacity(int minCapacity) {

SecurityManager is terminally deprecated, and soon users with new JVMs will not be able to run the test suite. This might be ok.

ElliotSwart · 2022-08-31T20:47:59Z

I'll see if I can find more examples. I systematically found unclosed resources with static analysis. Most of them just didn't close in a finally (or with a resource try like I converted them to), but some were missing close statements. Here is a partial list of files with improperly closed resources:
tlatools/org.lamport.tlatools/src/model/InJarFilenameToStream.java
tlatools/org.lamport.tlatools/src/tla2tex/TeX.java
tlatools/org.lamport.tlatools/src/tlc2/REPL.java
tlatools/org.lamport.tlatools/src/tlc2/tool/TLCTrace.java
tlatools/org.lamport.tlatools/src/tlc2/tool/Worker.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/DiskFPSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/MemFPSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/MemFPSet1.java
tlatools/org.lamport.tlatools/src/tlc2/tool/fp/dfid/MemFPIntSet.java
tlatools/org.lamport.tlatools/src/tlc2/tool/liveness/AbstractDiskGraph.java
tlatools/org.lamport.tlatools/src/tlc2/tool/liveness/LiveWorker.java
tlatools/org.lamport.tlatools/src/tlc2/tool/queue/DiskByteArrayQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/BufferedRandomAccessFile.java
tlatools/org.lamport.tlatools/src/tlc2/util/DiskObjectStack.java
tlatools/org.lamport.tlatools/src/tlc2/util/LongVec.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemIntQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemObjectQueue.java
tlatools/org.lamport.tlatools/src/tlc2/util/MemObjectStack.java
tlatools/org.lamport.tlatools/src/tlc2/util/SynchronousDiskIntStack.java
tlatools/org.lamport.tlatools/src/util/InternTable.java
tlatools/org.lamport.tlatools/src/util/MonolithSpecExtractor.java
tlatools/org.lamport.tlatools/src/util/SimpleFilenameToStream.java

JMX resources are not unregistered in close methods, just partially closed

ElliotSwart · 2022-08-31T23:00:46Z

GetInstallationBaseDir can also find the test class dir of the same name, which can be relavent with testing distribution

tlaplus/tlatools/org.lamport.tlatools/src/util/SimpleFilenameToStream.java

Line 77 in ee22217

final URL url = cl.getResource("tla2sany");

ElliotSwart · 2022-08-31T23:04:03Z

This test value should be 1, but due to irregularities in the BitVector implementation must be set to two.

tlaplus/tlatools/org.lamport.tlatools/test/tlc2/tool/distributed/fp/DynamicFPSetManagerTest.java

Line 408 in ee22217

assertEquals(2, bvs[0].trueCnt()); // fingerprint is unknown after fpset crash

ElliotSwart · 2022-08-31T23:07:06Z

tlatools/org.lamport.tlatools/test/pcal/PcalPaxosTest.java has empty test values to check. Not sure how it passes on main branch (assume it's likely excluded)

ElliotSwart · 2022-08-31T23:08:58Z

There is no logging from PCal tests because tool mode is reset, but not set again to ToolIO.System as is standard for other tests

tlaplus/tlatools/org.lamport.tlatools/test/pcal/PCalModelCheckerTestCase.java

Line 72 in ee22217

ToolIO.reset();

ElliotSwart · 2022-08-31T23:11:49Z

The following tests were not named to conform to the standard test discovery pattern, and so may not have been run (these are the renamed names, but remove "Test" from the end and you should be able to spot them easily:

tlatools/org.lamport.tlatools/test/tlc2/tool/AssertExpressionStackTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/DepthFirstTerminateTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/DistributedTraceTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopForcedPartialTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopForcedPartialTest_TTraceTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/LoopWeakFairTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/SymmetryModelCheckerTestLongaTest.java
tlatools/org.lamport.tlatools/test/tlc2/tool/liveness/SymmetryModelCheckerTestLongTest.java

ElliotSwart · 2022-08-31T23:14:36Z

The termination timers on the AbstractChecker and Simulators may cause null reference exceptions when cancelled, as they are not always set

ElliotSwart · 2022-08-31T23:16:29Z

This is an infinite loop that doesn't update next

tlaplus/tlatools/org.lamport.tlatools/src/tlc2/tool/FingerprintException.java

Line 113 in ee22217

while (next != null && next.value != null) {

ElliotSwart · 2022-08-31T23:21:11Z

All of these files have numerous string comparisons using "==" rather than .equals. No idea how this was working previously (though it clearly was)

tlatools/org.lamport.tlatools/src/pcal/PcalSymTab.java
tlatools/org.lamport.tlatools/src/pcal/PcalTranslate.java
tlatools/org.lamport.tlatools/src/tlc2/TraceExplorationSpec.java
tlatools/org.lamport.tlatools/src/tlc2/tool/Simulator.java

ElliotSwart · 2022-08-31T23:24:56Z

I would say that covers the bulk of the weird behavior. There was also a refactor of the test directory system to allow more standard locations for test-models, TESpecs etc, but while brittle, the code as written should work on your implementation

lemmy · 2022-08-31T23:33:30Z

Markus will know more about this than me. Because running TLC multiple times in the same JVM has not been a priority, I am guessing we don't have any tests to check that functionality, and we would have to write something new.

TLC was designed for process-level isolation from the start. A decade later, library-level isolation was briefly considered during the early design stages of the Toolbox. However, they found that process isolation has the nice property of preventing Toolbox crashes when TLC dies (e.g. due to OOM). This means that there is no set of tests that checks if TLC behaves correctly under library isolation. You could/should start validating if TLC (with library isolation) behaves the same for all specs at https://github.com/tlaplus/examples. However, I'm not sure that the benefits outweigh the risks of upstreaming library isolation.

ElliotSwart · 2022-08-31T23:41:37Z

Definitely agree that process isolation is super helpful in 99% of cases, and it was not my recommendation that you modify the toolbox, etc to remove that assurance. However it is helpful for testing, code clarity, and debugging using the heap (which helped identify many of the issues above). This release in no way breaks the cli or the models that have already been used. It simply adds the option for if/when it would be helpful.

From a "code quality" perspective, the main challenge is many classes are hybrids of static and non-static properties, divided by when they were programmed, not a cohesive standard. This breaks a lot of things that would be expected to work due to breaking standard object oriented coding contracts. Being able to work as a library also acts as a proxy for how data flows through your application. Its non-standard properties will likely make onboarding new devs onto the project very difficult. Once of the major improvements of this effort was removing the "silly" coupling and unnecessary brittleness. This allows the more structural idiosyncrasies to be documented, understood and worked with.

With that being said, please let me know if there are any particular features of the pull request you would like to try to get in. While I may not know how to do it myself, I'd be happy to strategize.

As far as next steps, the other thing that I didn't realize (but should have asked about) is the fact that you both don't trust the test suite sufficiently to allow a refactor. As I work on my fork, that does of course raise some concerns about the accuracy of the output. I'm contemplating working to consolidate the following tests into a more standardized, regression test like format, acting on the jars. @lemmy already has done a lot of work in this direction with the top two repos. I figure potentially running on most of the publicly available corpus would help provide (at least me) sufficient assurance that behavior changes would not occur. As it would be done on the cli level, it might also be helpful for you to further validate new releases. Let me know if you are aware of any major pitfalls in that direction.

https://github.com/tlaplus/Examples
https://github.com/lemmy/tlc-perf
https://github.com/pingcap/tla-plus
https://github.com/Apress/practical-tla-plus
https://github.com/informalsystems/apalache/tree/main/test/tla
https://github.com/Azure/azure-cosmos-tla
https://github.com/crytic/whipstaff
https://github.com/duerrfk/skp
https://medium.com/@polyglot_factotum/modelling-a-message-passing-bug-with-tla-baaf090a688d
https://surfingcomplexity.blog/2017/10/16/the-tortoise-and-the-hare-in-tla/
https://www.hillelwayne.com/post/tla-messages/
https://www.hillelwayne.com/post/augmenting-agile/
https://www.hillelwayne.com/post/adversaries/
http://roscidus.com/blog/blog/2019/01/01/using-tla-plus-to-understand-xen-vchan/
https://medium.com/espark-engineering-blog/formal-methods-in-practice-8f20d72bce4f
https://andy.hammerhartes.de/finding-bugs-in-systems-through-formalization.html
https://www.g9labs.com/2022/03/12/what-s-the-fuss-about-formal-specifications-part-2/
https://lamport.azurewebsites.net/tla/PConProof.tla

As for validating the multiple tests, same jvm, I have found by running 500 or so tests in the same JVM, many issues do get revealed. I'm not entirely sure what level of testing beyond that would be required. There are still unfortunately about 30 tests that cannot be run in the same jvm as others, which definitely suggests that while in most cases it should be fine, there still may be outliers.

lemmy · 2022-09-01T00:35:58Z

Every correction of and addition to the test suite is welcomed. For example, I've been absorbing specs into our test suite if the spec's license allows it.

@ElliotSwart

In Java, == compares pointers. Two `String` instances can have the same characters but live at different locations in memory. Therefore, strings should always be compared using `a.equals(b)`, not `a == b`. There are exceptions to the rule: if two strings are "interned" (using `str.intern()`) then they can be safely compared using ==. All string literals written in the program text are guaranteed to be interned, but strings from user input are not, even if they are equal to a string literal. Since there are no calls to `.intern()` in the TLA+ source code that I can find, we ought to fix these dangerous comparisons. However, it is worth noting that these have not caused any (reported) issues so far, and may actually be safe if the strings being compared always point to program string literals. There may be instances of this pattern that I have not caught. I identified the locations to fix using IntelliJ's built-in string comparison inspection. There are limitations to this analysis! If two strings are cast to `Object` before the comparison, the analyis will not detect a problem. This issue was originally reported and fixed by @ElliotSwart in pull request #756.

@ElliotSwart

In Java, == compares pointers. Two `String` instances can have the same characters but live at different locations in memory. Therefore, strings should always be compared using `a.equals(b)`, not `a == b`. There are exceptions to the rule: if two strings are "interned" (using `str.intern()`) then they can be safely compared using ==. All string literals written in the program text are guaranteed to be interned, but strings from user input are not, even if they are equal to a string literal. Since there are no calls to `.intern()` in the TLA+ source code that I can find, we ought to fix these dangerous comparisons. However, it is worth noting that these have not caused any (reported) issues so far, and may actually be safe if the strings being compared always point to program string literals. There may be instances of this pattern that I have not caught. I identified the locations to fix using IntelliJ's built-in string comparison inspection. There are limitations to this analysis! If two strings are cast to `Object` before the comparison, the analyis will not detect a problem. This issue was originally reported and fixed by @ElliotSwart in pull request #756.

This commit makes one iota of progress toward the suggestions in #756 by replacing uses of `.clone()` with copy constructors, adding new copy constructors as necessary, and removing `implements Cloneable` from all types that have it. In my mind, these changes are prerequisites for removing the tools' custom data structures (e.g. `util.Set` and `tlc2.util.Vect`). The `clone()` method has widely been considered bad practice for many years [1]. Problems include: - Lack of type safety - A vague contract - "Magical" capabilities that can't be replicated in pure Java Copy constructors are considered a better pattern. [1]: https://www.artima.com/articles/josh-bloch-on-design

This commit makes one iota of progress toward the suggestions in #756 by replacing uses of `.clone()` with copy constructors, adding new copy constructors as necessary, and removing `implements Cloneable` from all types that have it. In my mind, these changes are prerequisites for removing the tools' custom data structures (e.g. `util.Set` and `tlc2.util.Vect`). The `clone()` method has widely been considered bad practice for many years [1]. Problems include: - Lack of type safety - A vague contract - "Magical" capabilities that can't be replicated in pure Java Copy constructors are considered a better pattern. [1]: https://www.artima.com/articles/josh-bloch-on-design Signed-off-by: Calvin Loncaric <calvin.loncaric@oracle.com>

ElliotSwart changed the title ~~Refactor TLA+~~ Modernize codebase and removal technical debt Aug 29, 2022

ElliotSwart changed the title ~~Modernize codebase and removal technical debt~~ Modernize codebase and remove technical debt Aug 29, 2022

Calvin-L mentioned this pull request Sep 10, 2022

Fix some instances of string comparison using == #758

Merged

lemmy force-pushed the master branch from 8a0da69 to 6dfec66 Compare December 20, 2022 22:52

Calvin-L mentioned this pull request Jan 28, 2023

Add documentation and code quality improvements to the wishlist #782

Merged

lemmy force-pushed the master branch from 5e075ac to 6a73df7 Compare April 20, 2023 22:28

lemmy force-pushed the master branch from c6a8452 to ddec63c Compare August 25, 2023 00:46

Calvin-L mentioned this pull request Oct 23, 2023

Remove uses and overrides of Object.clone() #836

Merged

lemmy mentioned this pull request Feb 6, 2024

Normalize line endings as LF or CRLF and encode choice in .gitattributes file #872

Open

lemmy mentioned this pull request Mar 18, 2024

Proposed refactoring goal: remove all static global variables #891

Open

Calvin-L mentioned this pull request Apr 2, 2024

Feature proposal: parallelize tlatools unit tests #901

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modernize codebase and remove technical debt #756

Modernize codebase and remove technical debt #756

ElliotSwart commented Aug 29, 2022

ElliotSwart commented Aug 29, 2022

Calvin-L commented Aug 30, 2022

ElliotSwart commented Aug 30, 2022

Calvin-L commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

lemmy commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 •

edited

lemmy commented Sep 1, 2022

Modernize codebase and remove technical debt #756

Are you sure you want to change the base?

Modernize codebase and remove technical debt #756

Conversation

ElliotSwart commented Aug 29, 2022

Introduction

Working remaining

Limitations of this pull request

Improved Codebase Standardization for Developers

Additional Documentation

Improved Invariants for Debugging

Bulk removal of static state

Standardize Data Structures

Replaced home built data structures with standard equivalents

Converted to extend standard data-structure

Converted to implement standard interfaces

Improved General Codebase Quality

Metrics

Details

Recommendations

ElliotSwart commented Aug 29, 2022

Calvin-L commented Aug 30, 2022

Improved test coverage - YES

Internal documentation - YES

Build system improvements - YES

Upgrading to JDK 17 - NO

Code formatting improvements - NO (er, not all at once)

Removal of global mutable variables - YES, very slowly

Replacing home-built data structures - YES, slowly

Removing use of of soon-to-be-deprecated libraries -- YES, low-priority

ElliotSwart commented Aug 30, 2022

Calvin-L commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 • edited

ElliotSwart commented Aug 31, 2022 • edited

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 • edited

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022

lemmy commented Aug 31, 2022

ElliotSwart commented Aug 31, 2022 • edited

lemmy commented Sep 1, 2022

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022 •

edited

ElliotSwart commented Aug 31, 2022 •

edited