Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Add async command framework using Java 21 language features #6518

Draft
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

SamCarlberg
Copy link
Member

There's a lot of WIP code in here that needs to be cleaned up.

I'm opening this PR anyway to get eyes on this and start a discussion. The framework fundamentally relies on virtual threads added in Java 21; in addition, parallel command groups rely on the structured concurrency API added as a preview feature in Java 21 (which is also in a second preview in Java 22). Configuring the JVM to use a single carrier thread for all virtual threads in the process means async commands and triggers can be written without needing multithreaded concepts like locks and atomic field access.

The async framework allows command bodies to be written in an imperative style:

HardwareResource drivetrain = new Drivetrain();

AsyncCommand driveTenFeet =
  drivetrain.run(() -> {
    drivetrain.resetEncoders();
    while (drivetrain.getDistance() < 10) {
      AsyncCommand.pause(); // <- note the pause call!
      drivetrain.tankDrive(1, 1);
    }
    drivetrain.stop();
  }).named("Drive Ten Feet");

Which would look something like this using the current functional framework:

Subsystem drivetrain = new Drivetrain();

Comand driveTenFeet =
  drivetrain.runOnce(drivetrain::resetEncoders)
    .andThen(driveTrain.run(() -> drivetrain.tankDrive(1, 1))
    .until(() -> drivetrain.getDistance() < 10)
    .finallyDo(drivetrain::stop)
    .withName("Drive Ten Feet");

Commands could be written as thin wrappers around regular imperative methods:

class Drivetrain extends HardwareResource {
  void driveDistance(Measure<Distance> distance) {
    encoders.reset();
    while (getDistance().lt(distance)) {
      AsyncCommand.pause();
      tankDrive(1, 1);
    }
    stop();
  }

  AsyncCommand driveDistanceCommand(Measure<Distance> distance) {
    return run(() -> driveDistance(distance)).named("Drive " + distance.in(Feet) + " Feet");
  }
}

However, an async command will need to be actively cooperative and periodically call blocking operations for the virtual thread that executes it to be paused and allow another vthread to be scheduled. This can be accomplished by calling AsyncCommand.pause() in every loop run, which just calls Thread.sleep(20) under the hood to pause the vthread for 20 milliseconds. Calling AsyncCommand.pause(Measure) allows for custom pause times, such as 5 or 10 milliseconds for commands that require tighter loops.

Operations that pause the current thread (such as calling .get on a future or acquiring a read/write lock) will allow for a vthread to be parked and another vthread to be continued. synchronized blocks and methods do not pause the vthread, and will instead block the native carrier thread, and thus all commands and triggers as well. In addition, virtual threads cannot be paused while executing in native code.

There are also some other additions like priority levels (as opposed to a blanket yes/no for ignoring incoming commands), factories requiring names be provided for commands, and the scheduler tracking all running commands and not just the highest-level groups. However, those changes aren't unique to an async framework, and could just as easily be used in a traditional command framework.

Links:
JEP 425 - Virtual Threads
JEP 453 - Structured Concurrency
The async version of the rapid react example project
AsyncScheduler.java
ParallelGroup.java
Sequence.java

@calcmogul
Copy link
Member

calcmogul commented Apr 17, 2024

We've discussed putting a command-based rewrite in its own package instead of wpilibjN. For example, edu.wpi.first.commands3. edu.wpi.first.wpilibj2 created a bunch of confusion for which classes were where.

@SamCarlberg
Copy link
Member Author

We've discussed putting a command-based rewrite in its own package instead of wpilibjN. For example, edu.wpi.first.commands3. edu.wpi.first.wpilibj2 created a bunch of confusion for which classes were where.

It's too early to be bikeshedding 😄 My goal here is to inform future command framework changes, not to have a ready-to-go implementation. (Though I do agree that wpilibjN is a bad naming scheme; I'd personally just put these in a .wpilibj.command.async package and be done with it)

@Starlight220
Copy link
Member

Any reason for pause over suspend or yield?
The latter is preferable imo

@SamCarlberg
Copy link
Member Author

Any reason for pause over suspend or yield?

It's clearer about what it does without needing to assume a baseline knowledge of multithreading concepts. And it's apt if async functions/continuations/coroutines (or wherever else you'd like to call them) are taught as pausable functions

@Starlight220
Copy link
Member

Starlight220 commented Apr 17, 2024

And it's apt if async functions/continuations/coroutines (or wherever else you'd like to call them) are taught as pausable functions

I strongly disagree that "pausable functions" is the term that should be used -- it's not a term that is commonly used and will likely cause confusion.
A quick google search for "pausable functions" brings no relevant info, whereas "yield function" and "suspend function" bring relevant info (mainly centered around python and kotlin coroutines respectively).

Therefore I maintain my suggestion to name it yield -- it'll be immediately recognized as akin to Python's yield (by users with Python experience, and if students learn Python later).
My rationale for preferring yield over suspend is that Python's concurrency model is more commonly known and is more semantically similar to this than Kotlin's.

Command-based historically has a problem of being very DSL-y and magic incantations; this is a good opportunity to take a big step to using common industry terms. Let's take that step.

@shueja
Copy link
Contributor

shueja commented Apr 17, 2024

What happens if AsyncCommand.pause() is called from outside an AsyncCommand virtual thread?

@Starlight220
Copy link
Member

With the current impl, it would suspend the entire carrier thread. Throwing could be better.

@Oblarg
Copy link
Contributor

Oblarg commented Apr 17, 2024

This seems like it could also support differing command interruption semantics in addition to priorities. If we have both of those features, we can very naturally represent "default commands" as a minimum-priority command with suspend or reschedule behavior on interruption.

@SamCarlberg
Copy link
Member Author

Therefore I maintain my suggestion to name it yield

This is a good argument. I'll note that virtual thread pausing does not behave like yield in python or kotlin (or ruby), which allows data to be passed to an invoking function. Java also has a context-specific yield keyword (similar to var), which allows branches in switch expressions to return values to the caller. I'm a little concerned about overloading the meaning of yield with what's built into the language, but most users wouldn't already be used to it because it was added in Java 13. A yield method also cannot be called directly, it must be qualified; just yield() will not compile, but this.yield() or AsyncCommand.yield() would.

What happens if AsyncCommand.pause() is called from outside an AsyncCommand virtual thread?

As written, it would just block the calling thread. Yotam's suggestion of throwing would certainly be better, since it makes no sense to pause outside a command.

This seems like it could also support differing command interruption semantics in addition to priorities. If we have both of those features, we can very naturally represent "default commands" as a minimum-priority command with suspend or reschedule behavior on interruption.

I like this idea. How would a resume-after-interruption command look in user code?

@Oblarg
Copy link
Contributor

Oblarg commented Apr 18, 2024

I think we should re-purpose the InterruptBehavior enum, since its prior purpose is entirely superseded by priorities. I also think we should remove withInterruptBehavior and replace it with onInterrupt overloads.

So, it'd look like:

run(...).onInterrupt(InterruptBehavior.kSuspend);
run(...).onInterrupt(() -> { ... }, InterruptBehavior.kSuspend);

Suspended commands will be re-scheduled as soon as their requirements become available. The implementation that allows this could also allow us to eventually support queued commands.

@SamCarlberg
Copy link
Member Author

I think we should re-purpose the InterruptBehavior enum, since its prior purpose is entirely superseded by priorities. I also think we should remove withInterruptBehavior and replace it with onInterrupt overloads.

Reminder that commands are interrupted by interrupting their vthread, which triggers an InterruptedException in any blocking calls in the command (Thread.sleep, Object.wait, Future.get, Lock.lock, etc), and immediately exits the run method call. I don't think we can get access to the underlying continuation layer to have it keep a vthread on ice.

Maybe we could have a pause() method on the scheduler that does a busywait? This would keep it paused internally as long as higher-priority commands are running, then automatically pick back up when it becomes the highest-priority command remaining.

This is not a complete and correct implementation, but for the sake of example:

// AsyncScheduler.java
public void pause(AsyncCommand command, Measure<Time> time) {
  long ms = (long) time.in(Milliseconds);
  boolean suspendOnInterrupt = command.interruptBehavior() == kSuspend;

  do {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      if (suspendOnInterrupt)
        continue;
      throw e;
    }
  } while (command.priority() < getHighestPriorityCommandUsingAnyOf(command.requirements()).priority());
}

@TheTripleV
Copy link
Member

This is very similar syntax to my python coroutine command impl.

encoders.reset()
while getDistance() < distance:
  yield
  tankDrive(1, 1)
stop()

It does active cooperation with yield.

I've been mulling over doing yield based coroutines vs async/await based coroutines throughout wpilib and I think there needs to be more exploring do on which is better long-term (idk if java has async/await keywords incoming but it would be nice if the python/java apis were similar).

@SamCarlberg
Copy link
Member Author

Java isn't getting async/await keywords, the best we have is Futures (more or less analogous to JavaScript promises). That's error prone since it's totally valid to call a method that returns a Future and then not await its result. A yield-based API seems less prone to unintended behavior

Track all running nested commands, not just top-level

Allows checking for any running command and awaiting any nested command
Incomplete implementation; breaks when a command interrupts a suspendable command of the same priority

Add a standard unit for nanoseconds to support sub-millisecond pause precision
Code is more accessible when using methods directly instead of commands

Significantly less complexity and risk of ordering issues

Slightly risker since requirements need to be handled manually
Async command errors tend to be nested 3 or 4 deep
int ns = (int) (time.in(Nanoseconds) % 1e6);

// Always sleep once
Thread.sleep(ms, ns);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be waiting for 20ms - execution time so it resumes at a 20ms clock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Though the underlying thread jitter tends to make this moot, where the time spent parked can be off by several milliseconds. Worst I've seen is a request for 20ms actually sleep for 36... That might get improved with optimizations to reduce the number of locking calls, or maybe it just ran afoul of a stop-the-world GC pass

Because continuations are hidden in an internal JDK package, we need to wrap them in our own classes. Requires force-opening the JDK modules to our own (unnamed) module.
@SamCarlberg
Copy link
Member Author

Virtual threads ended up being too prone to deadlocking and inconsistent timing, since it's at the mercy of the JVM's thread scheduler for when parked vthreads are resumed. I was consistently seeing sleep times of 20ms end up taking ~25ms, with the worst outlier at ~38ms.

I've pushed changes using Java's Continuation class that's used internally for pause/resume funcionality of virtual threads. Unfortunately, it's not a public API, so I need to do some build hacks with --add-opens to make it accessible. It also means Thread.sleep is back to being a bad decision, since it'll sleep the main execution thread, which also means no more per-command loop frequencies. But it otherwise allows for much simpler and less error-prone schedule code. Commands just need to use AsyncCommand.yield() instead of the old AsyncCommand.pause(); timing can be done with a timer, or use the withTimeout decorator

AsyncCommand cmd = someResource.run(() -> {
  var timer = new Timer();
  timer.start();
  while (!timer.hasElapsed(5.000)) {
    AsyncCommand.yield();
    doSomething();
  }
}).named("Demo");

One potential issue here is that there's no way to run cleanup logic on command interrupts. Continuations will just be kicked out of the scheduler if their command is canceled.

Rethrow runtime exceptions thrown by commands instead of wrapping them

Add some documentation
Allows command groups and compositions to reuse logic
Reframe default commands as suspendable minimum-priority commands
Ther permits the use of commands to control a subsystem comprised of two separate mechanisms that need to be controlled together (eg a pink arm) that would traditionally need to be controlled without using commands to avoid scenarios in which one mechanism is controlled and the other is not
Scheduler now pulls the default command from resources, instead of keeping track internally
Bump Java to 21.0.2 due to compiler bug in 21.0.1
Move examples to "coroutine" package instead of "async"
Helpful for timed sequences without needing to wrap async methods in a command
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants