Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException while unregistering plugin in onDisable #494

Open
opl- opened this issue Oct 6, 2023 · 10 comments · May be fixed by #501
Open

ConcurrentModificationException while unregistering plugin in onDisable #494

opl- opened this issue Oct 6, 2023 · 10 comments · May be fixed by #501
Assignees
Labels
bug Something isn't working

Comments

@opl-
Copy link

opl- commented Oct 6, 2023

CommandAPI version

9.2.0

Minecraft version

1.19.4

Running Paper git-Paper-524 (MC: 1.19.4) (Git: 9846d0d)

Are you shading the CommandAPI?

Yes

What I did

Using the below test plugin which shades/shadows CommandAPI, and a plugin which allows unloading individual plugins (such as PlugmanX or ServerUtils), start the server as normal, then unload our test plugin.

@Override
public void onLoad() {
    CommandAPI.onLoad(
        CommandAPIBukkitConfig(this)
            .silentLogs(false)
            .verboseOutput(true)
    );
}

@Override
public void onEnable() {
    CommandAPI.onEnable();

    new CommandAPICommand("stuff")
        // This is where the command would go if I could find a case where it reliably breaks.
        .register();
}

@Override
public void onDisable() {
    CommandAPI.unregister("stuff");
    CommandAPI.onDisable();
}

What actually happened

The command was unregistered, but doing it left the following stack trace in the server logs:

[22:54:00 INFO]: [CommandAPI] Unregistering command /stuff
[22:54:00 ERROR]: Caught previously unhandled exception :
[22:54:00 ERROR]: Paper Async Command Builder Thread Pool - 0
java.util.ConcurrentModificationException: null
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(Unknown Source) ~[?:?]
        at java.util.TreeMap$ValueIterator.next(Unknown Source) ~[?:?]
        at net.minecraft.commands.Commands.fillUsableCommands(Commands.java:475) ~[?:?]
        at net.minecraft.commands.Commands.sendAsync(Commands.java:441) ~[?:?]
        at net.minecraft.commands.Commands.lambda$sendCommands$5(Commands.java:414) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
        at java.lang.Thread.run(Unknown Source) ~[?:?]

Unfortunately, I've been unable to reliably reproduce it once I started. Maybe it's a race condition? That'd be fun!

What should have happened

Command /stuff should be unregistered without an exception.

Server logs and CommandAPI config

No response

Other

No response

@opl- opl- added the bug Something isn't working label Oct 6, 2023
@willkroboth willkroboth self-assigned this Oct 6, 2023
@willkroboth
Copy link
Collaborator

Hm, yeah, I'm having trouble reproducing this.

I was wondering, did you have any other plugins on the server when you first encountered this issue? Maybe there's another factor at play here. If you still have a full server log showing the issue, that could also be useful.

@opl-
Copy link
Author

opl- commented Oct 11, 2023

I suspect this is indeed a race condition after Jorel pointed out that Paper offloads constructing the updated command tree to send to players off the main thread, meaning the ideal conditions for a reproduction would probably be repeatedly registering and unregistering a somewhat complex command.

There are indeed other plugins installed, which once again might be making it easier to reproduce since the other thread will take longer to finish constructing the command tree.

@opl-
Copy link
Author

opl- commented Oct 13, 2023

I decided to mess around with it a little more and discovered that I can very reliably trigger it through some weird means.

I removed all plugins other than:

  • PlugmanX,
  • the two plugins which use the shaded CommandAPI; the game plugin depends on the framework plugin,
  • and FastAsyncWorldEdit (which the framework plugin depends on).

Both of my plugins register and unregister commands with the CommandAPI instance shaded into the framework plugin.

In fact, I managed to boil it down to commands as simple as this:

commandTree("test1") {
	anyExecutor { sender, _ ->
		sender.sendMessage("a")
	}
}

I can reliably reproduce this issue by first unloading the game plugin (the dependent), then the framework. I once waited 20 seconds between unloading the game plugin and the framework plugin: the exception was still thrown.

What's even more interesting, the issue doesn't happen if the framework plugin doesn't register a command.

@willkroboth
Copy link
Collaborator

Are you able to share a server log or the plugins you are using? That could help me replicate it to speed up investigation. If those are private, maybe you could make a simple replication plugin?

The 20-second thing and the simple tree make it seem this isn't a race condition. Probably some faulty logic on Paper's side, or maybe something weird with using a version of the CommandAPI shaded in another plugin?

@opl-
Copy link
Author

opl- commented Oct 15, 2023

Nothing useful in the server logs other than what I already mentioned. Could get you one with verboseOutput enabled, but that's about it.

Plugins are indeed private, and I unfortunately don't have much time this week.

@willkroboth
Copy link
Collaborator

I was able to replicate this 🎉

I had a master plugin that shades the CommandAPI:

public class Main extends JavaPlugin {
    @Override
    public void onLoad() {
        CommandAPI.onLoad(
                new CommandAPIBukkitConfig(this)
                        .dispatcherFile(new File(getDataFolder(), "commands.json"))
                        .verboseOutput(true)
        );
    }

    @Override
    public void onEnable() {
        CommandAPI.onEnable();

        new CommandAPICommand("host")
                .executes((sender, args) -> {
                    sender.sendMessage("This is the host command");
                })
                .register();
    }

    @Override
    public void onDisable() {
        CommandAPI.unregister("host", true);
        CommandAPI.onDisable();
    }
}

CommandAPITest-1.0-SNAPSHOT.jar.zip

And a SubPlugin that uses the CommandAPI via the main plugin:

public final class SubPlugin extends JavaPlugin {

    @Override
    public void onEnable() {
        new CommandAPICommand("sub")
                .executes((sender, args) -> {
                    sender.sendMessage("This is the sub command");
                })
                .register();
    }

    @Override
    public void onDisable() {
        CommandAPI.unregister("sub", true);
    }
}

SubPlugin-1.0-SNAPSHOT.jar.zip

These plugins can be built using this maven project: CommandAPITest.zip.

To replicate:

  • Add CommandAPITest, SubPlugin, and PlugmanX to the server
  • Get a player on the server
  • Unload SubPlugin
  • Unload CommandAPITest

The exception will be thrown:

java.util.ConcurrentModificationException: null
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1486) ~[?:?]
        at java.util.TreeMap$ValueIterator.next(TreeMap.java:1531) ~[?:?]
        at net.minecraft.commands.Commands.fillUsableCommands(Commands.java:475) ~[?:?]
        at net.minecraft.commands.Commands.sendAsync(Commands.java:435) ~[?:?]
        at net.minecraft.commands.Commands.lambda$sendCommands$5(Commands.java:414) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]

Here's my server log for reference: latest.log.

I should be able to run a debugger now and get a better idea of what's going on here.

@willkroboth
Copy link
Collaborator

Okay, I think I figured out what's happening. It's a bit tricky to see since tasks are being run at the same time, but I added some extra debugging messages and got this log:

> plugman unload CommandAPITest
[13:26:13 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Creating command build task for literal{Player495}
[13:26:13 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Creating command build task for literal{Player495}
[13:26:13 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Running command build task for literal{Player495}
[13:26:13 INFO]: [CommandAPITest] Disabling CommandAPITest v1.0.0
[13:26:13 INFO]: [CommandAPI] Unregistering command /host
[13:26:13 INFO]: [CommandAPI] Removing host from vanilla commands
[13:26:13 ERROR]: Caught previously unhandled exception :
[13:26:13 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Running command build task for literal{Player495}
[13:26:13 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Creating command send task for literal{Player495}
[13:26:13 ERROR]: Paper Async Command Builder Thread Pool - 0
java.util.ConcurrentModificationException: null
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1486) ~[?:?]
        at java.util.TreeMap$ValueIterator.next(TreeMap.java:1531) ~[?:?]
        at net.minecraft.commands.Commands.fillUsableCommands(Commands.java:488) ~[?:?]
        at net.minecraft.commands.Commands.sendAsync(Commands.java:445) ~[?:?]
        at net.minecraft.commands.Commands.lambda$sendCommands$5(Commands.java:423) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]
[13:26:14 INFO]: [CommandAPI] Removing host from Bukkit map
[13:26:14 INFO]: [CommandAPI] Removing host from resources dispatcher
[13:26:14 INFO]: [CommandAPI] Informing Player495 that host has been removed
[13:26:14 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Creating command build task for literal{Player495}
[13:26:14 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Running command build task for literal{Player495}
[13:26:14 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Creating command send task for literal{Player495}
[13:26:15 INFO]: [PlugMan] CommandAPITest has been unloaded.
[13:26:15 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Sending commands to literal{Player495}
[13:26:15 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Sending commands packet to literal{Player495}
[13:26:15 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Sending commands to literal{Player495}
[13:26:15 INFO]: [STDOUT] [net.minecraft.commands.CommandDispatcher] Sending commands packet to literal{Player495}

Plugman is pretty smart and tries to unload commands itself before disabling a plugin. So, immediately after running the unload command but before CommandAPITest is unloaded, Plugman unloads commands and tells Paper to resend commands (For some reason it creates 2 Command build tasks ¯\(ツ)/¯). So, in another thread, Paper starts running command build task.

Then, Plugman unloads CommandAPITest. CommandAPITest unregisters its command, modifying the vanillaCommandDispatcher. However, Paper is currently using that to build its packet, so you get a ConcurrentModificationException.


While testing this, I noticed that it's a bit random whether or not unloading CommandAPITest actually triggers the exception. On occasion, unloading SubPlugin can also throw the exception. I guess it just really depends on what Paper's Async Command Builder Thread Pool is up to at the moment. It's weird that I couldn't reproduce this until trying the 2-plugin setup, but whatever.

Good news, I don't think this exception has any bad side effects. The CommandAPI's final command update should always succeed, so the client should see the correct set of commands. The server's command tree also shouldn't get messed up; there's just a problem reading it sometimes. So, it should be fine to keep running the server if you see this error.

This problem is an unhappy accident between Paper, Plugman, and the CommandAPI, so it's really everyone's fault. It might be worth telling Paper about this, but I'd hold off until it can be replicated using just their API methods. I imagine it might be possible that this is really Paper's fault for being thread-unsafe, but both Plugman and the CommandAPI use reflection to unregister commands, so messy behavior is expected.

The CommandAPI should be able to fix this bug by making sure to not unregister commands while Paper's Async Command Builder Thread Pool is doing its stuff. I'm not exactly sure how to make that happen, but it should be possible.

@willkroboth
Copy link
Collaborator

I reproduced this with a single plugin that registers commands:

public class Main extends JavaPlugin implements Listener {
    @Override
    public void onLoad() {
        CommandAPI.onLoad(
                new CommandAPIBukkitConfig(this)
                        .dispatcherFile(new File(getDataFolder(), "commands.json"))
                        .verboseOutput(true)
        );
    }

    @Override
    public void onEnable() {
        CommandAPI.onEnable();

        getServer().getPluginManager().registerEvents(this, this);
    }

    @EventHandler
    public void onPlayerJoin(PlayerJoinEvent event) {
        Player player = event.getPlayer();

        for (int i = 0; i < 100; i++) {
            player.updateCommands();
        }

        for (int i = 0; i < 100; i++) {
            int finalI = i;
            new CommandAPICommand("command" + i)
                    .executes(info -> {
                        info.sender().sendMessage("You ran command #" + finalI);
                    })
                    .register();
        }
    }

    @Override
    public void onDisable() {
        CommandAPI.onDisable();
    }
}

I just absolutely spammed Player#updateCommands and CommandAPICommand#register to make it more likely to error. Here's my log for this.

So, yeah. This doesn't seem to be a problem with multiple plugins or unregistering specifically. Just any time the CommandAPI modifies the commands, it is possible that Paper might be building a Commands packet at the same time, causing a ConcurrentModificationException. You don't need unloading with Plugman or unregistering, which are less stable actions.

Unfortunately, I don't think it is possible to reproduce this with just Paper API, so this is a fully CommandAPI problem. It might be possible with PaperMC/Paper#8235, which I think exposes the Brigadier tree, but that's not really helpful here.

@opl-
Copy link
Author

opl- commented Oct 18, 2023

Great job working out the details!

I tried to reproduce this issue using a method similar to what you're doing in your second comment, but couldn't, even with thousands of commands. I'm guessing it's because i was unregistering the created commands in onDisable? That's the main difference between your attempt and mine. (Edit: I just noticed you're also calling player.updateCommands() which I didn't even think to do.)

For a dirty workaround, you could block until the thread is done by joining to it, assuming it exits at some point. I'll try doing that in my onDisable - hopefully it won't result in a deadlock. It's probably not a great solution since it could just cause even more issues down the line. (I'll attempt this one later for a short-term workaround.)

Alternatively, you might be able to use reflection to replace whatever instance is being iterated over with a thread-safe equivalent, for example by wrapping the existing instance with Collections.synchronizedMap or just fully replacing it with whatever type won't make the JVM complain.

At the end of the day, I think this probably counts as a Paper bug, unless they're already ensuring thread-safety at a slightly higher level and this behavior isn't a poor design, but rather an intended design.

@willkroboth
Copy link
Collaborator

Yeah, player.updateCommands() is important because that triggers Paper to start its async task. Any time the commands are updated (by using the CommandAPI to register or unregister a command) if Paper happens to be looping through the commands at the same time, a ConcurrentModificationException can occur.

I currently think this is not Paper's fault, since they don't provide any API to update the Brigadier commands. The CommandAPI has to use reflection and NMS code, so it's our fault for 'poking our nose' where we shouldn't.

I think blocking the thread until Paper is done building commands is the best solution. The 'collection' being iterated over is a Brigadier RootCommandNode and all its children, so it'd be really hard to ensure all the nodes are using thread-safe collections.

Unfortunately, I don't know how to monitor Paper's command building to block the thread. Paper is using a ThreadPoolExecutor to manage the command-building tasks:

public static final java.util.concurrent.ThreadPoolExecutor COMMAND_SENDING_POOL = new java.util.concurrent.ThreadPoolExecutor(
    0, 2, 60L, java.util.concurrent.TimeUnit.SECONDS,
    new java.util.concurrent.LinkedBlockingQueue<>(),
    new com.google.common.util.concurrent.ThreadFactoryBuilder()
        .setNameFormat("Paper Async Command Builder Thread Pool - %1$d")
        .setUncaughtExceptionHandler(new net.minecraft.DefaultUncaughtExceptionHandlerWithName(net.minecraft.server.MinecraftServer.LOGGER))
        .build(),
    new java.util.concurrent.ThreadPoolExecutor.DiscardPolicy()
);

Ideally, we'd want to wait until there are not any jobs being processed to do our registration/unregistration. It should be easy enough to use reflection/NMS to access this to monitor its state. However, I tried looking through all methods for ThreadPoolExecutor, and I couldn't find any way to do this... I am very much not familiar with multithreading though, so there could be an easy way to do this ¯\(ツ)/¯.

willkroboth added a commit that referenced this issue Dec 19, 2023
Use some sneaky reflection to intercept `net.minecraft.server.CommandDispatcher`'s task scheduling and enforce our own read-write access

Updates #501
Fixes #494 and #503
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants