
Investigate and fix bad I/O performance #558

Open

JohannesLichtenberger opened this issue Nov 16, 2022 · 28 comments

@JohannesLichtenberger
Member

Write speed is only about 10 MB/s, which is rather low and doesn't saturate modern SSDs at all (not even HDDs). SirixDB should be designed to take advantage of modern flash-drive characteristics.

@JohannesLichtenberger
Member Author

The main issue can be seen, for instance, in JsonShredderTest::testChicago with the following file: https://www.file-upload.net/download-15038251/cityofchicago.zip.html

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Nov 16, 2022

A small part of the vmstat 1 output while running the test via Gradle in IntelliJ (note the bursty bo column: roughly 32 MiB is written out every few seconds, with near-idle stretches in between):

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0  14592 7167968 1042708 10715320    0    0     0     0 12989 59878 53  4 43  0  0
 1  0  14592 6860564 1042728 10717444    0    0     8 33008 2955 7607 20  3 77  1  0
 1  0  14592 6656196 1042736 10717436    0    0     0    52 1951 2653 14  2 84  0  0
 2  0  14592 6608824 1042736 10737152    0    0     0     0 11316 45129 49  3 48  0  0
 1  0  14592 6624540 1042752 10750372    0    0     0 33456 3326 19335 21  2 76  1  0
 1  0  14592 6650496 1042752 10750308    0    0     0    80 2329 4072 24  1 76  0  0
 4  0  14592 6613548 1042752 10761696    0    0     0   196 9845 32601 62  2 36  0  0
 1  0  14592 6669560 1042768 10783128    0    0     0 33044 4930 29539 38  3 58  0  0
 1  0  14592 6675864 1042768 10783128    0    0     0     0 3311 6018 29  1 70  0  0
 9  0  14592 6215208 1042768 10783128    0    0     0    12 4164 6709 19  2 79  0  0
 2  0  14592 6142636 1042788 10816328    0    0     0 33072 11459 60453 36  3 60  1  0
 1  0  14592 6185552 1042788 10816328    0    0     0    12 2854 4942 18  1 80  0  0
11  0  14592 6183788 1042796 10816328    0    0     0   220 4007 6843 33  1 66  0  0
 2  0  14592 6092816 1042812 10849172    0    0     0 32968 11970 60387 40  3 56  1  0

@JohannesLichtenberger changed the title from "Investivate and fix bad I/O performance" to "Investigate and fix bad I/O performance" on Nov 16, 2022
@JohannesLichtenberger
Member Author

@ikorennoy here's the bug regarding the bad I/O performance.

@JohannesLichtenberger
Member Author

[Screenshot from 2022-11-16 20-45-14]
[Screenshot from 2022-11-16 17-38-09]

@JohannesLichtenberger
Member Author

YourKit Snapshots: gradle-worker-classpath.zip

@JohannesLichtenberger
Member Author

[Screenshot from 2022-11-16 20-49-58]

@JohannesLichtenberger
Member Author

[Screenshot from 2022-11-16 20-51-24]

@JohannesLichtenberger
Member Author

[Screenshot from 2022-11-16 23-02-01]

[Screenshot from 2022-11-16 23-10-48]

I believe Gson is already slow at parsing the input file. Maybe it's the FileReader, as it's the only place that uses a FileInputStream!?

  /**
   * Create a new {@link JsonReader} instance on a file.
   *
   * @param path the path to the file
   * @return a {@link JsonReader} instance
   */
  public static JsonReader createFileReader(final Path path) {
    checkNotNull(path);

    try {
      final var fileReader = new FileReader(path.toFile());
      final var jsonReader = new JsonReader(fileReader);
      jsonReader.setLenient(true);
      return jsonReader;
    } catch (final FileNotFoundException e) {
      throw new UncheckedIOException(e);
    }
  }
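If the unbuffered FileReader turns out to be the culprit, one candidate fix is to hand Gson a BufferedReader with an explicit charset. A minimal sketch (not measured, so treat it as an assumption rather than a confirmed win):

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import com.google.gson.stream.JsonReader;

public final class BufferedJsonReaders {

  /**
   * Buffered variant of createFileReader: Files.newBufferedReader returns a
   * BufferedReader (8 KiB default buffer) with an explicit charset, whereas
   * FileReader issues reads using the platform-default charset with no
   * explicit buffering layer.
   */
  public static JsonReader createFileReader(final Path path) {
    try {
      final var bufferedReader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
      final var jsonReader = new JsonReader(bufferedReader);
      jsonReader.setLenient(true);
      return jsonReader;
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

Note that JsonReader also keeps its own internal char buffer, so the gain from buffering the underlying Reader may be modest; it needs measuring.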

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Nov 16, 2022

It might only be slow because of the Sirix code inserting nodes...

  @Test
  public void testGson() throws IOException {
    final var jsonPath = JSON.resolve("cityofchicago.json");
    final var reader = JsonShredder.createFileReader(jsonPath);

    // Iterate over all nodes.
    while (reader.peek() != JsonToken.END_DOCUMENT) {
      final var nextToken = reader.peek();

      switch (nextToken) {
        case BEGIN_OBJECT -> reader.beginObject();
        case NAME -> reader.nextName();
        case END_OBJECT -> reader.endObject();
        case BEGIN_ARRAY -> reader.beginArray();
        case END_ARRAY -> reader.endArray();
        case STRING, NUMBER -> reader.nextString();
        case BOOLEAN -> reader.nextBoolean();
        case NULL -> reader.nextNull();
        default -> {
        }
      }
    }
  }

This reads and parses the file in approx. 20 seconds on my machine, i.e. about 190 MB/s for the 3.8 GB file, which is okay (not particularly fast, but still), I guess.

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Nov 21, 2022

We've made some progress here (traversing the Chicago file in preorder now takes around 1:20 min on my machine), and depending on whether a path summary is created, insertion time is approximately 2:20-2:50 min. There may still be potential for improvement, though. Reading in preorder took around 2:30 min before, and import took around 4:00-4:30 min on my machine.

We should also investigate the performance of the coroutine-based descendant axis (which prefetches right siblings during down-traversal in preorder).
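To make the idea concrete, here is a plain-Java sketch of a preorder traversal that prefetches right siblings (the real axis is coroutine-based; NodeCursor and warmPage() are hypothetical stand-ins, not SirixDB's actual API):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Stand-in for a read-only transaction/cursor; names are hypothetical.
interface NodeCursor {
  void moveTo(long nodeKey);
  long firstChildKey();   // -1 if none
  long rightSiblingKey(); // -1 if none
}

final class PrefetchingPreorderTraversal {
  private final NodeCursor cursor;
  private final ExecutorService prefetcher = Executors.newSingleThreadExecutor();

  PrefetchingPreorderTraversal(final NodeCursor cursor) {
    this.cursor = cursor;
  }

  void traverse(final long startKey) {
    final Deque<Long> pending = new ArrayDeque<>();
    long key = startKey;
    while (key != -1) {
      cursor.moveTo(key);
      final long sibling = cursor.rightSiblingKey();
      if (sibling != -1) {
        // While we descend into the first child, warm the page holding the
        // right sibling in the background so it's cached when we return.
        CompletableFuture.runAsync(() -> warmPage(sibling), prefetcher);
        pending.push(sibling);
      }
      final long child = cursor.firstChildKey();
      key = (child != -1) ? child : (pending.isEmpty() ? -1 : pending.pop());
    }
    prefetcher.shutdown();
  }

  private void warmPage(final long nodeKey) {
    // Hypothetical: would resolve the node's page and load it into the
    // page cache; elided here.
  }
}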

@JohannesLichtenberger
Member Author

A JFR recording with async-profiler (IntelliJ Ultimate), inserting the Chicago file 5 times in a row (agent options: event=cpu,alloc,lock,cache-misses,interval=1ms,jfrsync=profile):

https://www.file-upload.net/download-15045950/JsonShredderTest_testChicago_2022_11_23_211121.tar.xz.html

@JohannesLichtenberger
Member Author

A sweet spot seems to be to flush a new revision to disk every 262_144 << 3 (= 2,097,152) nodes. Currently we're at approx. 2:35 min on average for the import with a path summary, over several runs.
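In other words (a hypothetical sketch; Node, NodeTrx, insert and commit stand in for SirixDB's actual transaction API):

// Commit a revision to disk after every 262_144 << 3 inserted nodes.
interface Node { }

interface NodeTrx {
  void insert(Node node);
  void commit();
}

final class AutoCommittingInserter {
  private static final long COMMIT_THRESHOLD = 262_144L << 3; // 2,097,152 nodes

  private final NodeTrx wtx;
  private long nodesSinceCommit;

  AutoCommittingInserter(final NodeTrx wtx) {
    this.wtx = wtx;
  }

  void insert(final Node node) {
    wtx.insert(node);
    if (++nodesSinceCommit >= COMMIT_THRESHOLD) {
      wtx.commit(); // flush a new revision to disk
      nodesSinceCommit = 0;
    }
  }
}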

@JohannesLichtenberger
Member Author

With wall clock time enabled (compression seems to be a huge factor):
JsonShredderTest_testChicago_2023_01_03_170949.html.tar.gz

@JohannesLichtenberger
Member Author

We currently face a memory leak...

@Kiriakos1998

Hello @JohannesLichtenberger, I would like to help. Is the issue still as the title describes, or has the performance been fixed and the memory leak is now the issue?

@JohannesLichtenberger
Member Author

I think the main issue was that the page-cache max sizes were too high, so the object allocation rate was higher than the GC could clean up. Especially with ZGC, as it's not generational yet, I somehow had bad pause times... Shenandoah was better, but it's also not generational yet.

What's strange is that the io_uring implementation is slower than the FileChannel-based version. Maybe the event loop is an issue somehow...
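On the first point, the remedy is essentially to bound the page caches harder. A minimal sketch with Caffeine, assuming the page caches are Caffeine-based (the key/value types and the concrete bound here are placeholders):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

final class PageCacheFactory {
  // Placeholder types: a real page cache would map page references to pages.
  static Cache<Long, byte[]> createRecordPageCache() {
    return Caffeine.newBuilder()
                   // Keep the bound small enough that the allocation rate of
                   // cached pages doesn't outrun a non-generational GC.
                   .maximumSize(5_000)
                   .build();
  }
}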

@JohannesLichtenberger
Member Author

@Kiriakos1998, would you like to work on this?

#536

Also, this might be interesting: #609

We'd also need JMH tests to better measure performance regarding JSON data (import/traverse/query)...
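For example, a traversal benchmark skeleton could look like this (a sketch: the JsonShredder import and resource setup are elided, and the file location is an assumption):

import java.io.IOException;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

import com.google.gson.stream.JsonToken;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.SECONDS)
@Warmup(iterations = 2)
@Measurement(iterations = 5)
@Fork(1)
public class JsonTraverseBenchmark {
  // Assumed location of the 3.8 GB test file.
  private final Path jsonPath =
      Path.of("src", "test", "resources", "json", "cityofchicago.json");

  @Benchmark
  public void traverseAllTokens() throws IOException {
    try (final var reader = JsonShredder.createFileReader(jsonPath)) {
      while (reader.peek() != JsonToken.END_DOCUMENT) {
        switch (reader.peek()) {
          case BEGIN_OBJECT -> reader.beginObject();
          case NAME -> reader.nextName();
          case END_OBJECT -> reader.endObject();
          case BEGIN_ARRAY -> reader.beginArray();
          case END_ARRAY -> reader.endArray();
          case STRING, NUMBER -> reader.nextString();
          case BOOLEAN -> reader.nextBoolean();
          case NULL -> reader.nextNull();
          default -> { }
        }
      }
    }
  }
}

Analogous benchmarks for import and query would exercise the actual shredding/query paths instead of raw token traversal.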

@Kiriakos1998

Sure I can take a look.

@JohannesLichtenberger
Member Author

@Kiriakos1998 have you already had some time to look into the issue(s)?

@Kiriakos1998

@Kiriakos1998 have you already had some time to look into the issue(s)?

Yes. Although I managed to find the index of the iteration (3 or 4) that skyrockets execution time, I was not able to find the exact line of code, which is a little bit strange. I can provide more details if needed.

@Kiriakos1998

@JohannesLichtenberger To be more specific: as I already mentioned, one iteration skyrockets the execution time, meaning that this specific iteration is responsible for more than 95% of the execution time. But when I try to isolate the execution time of each line of code in the loop, none of them reports any significant execution time.

@JohannesLichtenberger
Member Author

Maybe it's the point where reading from the producer/channel kicks in!?

@JohannesLichtenberger
Member Author

@Kiriakos1998 would you also like to have a look at the io_uring backend?

@Kiriakos1998

@Kiriakos1998 would you also like to have a look at the io_uring backend?

Sure. For this I need a Linux environment, correct?

@JohannesLichtenberger
Member Author

Yes, correct. Then you can try to run JsonShredderTest::testChicago, for instance; you'll need the 3.8 GB test file and have to copy it into the right place in src/test/resources/json, I think.

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Jun 8, 2023

You may then use a debugger. For IntelliJ there's a plugin from Johannes Bechberger. If you'd like to keep working on SirixDB, I think I have one or two licenses left for IntelliJ Ultimate and the other JetBrains tools (but they should only be used for SirixDB/Brackit, as we got the licenses for free).

@Kiriakos1998

I have IntelliJ Ultimate, but I think it will be hard for me to work on this issue. I can create a virtual machine for a Linux environment (my PC runs Windows), but I think I won't be able to install IntelliJ there, at least not the Ultimate version. Do you think it's absolutely necessary to investigate this issue?

@JohannesLichtenberger
Member Author

I may take a look myself again next week :)
