LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader #179

gautamworah96 · 2021-06-10T23:47:21Z

Description

In LUCENE-9450 we switched the Taxonomy index from Stored Fields to BinaryDocValues. In the resulting implementation of the getPath code, we create a new BinaryDocValues's values instance for each ordinal.
It may happen that we may traverse over the same nodes over and over again if the getPath API is called multiple times for ordinals in the same segment/with the same readerIndex.

This PR takes advantage of that fact by sorting ordinals and then trying to find out if some of the ordinals are present in the same segment/have the same readerIndex (by trying to advanceExact to the correct position and not failing) thereby allowing us to reuse the previous BinaryDocValues object.

This PR was originally proposed in the older lucene-solr repo here. I discontinued it because we could not find a good "real" use case for it.

This new PR uses the getBulkPath API inside IntTaxonomyFacetCounts and FloatTaxonomyFacetCounts classes. In the getTopChildren method within these classes, we usually collect all the top-n ordinals first and then translate them into FacetLabels one by one using an iterative API. This call has been replaced with the getBulkPath method in this PR.

Solution

Steps:

Sort all ordinals and remember their position so as to store the path in the correct position
Try to advanceExact to the correct position with the previously calculated readerIndex. If the operation fails, try to find the correct segment for the ordinal and then advanceExact to the desired position.
Store this position for future ordinals.

Tests

Added a new randomized test for the API that compares the individual getPath results from ordinals with the bulk FacetLabels returned by the getBulkPath API
Added a backwards compatibility test for the API that reads an older StoredFields based index

Benchmarks:

The stock luceneutil taxonomy tasks call getTopChildren(10, ..) for testing the performance of taxonomy facet counting classes. Since FastTaxonomyFacetCounts inherits from IntTaxonomyFacetCounts (and uses the same getTopChildren logic) the user should be able to reap the benefits of the bulk API directly.

luceneutil that tests the top 10 children:

                   TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                  Fuzzy1       44.07      (6.0%)       42.44      (7.2%)   -3.7% ( -16% -   10%) 0.078
       HighTermMonthSort       22.69     (12.1%)       21.99     (11.3%)   -3.1% ( -23% -   23%) 0.402
    HighTermTitleBDVSort       19.59     (12.4%)       18.99     (12.4%)   -3.1% ( -24% -   24%) 0.434
   HighTermDayOfYearSort       54.33     (15.7%)       53.11     (16.3%)   -2.3% ( -29% -   35%) 0.656
              TermDTSort       15.58      (9.1%)       15.24     (10.9%)   -2.2% ( -20% -   19%) 0.488
                HighTerm     1010.56      (4.0%)      988.75      (5.2%)   -2.2% ( -10% -    7%) 0.142
           OrHighNotHigh      422.64      (4.4%)      415.30      (5.6%)   -1.7% ( -11% -    8%) 0.274
                 MedTerm      734.85      (5.5%)      722.45      (4.8%)   -1.7% ( -11% -    9%) 0.301
                  Fuzzy2       14.51      (3.6%)       14.39      (2.9%)   -0.9% (  -7% -    5%) 0.399
             AndHighHigh       18.22      (3.4%)       18.07      (2.6%)   -0.8% (  -6% -    5%) 0.388
              AndHighLow      181.77      (1.7%)      180.51      (2.3%)   -0.7% (  -4% -    3%) 0.277
                 Respell       30.50      (2.0%)       30.36      (1.7%)   -0.5% (  -4% -    3%) 0.438
            OrNotHighLow      347.94      (4.7%)      346.46      (3.6%)   -0.4% (  -8% -    8%) 0.747
             LowSpanNear        3.83      (1.9%)        3.82      (2.2%)   -0.3% (  -4% -    3%) 0.626
              HighPhrase       39.95      (7.0%)       39.85      (7.3%)   -0.3% ( -13% -   15%) 0.909
            OrHighNotMed      405.04      (4.9%)      404.00      (5.4%)   -0.3% ( -10% -   10%) 0.874
               OrHighLow      110.15      (3.0%)      109.87      (3.2%)   -0.2% (  -6% -    6%) 0.800
               OrHighMed       61.11      (3.1%)       60.96      (2.9%)   -0.2% (  -6% -    5%) 0.795
         MedSloppyPhrase        5.47      (6.0%)        5.46      (3.1%)   -0.2% (  -8% -    9%) 0.890
             MedSpanNear       12.25      (1.6%)       12.24      (2.0%)   -0.1% (  -3% -    3%) 0.887
                  IntNRQ       25.39      (1.3%)       25.37      (2.0%)   -0.1% (  -3% -    3%) 0.880
              AndHighMed       64.96      (3.1%)       64.95      (2.7%)   -0.0% (  -5% -    6%) 0.977
            HighSpanNear       17.37      (2.0%)       17.38      (3.7%)    0.0% (  -5% -    5%) 0.963
         LowSloppyPhrase       55.39      (4.0%)       55.42      (2.6%)    0.1% (  -6% -    6%) 0.958
              OrHighHigh        9.85      (2.1%)        9.86      (2.9%)    0.1% (  -4% -    5%) 0.918
                Wildcard       29.06      (4.5%)       29.08      (3.4%)    0.1% (  -7% -    8%) 0.942
    HighIntervalsOrdered        7.97      (2.5%)        7.98      (3.1%)    0.1% (  -5% -    5%) 0.907
                 LowTerm      883.85      (4.4%)      884.97      (6.1%)    0.1% (  -9% -   11%) 0.940
                 Prefix3       40.49      (7.1%)       40.55      (5.2%)    0.2% ( -11% -   13%) 0.939
            OrHighNotLow      440.52      (5.1%)      441.49      (4.8%)    0.2% (  -9% -   10%) 0.888
               MedPhrase      138.15      (4.3%)      138.51      (4.7%)    0.3% (  -8% -    9%) 0.855
               LowPhrase      111.07      (2.9%)      111.40      (3.0%)    0.3% (  -5% -    6%) 0.751
            OrNotHighMed      375.26      (6.7%)      376.64      (6.9%)    0.4% ( -12% -   15%) 0.865
   BrowseMonthTaxoFacets        0.60      (5.2%)        0.60      (5.6%)    0.4% (  -9% -   11%) 0.806
        HighSloppyPhrase        2.11      (5.7%)        2.12      (3.5%)    0.5% (  -8% -   10%) 0.730
    BrowseDateTaxoFacets        0.57      (5.1%)        0.58      (5.5%)    0.5% (  -9% -   11%) 0.746
BrowseDayOfYearTaxoFacets        0.57      (5.2%)        0.58      (5.5%)    0.6% (  -9% -   11%) 0.742
           OrNotHighHigh      391.99      (5.2%)      394.52      (5.3%)    0.6% (  -9% -   11%) 0.701
                PKLookup      113.96      (3.1%)      115.11      (3.4%)    1.0% (  -5% -    7%) 0.320
   BrowseMonthSSDVFacets        2.62      (1.0%)        2.69      (1.8%)    3.0% (   0% -    5%) 0.000
BrowseDayOfYearSSDVFacets        2.46      (0.3%)        2.54      (2.7%)    3.3% (   0% -    6%) 0.000

Since, the previous results did not show any gain in the Browse*TaxoFacets tasks, I realized that maybe it was because the ordinalCache with a stock size of 4000 had all ordinals cached which led to the bulk API behaving the same as the iterative getPath call.

The next benchmark tested the getTopChildren call with 10,000 children

luceneutil that tests the top 10000 children:

                    TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                  Fuzzy2       35.18      (4.4%)       34.81      (7.4%)   -1.0% ( -12% -   11%) 0.588
       HighTermMonthSort       47.71     (12.1%)       47.28     (11.8%)   -0.9% ( -22% -   26%) 0.812
              AndHighMed       33.50      (3.4%)       33.35      (2.6%)   -0.4% (  -6% -    5%) 0.648
               OrHighMed       31.86      (2.2%)       31.75      (1.9%)   -0.3% (  -4% -    3%) 0.595
    HighIntervalsOrdered        2.59      (1.8%)        2.58      (1.4%)   -0.3% (  -3% -    2%) 0.540
               OrHighLow      175.82      (2.7%)      175.43      (3.4%)   -0.2% (  -6% -    5%) 0.816
             AndHighHigh       29.36      (3.9%)       29.33      (2.7%)   -0.1% (  -6% -    6%) 0.915
         MedSloppyPhrase       18.63      (2.0%)       18.61      (2.6%)   -0.1% (  -4% -    4%) 0.909
              TermDTSort       37.97     (10.0%)       37.97     (10.2%)    0.0% ( -18% -   22%) 0.997
                  Fuzzy1       24.97      (4.4%)       24.97      (4.1%)    0.0% (  -8% -    8%) 0.983
             MedSpanNear       51.30      (2.0%)       51.35      (1.6%)    0.1% (  -3% -    3%) 0.860
   HighTermDayOfYearSort       45.86     (16.5%)       45.92     (16.4%)    0.1% ( -28% -   39%) 0.980
              OrHighHigh        8.55      (2.1%)        8.56      (1.9%)    0.1% (  -3% -    4%) 0.826
            HighSpanNear        9.30      (1.7%)        9.32      (1.7%)    0.2% (  -3% -    3%) 0.766
    BrowseDateTaxoFacets        0.58      (4.7%)        0.58      (4.8%)    0.2% (  -8% -   10%) 0.915
   BrowseMonthTaxoFacets        0.60      (4.9%)        0.60      (4.8%)    0.2% (  -9% -   10%) 0.894
BrowseDayOfYearTaxoFacets        0.57      (4.7%)        0.58      (4.8%)    0.3% (  -8% -   10%) 0.861
                PKLookup      116.98      (3.6%)      117.30      (3.1%)    0.3% (  -6% -    7%) 0.794
         LowSloppyPhrase        5.58      (4.0%)        5.60      (3.9%)    0.3% (  -7% -    8%) 0.785
             LowSpanNear       11.11      (1.6%)       11.15      (1.7%)    0.4% (  -2% -    3%) 0.451
    HighTermTitleBDVSort       25.79      (7.2%)       25.89      (8.7%)    0.4% ( -14% -   17%) 0.875
                  IntNRQ       25.27      (1.5%)       25.37      (1.3%)    0.4% (  -2% -    3%) 0.347
               LowPhrase       40.67      (2.2%)       40.84      (1.4%)    0.4% (  -3% -    4%) 0.478
        HighSloppyPhrase        5.99      (4.1%)        6.01      (4.1%)    0.4% (  -7% -    8%) 0.745
                Wildcard       20.27      (3.5%)       20.41      (3.4%)    0.7% (  -5% -    7%) 0.500
                 Respell       34.15      (2.1%)       34.41      (1.4%)    0.8% (  -2% -    4%) 0.175
              AndHighLow      358.53      (3.8%)      361.84      (3.8%)    0.9% (  -6% -    8%) 0.438
            OrNotHighLow      369.77      (4.3%)      374.69      (2.7%)    1.3% (  -5% -    8%) 0.239
                 MedTerm      980.66      (4.2%)      994.40      (5.2%)    1.4% (  -7% -   11%) 0.347
           OrHighNotHigh      388.09      (4.6%)      394.10      (4.0%)    1.5% (  -6% -   10%) 0.258
            OrNotHighMed      419.20      (3.9%)      426.30      (3.9%)    1.7% (  -5% -    9%) 0.173
               MedPhrase      281.59      (4.2%)      287.03      (3.3%)    1.9% (  -5% -    9%) 0.107
                 LowTerm      758.64      (5.7%)      775.00      (4.9%)    2.2% (  -8% -   13%) 0.200
              HighPhrase       69.25      (6.5%)       70.85      (6.2%)    2.3% (  -9% -   16%) 0.250
           OrNotHighHigh      517.94      (5.7%)      530.26      (3.5%)    2.4% (  -6% -   12%) 0.111
            OrHighNotMed      438.39      (5.5%)      449.26      (3.8%)    2.5% (  -6% -   12%) 0.096
                HighTerm      650.61      (4.9%)      667.25      (4.4%)    2.6% (  -6% -   12%) 0.081
            OrHighNotLow      308.34      (4.3%)      317.07      (5.0%)    2.8% (  -6% -   12%) 0.053
                 Prefix3       88.13      (8.6%)       91.32      (8.5%)    3.6% ( -12% -   22%) 0.181
   BrowseMonthSSDVFacets        2.61      (1.0%)        2.71      (1.4%)    3.9% (   1% -    6%) 0.000
BrowseDayOfYearSSDVFacets        2.46      (0.4%)        2.56      (2.1%)    4.1% (   1% -    6%) 0.000

I surprisingly see some gain in BrowseDayOfYearSSDVFacets and BrowseMonthSSDVFacets tasks but I think that is noise (because taxonomy facets have a different impl. as compared to SortedSetDocValuesFacets).
Edit: Did an A/A test and BrowseDayOfYearSSDVFacets does not show any gain. So the BrowseDayOfYearSSDVFacets 4.1% perf improvement is unexplained..
Browse*TaxoFacets tasks don't show gain.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.

…more work

…working

mikemccand · 2021-06-27T00:55:09Z

That is really odd that the SSDVFacets show ~4% gain! I wonder if there is a bug in luceneutil mis-labeling which facets task is using which impl!? You could run a benchmark, filtering tasks to only do these SSDVFacets tasks, and then check whether the taxo facet impls are being used at all maybe.

mikemccand

I think this is close! I left a bunch of small comments. The performance is odd but I don't think that should block this improvement.

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

...e/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestDirectoryTaxonomyReader.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FloatTaxonomyFacets.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

mikemccand · 2021-06-27T00:51:26Z

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

+        /*
+        If ordinals[i] >= leafReaderMaxDoc then we find the next leaf that contains our ordinal
+         */
+        if (values == null || ordinals[i] >= leafReaderMaxDoc) {


Hmm, don't we need to compare ordinals[i] >= leafReaderMaxDoc + leafReaderDocBase instead? I.e. the ordinal is in the global IndexReader docid space, but the leafReaderMaxDoc is just for this leaf. How come tests are not failing due to this :)

This was a big miss. Thanks for catching this.

Here is why the test was passing:
When the ordinals were all in the first leaf, the ordinal was < leafReaderMaxDoc and the wrong code worked correctly.

When the ordinals started spilling into the second leaf, the wrong code would recalculate the leaf each time and then find the correct Label. Thus were not actually making use of the increasing values of the ordinals (and were recalculating the leafContext each time).
This would go on for all other leaves as well.

I'll rerun benchmarks

gautamworah96 · 2021-08-19T01:53:35Z

Hey @mikemccand , my apologies for leaving this PR hanging! The unexpected benchmark results threw me off..

I've updated the PR to fix the bug you had previously discovered and updated the PR to bring it on par with mainline.

I re-ran benchmarks after fixing the leafReaderDocBase not added to the leafReaderMaxDoc bug:

                    TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
               OrHighLow      394.22      (4.1%)      390.60      (4.7%)   -0.9% (  -9% -    8%) 0.508
           OrHighNotHigh      393.24      (5.0%)      393.11      (4.4%)   -0.0% (  -9% -    9%) 0.982
    HighIntervalsOrdered        2.23      (1.3%)        2.23      (1.2%)    0.1% (  -2% -    2%) 0.768
              HighPhrase      169.85      (4.9%)      170.09      (5.7%)    0.1% (  -9% -   11%) 0.932
    BrowseDateTaxoFacets        0.57      (1.6%)        0.57      (1.6%)    0.2% (  -2% -    3%) 0.684
            HighSpanNear        3.57      (2.2%)        3.58      (2.1%)    0.2% (  -3% -    4%) 0.711
             LowSpanNear       11.32      (1.8%)       11.35      (1.4%)    0.3% (  -2% -    3%) 0.604
            OrNotHighMed      398.53      (6.1%)      399.62      (7.5%)    0.3% ( -12% -   14%) 0.900
             MedSpanNear       11.37      (1.8%)       11.41      (1.2%)    0.3% (  -2% -    3%) 0.522
                 LowTerm      949.41      (5.7%)      952.69      (5.2%)    0.3% (  -9% -   11%) 0.841
              OrHighHigh       27.45      (2.5%)       27.55      (2.1%)    0.4% (  -4% -    5%) 0.615
                  IntNRQ       19.98     (30.1%)       20.06     (29.6%)    0.4% ( -45% -   85%) 0.967
                 Respell       24.48      (2.4%)       24.58      (2.5%)    0.4% (  -4% -    5%) 0.575
         LowSloppyPhrase       19.15      (2.6%)       19.23      (2.3%)    0.4% (  -4% -    5%) 0.569
               OrHighMed       23.02      (2.9%)       23.14      (1.8%)    0.5% (  -4% -    5%) 0.482
BrowseDayOfYearTaxoFacets     6311.77      (3.7%)     6349.39      (4.2%)    0.6% (  -7% -    8%) 0.634
                  Fuzzy1       42.80      (3.9%)       43.06      (5.0%)    0.6% (  -7% -    9%) 0.666
            OrHighNotMed      446.39      (5.5%)      449.32      (5.6%)    0.7% (  -9% -   12%) 0.707
        HighSloppyPhrase        3.57      (4.4%)        3.59      (3.8%)    0.7% (  -7% -    9%) 0.610
            OrNotHighLow      418.72      (3.6%)      421.68      (3.6%)    0.7% (  -6% -    8%) 0.535
               LowPhrase       12.47      (2.8%)       12.56      (2.8%)    0.8% (  -4% -    6%) 0.391
            OrHighNotLow      494.57      (4.9%)      498.46      (5.4%)    0.8% (  -9% -   11%) 0.629
   HighTermDayOfYearSort        7.07     (12.8%)        7.13     (10.2%)    0.9% ( -19% -   27%) 0.799
         MedSloppyPhrase       21.85      (2.8%)       22.06      (2.5%)    0.9% (  -4% -    6%) 0.266
                  Fuzzy2       35.04      (3.6%)       35.43      (3.7%)    1.1% (  -5% -    8%) 0.330
    HighTermTitleBDVSort       39.93      (8.7%)       40.40      (6.9%)    1.2% ( -13% -   18%) 0.638
                Wildcard       38.16      (4.4%)       38.60      (3.9%)    1.2% (  -6% -    9%) 0.370
              AndHighMed       47.87      (2.7%)       48.45      (3.6%)    1.2% (  -4% -    7%) 0.224
           OrNotHighHigh      367.90      (4.8%)      372.56      (3.9%)    1.3% (  -7% -   10%) 0.362
             AndHighHigh       20.63      (3.2%)       20.94      (4.6%)    1.5% (  -6% -    9%) 0.221
       HighTermMonthSort       25.15     (12.1%)       25.54     (13.8%)    1.6% ( -21% -   31%) 0.703
               MedPhrase       54.64      (6.1%)       55.54      (6.9%)    1.6% ( -10% -   15%) 0.424
              TermDTSort       16.01     (14.6%)       16.28     (16.7%)    1.7% ( -25% -   38%) 0.737
   BrowseMonthSSDVFacets        2.62      (0.9%)        2.66      (2.1%)    1.7% (  -1% -    4%) 0.001
              AndHighLow      252.16      (3.3%)      256.71      (3.3%)    1.8% (  -4% -    8%) 0.086
                HighTerm      998.70      (6.8%)     1020.66      (5.8%)    2.2% (  -9% -   15%) 0.271
   BrowseMonthTaxoFacets     6354.02      (5.9%)     6495.78      (5.4%)    2.2% (  -8% -   14%) 0.211
                PKLookup      113.56      (4.2%)      116.24      (4.4%)    2.4% (  -6% -   11%) 0.084
BrowseDayOfYearSSDVFacets        2.46      (0.5%)        2.52      (2.9%)    2.5% (   0% -    5%) 0.000
                 MedTerm      815.18      (4.9%)      837.70      (6.0%)    2.8% (  -7% -   14%) 0.109
                 Prefix3      144.90     (10.0%)      149.12     (11.5%)    2.9% ( -16% -   27%) 0.392

The gain in taxonomy based facets is about 2%. I ran it multiple times locally and it gave around 2% gain in those runs as well. SSDV facets still show a 3% gain which is unexplained.

mikemccand

Thanks for the iteration @gautamworah96 -- we are getting closer!

lucene/CHANGES.txt

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FloatTaxonomyFacets.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

...e/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestDirectoryTaxonomyReader.java

contention). Modified the test case to be super random (threads, iterations etc). Modified TaxonomyReader with a getBulkPath method that is overridden in DirTaxonomyReader. Improved documentation, improved method signatures to use the newer (...) style syntax, fixed the CHANGES entry

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

mikemccand

This looks super close! I left some small comments.

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyReader.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

...e/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestDirectoryTaxonomyReader.java

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

mikemccand

Looks great -- I left tiny polish comments -- almost done! Thanks @gautamworah96!

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

mikemccand · 2021-08-28T15:03:43Z

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java

+          /*
+          If the index is constructed with the older StoredFields it will not have any BinaryDocValues field and will return null
+           */
+          if (values == null) {


It's sort of weird to do this check per-segment, since we can know up front, based on indexed created version, whether it's stored fields or doc values? I.e. this is not a segment by segment decision. But let's do it in follow-on issue. I think this one is ready!

Sure. I'll create it after this PR is merged. Otherwise, it just feels weird to create an issue for a fix in a piece of code that has not been merged :D

lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyCombined.java

...ne/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestBackwardsCompatibility.java

mikemccand

Thanks @gautamworah96 -- looks great! I'll try to merge soon! Then we should watch nightly Lucene benchmarks and see if facets got a bit faster.

mikemccand · 2021-09-02T11:55:18Z

Merged! @gautamworah96 I think we should backport this one to 8.x? It is just adding a faster API.

Resolved Conflicts: gradle/validation/validate-source-patterns.gradle solr/core/src/java/org/apache/solr/core/CoreContainer.java

gautamworah96 · 2021-09-03T16:44:30Z

I think we should backport this one to 8.x

This change relies on the older repo having the "LUCENE-9450 Switch taxonomy index to use BinaryDocValues" change.

We had initially decided that we will not backport it to 8x because it was a major change to the index type/format. Thus this change also can't be backported :/

mikemccand · 2021-09-09T13:31:50Z

I think we should backport this one to 8.x

This change relies on the older repo having the "LUCENE-9450 Switch taxonomy index to use BinaryDocValues" change.

We had initially decided that we will not backport it to 8x because it was a major change to the index type/format. Thus this change also can't be backported :/

Ahh, yes, sorry! We cannot backport this change then. So this is 9.0 only optimization.

Gautam Worah added 4 commits June 7, 2021 10:46

lots of wip, merged in changes from upstream, test is failing, needs …

6924371

…more work

Logic working correctly, only benchmarks remaining, backwards compat …

7b44f1f

…working

Add minor comment

f628896

Merge branch 'main' of https://github.com/apache/lucene into LUCENE-9476

be741ea

gautamworah96 changed the title ~~LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyWriter~~ LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader Jun 10, 2021

mikemccand reviewed Jun 27, 2021

View reviewed changes

Gautam Worah added 5 commits July 7, 2021 17:32

Fixed the bug of not adding leafReaderDocBase to leafReaderMaxDoc

11fd008

Spotless fix

1cc34a8

small typo fix

2f4d4ba

Merge branch 'main' of https://github.com/apache/lucene into LUCENE-9476

cd552ce

Merge from main, update back compat test with latest changes

577ff89

gautamworah96 requested a review from mikemccand August 19, 2021 01:53

mikemccand reviewed Aug 20, 2021

View reviewed changes

Gautam Worah added 2 commits August 26, 2021 15:24

Remove casting from IntTaxonomyFacets and FloatTaxonomyFacets

a9b66f2

gautamworah96 requested a review from mikemccand August 26, 2021 22:59

sonatype-lift bot reviewed Aug 26, 2021

View reviewed changes

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java Show resolved Hide resolved

sonatype-lift bot reviewed Aug 26, 2021

View reviewed changes

lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java Show resolved Hide resolved

mikemccand reviewed Aug 27, 2021

View reviewed changes

Minor fixes

a257c36

gautamworah96 requested a review from mikemccand August 27, 2021 18:05

mikemccand approved these changes Aug 28, 2021

View reviewed changes

Modify CHANGES entry to reflect the small API change in getPath.

a68404f

gautamworah96 requested a review from mikemccand August 29, 2021 22:35

mikemccand approved these changes Aug 31, 2021

View reviewed changes

Merge branch 'main' of https://github.com/apache/lucene into LUCENE-9476

9b33f5b

mikemccand merged commit 1036c70 into apache:main Sep 2, 2021

gautamworah96 deleted the LUCENE-9476 branch September 7, 2021 17:57

asfimport mentioned this pull request Aug 24, 2022

getPath in DirectoryTaxonomyReader should throw an exception [LUCENE-9460] #10500

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader #179

LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader #179

gautamworah96 commented Jun 10, 2021 •

edited

mikemccand commented Jun 27, 2021

mikemccand left a comment

mikemccand Jun 27, 2021

gautamworah96 Jun 30, 2021

gautamworah96 commented Aug 19, 2021

mikemccand left a comment

mikemccand left a comment

mikemccand left a comment

mikemccand Aug 28, 2021

gautamworah96 Aug 29, 2021 •

edited

mikemccand left a comment

mikemccand commented Sep 2, 2021

gautamworah96 commented Sep 3, 2021

mikemccand commented Sep 9, 2021

LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader #179

LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader #179

Conversation

gautamworah96 commented Jun 10, 2021 • edited

Description

Solution

Tests

Checklist

mikemccand commented Jun 27, 2021

mikemccand left a comment

Choose a reason for hiding this comment

mikemccand Jun 27, 2021

Choose a reason for hiding this comment

gautamworah96 Jun 30, 2021

Choose a reason for hiding this comment

gautamworah96 commented Aug 19, 2021

mikemccand left a comment

Choose a reason for hiding this comment

mikemccand left a comment

Choose a reason for hiding this comment

mikemccand left a comment

Choose a reason for hiding this comment

mikemccand Aug 28, 2021

Choose a reason for hiding this comment

gautamworah96 Aug 29, 2021 • edited

Choose a reason for hiding this comment

mikemccand left a comment

Choose a reason for hiding this comment

mikemccand commented Sep 2, 2021

gautamworah96 commented Sep 3, 2021

mikemccand commented Sep 9, 2021

gautamworah96 commented Jun 10, 2021 •

edited

gautamworah96 Aug 29, 2021 •

edited