[BUG] collection config not always applied as expected #5099

Open
line-o opened this issue Oct 23, 2023 · 7 comments
Labels
bug issue confirmed as bug

Comments

@line-o
Member

line-o commented Oct 23, 2023

Describe the bug

From the feedback I gathered from long-standing core developer colleagues:

Whenever a collection configuration changes, whether by storing, copying or moving the configuration to a configuration collection, both of the following must be true:

  1. xmldb:reindex($collection) has to be called explicitly to apply new indexes to existing data
  2. new indexes are applied to new data immediately

The first statement is not satisfied (tested in eXist-db versions 6.2.0 and 7.0.0-SNAPSHOT):

  • For copied or moved collection configuration files, reindexing will not apply the new indexes to existing data.
  • After storing a collection configuration, the new indexes are applied immediately to existing data.

NOTE: I have not yet tested the second statement.

Expected behavior

As an application developer, I expect that indexes in a collection configuration file are applied in the same way regardless of how those changes are made.

To Reproduce

Initial test suite: a first attempt at a test suite that would highlight both observed inconsistencies, left in just for completeness.

module namespace taic="http://exist-db.org/xquery/range/test/apply-index-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $taic:collection-name := 'apply-index-configuration';
declare variable $taic:collection-path := '/db/' || $taic:collection-name;
declare variable $taic:system-config-path := '/db/system/config';
declare variable $taic:collection-config-path := $taic:system-config-path || $taic:collection-path;


declare variable $taic:xconf-name := 'collection.xconf';
declare variable $taic:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $taic:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function taic:setup () {
    xmldb:create-collection('/db', $taic:collection-name),
    xmldb:store($taic:collection-path, 'test.xml', $taic:test-data),
    xmldb:create-collection($taic:system-config-path || '/db', $taic:collection-name),
    xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
};

declare
    %private
function taic:query () {
    collection($taic:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:copy () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:touch-after-copy () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    let $touch := xmldb:touch($taic:collection-config-path, $taic:xconf-name)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:copy-with-reindex () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    let $reindex := xmldb:reindex($taic:collection-path)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:move () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:touch-after-move () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    let $touch := xmldb:touch($taic:collection-config-path, $taic:xconf-name)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:move-with-reindex () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    let $reindex := xmldb:reindex($taic:collection-path)    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:store () {
    let $_ := xmldb:store($taic:collection-config-path, $taic:xconf-name, $taic:xconf)

    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:store-with-reindex () {
    let $_ := xmldb:store($taic:collection-config-path, $taic:xconf-name, $taic:xconf)
    let $reindex := xmldb:reindex($taic:collection-path)
    return taic:query()
};

declare
    %test:tearDown
function taic:cleanup () {
    xmldb:remove($taic:collection-path),
    xmldb:remove($taic:collection-config-path)
};
--- UPDATE 1: test copied xconf ---

The following test suite proves that the indexes defined in a collection configuration resource copied to a configuration collection cannot be applied to existing data.
Better isolation of tests operating on a copied xconf resource. They all fail.

module namespace aicc="http://exist-db.org/xquery/range/test/apply-index-copied-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $aicc:collection-name := 'apply-index-copied-configuration';
declare variable $aicc:collection-path := '/db/' || $aicc:collection-name;
declare variable $aicc:system-config-path := '/db/system/config';
declare variable $aicc:collection-config-path := $aicc:system-config-path || $aicc:collection-path;


declare variable $aicc:xconf-name := 'collection.xconf';
declare variable $aicc:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $aicc:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function aicc:setup () {
    xmldb:create-collection('/db', $aicc:collection-name),
    xmldb:store($aicc:collection-path, 'test.xml', $aicc:test-data),
    xmldb:create-collection($aicc:system-config-path || '/db', $aicc:collection-name),
    xmldb:store($aicc:collection-path, $aicc:xconf-name, $aicc:xconf),
    xmldb:copy-resource(
        $aicc:collection-path, $aicc:xconf-name,
        $aicc:collection-config-path, $aicc:xconf-name)
};

declare
    %private
function aicc:query () {
    collection($aicc:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:after-copy () {
    aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:touch-after-copy () {
    let $touch := xmldb:touch($aicc:collection-config-path, $aicc:xconf-name)
    return aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:reindex-after-copy () {
    let $reindex := xmldb:reindex($aicc:collection-path)
    return aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:touch-and-reindex-after-copy () {
    let $touch := xmldb:touch($aicc:collection-config-path, $aicc:xconf-name)
    let $reindex := xmldb:reindex($aicc:collection-path)
    return aicc:query()
};

declare
    %test:tearDown
function aicc:cleanup () {
    xmldb:remove($aicc:collection-path),
    xmldb:remove($aicc:collection-config-path)
};
--- UPDATE 2: immediate application ---

The test suite below proves that when an xconf resource is stored, the indexes in it will be applied immediately.

module namespace asic="http://exist-db.org/xquery/range/test/apply-stored-index-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $asic:collection-name := 'apply-index-configuration';
declare variable $asic:collection-path := '/db/' || $asic:collection-name;
declare variable $asic:system-config-path := '/db/system/config';
declare variable $asic:collection-config-path := $asic:system-config-path || $asic:collection-path;


declare variable $asic:xconf-name := 'collection.xconf';
declare variable $asic:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $asic:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function asic:setup () {
    xmldb:create-collection('/db', $asic:collection-name),
    xmldb:store($asic:collection-path, 'test.xml', $asic:test-data),
    xmldb:create-collection($asic:system-config-path || '/db', $asic:collection-name),
    xmldb:store($asic:collection-config-path, $asic:xconf-name, $asic:xconf)
};

declare
    %private
function asic:query () {
    collection($asic:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function asic:store () {
    asic:query()
};

declare
    %test:tearDown
function asic:cleanup () {
    xmldb:remove($asic:collection-path),
    xmldb:remove($asic:collection-config-path)
};


Context (please always complete the following information)

Build: eXist-7.0.0-SNAPSHOT (b032a42)
Java: 17.0.6 (Azul Systems, Inc.)
OS: Mac OS X 13.5.2 (aarch64)

Additional context

  • How is eXist-db installed? built from source (7.0.0-SNAPSHOT) and run in docker (6.2.0)
  • Any custom changes in e.g. conf.xml? none
@line-o line-o added the bug issue confirmed as bug label Oct 23, 2023
@line-o
Member Author

line-o commented Oct 23, 2023

On both tested systems the results were:

<testsuites>
    <testsuite package="http://exist-db.org/xquery/range/test/apply-index-configuration" timestamp="2023-10-23T11:17:42.223+02:00" tests="8" failures="4" errors="0" pending="0" time="PT0.065S">
        <testcase name="copy" class="taic:copy">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:query source="/db/apps/eXide/modules/run-test.xq" elapsed="0.002" calls="1"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="taic:copy" elapsed="0.0" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [55:12]"/>
                    <stats:function name="xmldb:copy-resource" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [51:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="copy-with-reindex" class="taic:copy-with-reindex">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="xmldb:reindex" elapsed="0.003" calls="1" source="/db/test-copy-xconf.xq [78:21]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="taic:copy-with-reindex" elapsed="0.005" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="xmldb:copy-resource" elapsed="0.002" calls="1" source="/db/test-copy-xconf.xq [75:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [79:12]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="move" class="taic:move">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [92:12]"/>
                    <stats:function name="taic:move" elapsed="0.001" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="xmldb:move" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [87:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:function name="xmldb:store" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [89:9]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="move-with-reindex" class="taic:move-with-reindex">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [118:12]"/>
                    <stats:function name="taic:move-with-reindex" elapsed="0.003" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="xmldb:store" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [115:9]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="xmldb:move" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [113:9]"/>
                    <stats:function name="xmldb:reindex" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [117:21]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="store" class="taic:store"/>
        <testcase name="store-with-reindex" class="taic:store-with-reindex"/>
        <testcase name="touch-after-copy" class="taic:touch-after-copy"/>
        <testcase name="touch-after-move" class="taic:touch-after-move"/>
    </testsuite>
</testsuites>

@adamretter
Member

Copying or moving a collection configuration into the respective configuration collection will not apply the indexes immediately but storing the contents of that file will.

That is expected behaviour and is by design.

@line-o
Member Author

line-o commented Oct 23, 2023

> Copying or moving a collection configuration into the respective configuration collection will not apply the indexes immediately but storing the contents of that file will.
>
> That is expected behaviour and is by design.

What's the idea behind this?

@line-o
Member Author

line-o commented Oct 23, 2023

I would still expect that reindexing would then apply the new index configuration.

@luckydem

In my case it makes sense to do this:
My clients store a lot of data that is specific to their organisation. To ensure absolute separation, our system creates a new "master" collection for each client's data when we onboard them.
We create an index for that collection so that we can index each client separately.
When we onboard a new client, they provide us with a lot of initial information, which we use to populate that "master" collection. This includes subcollections and a number of documents.
Every "master" organisation collection uses the same index pattern, so we keep a template collection.xconf file. I was hoping we could use xmldb:copy-resource() to copy that file to /db/system/config/db/apps/path-to-master-org-collection.
Then, after running xmldb:reindex('/db/apps/path-to-master-org-collection'), I would have hoped it would use that newly copied collection.xconf to reindex that collection...

I did manage to find a workaround though by using doc() on the template collection.xconf file and then xmldb:store() to copy and then store the collection.xconf file in the correct system collection... This process made the indexing work.
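
The workaround described above might look roughly like the following sketch. The template path, the data collection path and the helper `local:mkcol` are hypothetical and only there for illustration; the functions actually relied on are `doc()`, `xmldb:store()`, `xmldb:create-collection()` and `xmldb:collection-available()`.

```xquery
xquery version "3.1";

(: Hypothetical paths, not taken from the thread: adjust to your application. :)
declare variable $local:template-uri := '/db/apps/my-app/templates/collection.xconf';
declare variable $local:data-collection := '/db/apps/my-app/data/org-1';
declare variable $local:config-collection := '/db/system/config' || $local:data-collection;

(:~
 : Hypothetical helper: create every missing step of an absolute
 : collection path and return the full path.
 :)
declare function local:mkcol($path as xs:string) as xs:string {
    fold-left(
        tokenize($path, '/')[. ne ''],
        '',
        function ($parent, $step) {
            let $current := $parent || '/' || $step
            let $_ :=
                if (xmldb:collection-available($current))
                then ()
                else xmldb:create-collection($parent, $step)
            return $current
        }
    )
};

let $config-collection := local:mkcol($local:config-collection)
(: Read the template with doc() and write it with xmldb:store()
   instead of copying it with xmldb:copy-resource(). :)
return
    xmldb:store($config-collection, 'collection.xconf', doc($local:template-uri))
```

The sketch contains no separate xmldb:reindex() call, because the reports in this thread say that storing a configuration applied its indexes to existing data in the tested versions. Whether that holds in other versions is not covered here.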

It was just unexpected behaviour: it didn't let me reindex with the collection.xconf file after using xmldb:copy-resource(), even though that file still appeared in Monex...

@line-o
Member Author

line-o commented Oct 23, 2023

In update 2 I can prove that storing an XConf will immediately trigger reindexing of the entire collection.

@line-o
Member Author

line-o commented Oct 23, 2023

From the feedback I gathered today:

Whenever a collection configuration changes, whether by storing, copying or moving the configuration to a configuration collection, both of the following must be true:

  1. xmldb:reindex() has to be called explicitly to apply new indexes to existing data
  2. new indexes are applied to new data immediately

Both statements above are not satisfied in eXist-db 6.2.0 up to 7.0.0-SNAPSHOT.
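
The thread contains no test suite for the second statement. A minimal sketch of one, modelled on the suites above, could look like this. The module namespace, the `aind` prefix and the collection name are made up for illustration. The assertion is copied unchanged from the earlier suites, and no result is claimed here.

```xquery
xquery version "3.1";

(: Hypothetical module: namespace, prefix and collection name are illustrative. :)
module namespace aind="http://exist-db.org/xquery/range/test/apply-index-new-data";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $aind:collection-name := 'apply-index-new-data';
declare variable $aind:collection-path := '/db/' || $aind:collection-name;
declare variable $aind:system-config-path := '/db/system/config';
declare variable $aind:collection-config-path := $aind:system-config-path || $aind:collection-path;

declare variable $aind:xconf-name := 'collection.xconf';
declare variable $aind:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $aind:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function aind:setup () {
    xmldb:create-collection('/db', $aind:collection-name),
    xmldb:create-collection($aind:system-config-path || '/db', $aind:collection-name),
    (: the xconf is stored next to the data only to serve as the copy source :)
    xmldb:store($aind:collection-path, $aind:xconf-name, $aind:xconf),
    xmldb:copy-resource(
        $aind:collection-path, $aind:xconf-name,
        $aind:collection-config-path, $aind:xconf-name)
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aind:store-data-after-copy () {
    (: the data is stored only AFTER the configuration was copied into place,
       so it counts as "new data" in the sense of statement 2 :)
    let $_ := xmldb:store($aind:collection-path, 'test.xml', $aind:test-data)
    return collection($aind:collection-path)//node[.='a']
};

declare
    %test:tearDown
function aind:cleanup () {
    xmldb:remove($aind:collection-path),
    xmldb:remove($aind:collection-config-path)
};
```

An analogous function with xmldb:move in the setup would cover the move case. Keeping each case in its own module, as in update 1, avoids one test's configuration leaking into the next.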
