SearchIndex: Implement Collections (WIP) #3810

Draft: wants to merge 9 commits into base: searchIndex-restructure

splitbrain
Collaborator

This is another gradual improvement on top of #3556 and #3557.

This introduces a collection that is meant to replace the current fulltext index. Again, this is very much a work in progress, but it should show how this could work in the future.

Indexes themselves have evolved a bit more and provide a way to look up entries by regular expression.
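To illustrate what a lookup by regular expression could look like, here is a minimal sketch. The class `SketchMemoryIndex` and its `search()` method are hypothetical names made up for this example and are not the API introduced in this PR; the sketch only assumes that an index is a list of entries addressed by row ID.

```php
<?php
// Hypothetical sketch: names are assumptions, not the classes from this PR.
final class SketchMemoryIndex
{
    /** @var string[] entries, keyed by row ID */
    private $rows;

    public function __construct(array $rows)
    {
        // re-key so that row IDs are consecutive integers starting at 0
        $this->rows = array_values($rows);
    }

    /**
     * Return all entries matching the regular expression, keyed by row ID.
     * preg_grep() keeps the original array keys, so the row IDs survive.
     */
    public function search(string $regex): array
    {
        $result = preg_grep($regex, $this->rows);
        return $result === false ? [] : $result;
    }
}

$index = new SketchMemoryIndex(['wiki', 'wikipedia', 'search', 'index']);
$matches = $index->search('/^wiki/');
// $matches is [0 => 'wiki', 1 => 'wikipedia']
```

Returning the row IDs together with the values matters because other indexes would typically refer to an entry by its row ID rather than by its value.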

A collection basically manages access to all the involved indexes. It decides whether an index is accessed as a memory index or a file index. It also provides mechanisms to work on the collection, e.g. to add or update data in it. Currently it is all implemented in a FulltextCollection, which is our most complex index.
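The idea of a collection as the single access point to its indexes can be sketched roughly as follows. This is only an illustration under assumptions: `SketchIndex`, `SketchCollection`, `changeRow()` and the `.idx` file layout are hypothetical and do not mirror the code in this PR. The sketch also shows the point from the commit notes that a memory index only needs to be written back when it has been modified.

```php
<?php
// Hypothetical sketch: class names, methods and file layout are assumptions.
final class SketchIndex
{
    private $file;
    private $rows;
    private $dirty = false;

    public function __construct(string $file)
    {
        $this->file = $file;
        // load the whole index into memory, one entry per line
        $this->rows = is_file($file) ? file($file, FILE_IGNORE_NEW_LINES) : [];
    }

    public function changeRow(int $rid, string $value): void
    {
        // pad missing rows with empty strings so row IDs stay stable
        for ($i = count($this->rows); $i < $rid; $i++) {
            $this->rows[$i] = '';
        }
        $this->rows[$rid] = $value;
        $this->dirty = true;
    }

    /** Write the index back only if it was modified; report whether it was written. */
    public function save(): bool
    {
        if (!$this->dirty) {
            return false;
        }
        file_put_contents($this->file, implode("\n", $this->rows) . "\n");
        $this->dirty = false;
        return true;
    }
}

final class SketchCollection
{
    private $dir;
    private $indexes = [];

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    /** All access to an index goes through the collection. */
    public function getIndex(string $name): SketchIndex
    {
        if (!isset($this->indexes[$name])) {
            $this->indexes[$name] = new SketchIndex($this->dir . '/' . $name . '.idx');
        }
        return $this->indexes[$name];
    }

    /** Save all loaded indexes and return the names of those actually written. */
    public function save(): array
    {
        $written = [];
        foreach ($this->indexes as $name => $index) {
            if ($index->save()) {
                $written[] = $name;
            }
        }
        return $written;
    }
}

$dir = sys_get_temp_dir() . '/sketch_' . uniqid();
mkdir($dir);
$collection = new SketchCollection($dir);
$collection->getIndex('token')->changeRow(0, 'wiki');
$collection->getIndex('page'); // loaded but never modified
$written = $collection->save(); // only the 'token' index is written
```

A real implementation would additionally need locking around the write, which this sketch leaves out.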

The FulltextCollection class still contains a bunch of stuff copied over from our old code that needs to be refactored. That's probably the next step. Once that is done I think it would make sense to move out some of the functionality into a reusable base class.

This PR is mostly to ensure I won't forget what I did here a couple of weeks ago.

Note: we should probably cherry-pick 12ebce9 into master, since test autoloading is currently broken.

MemoryIndexes only need to be saved back when they have been modified.

This collection is meant as a base class for fulltext indexes; a page-specific implementation will follow. It currently contains lots of dead code that needs to be removed or replaced.

Namespace-based loading for core tests did not work as intended (but hasn't been used so far).

Many tests are the same for File and Memory indexes.

This finalizes the FulltextCollection and FulltextCollectionSearch classes. Proper locking is implemented and tests have been enhanced. It should be possible to reimplement the page fulltext search on top of it.
@splitbrain
Collaborator Author

Much more work has been put into this. The FulltextCollection and related stuff is now in a state that should make it possible to implement our actual search on top of it.

I added a concept description to the directory. This should be updated with future work and be transferred into the wiki on merge.

However there's more to do. The next steps can easily be found with a grep on todo|fixme.

Most importantly the CollectionConcept should be generalized for non-frequency-based indexes.

}

foreach (explode(':', $record) as $row) {
    list($tokenLength, $tokenId) = explode('*', $row);
Contributor

Suggested change
- list($tokenLength, $tokenId) = explode('*', $row);
+ [$tokenLength, $tokenId] = explode('*', $row);

I think PHP 7.2 allows this (the short list syntax has been available since PHP 7.1).
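For reference, a small self-contained sketch of the loop under review, written with the short destructuring syntax. The function name `parseRecord()` and the sample record `'4*12:9*3'` are made up for illustration; only the `:` and `*` separators and the variable names are taken from the quoted diff.

```php
<?php
// Hypothetical helper; only the separators and variable names come from the diff.
function parseRecord(string $record): array
{
    $result = [];
    if ($record === '') {
        return $result; // explode() on '' would yield one empty row
    }
    foreach (explode(':', $record) as $row) {
        // short list syntax, available since PHP 7.1
        [$tokenLength, $tokenId] = explode('*', $row);
        $result[] = [(int) $tokenLength, (int) $tokenId];
    }
    return $result;
}

$pairs = parseRecord('4*12:9*3');
// $pairs is [[4, 12], [9, 3]]
```

Both `list()` and the bracket form behave the same here; the bracket form is just the shorter spelling.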
