Create cronjob workflow to index website docs in Algolia #417

Open
SenseException opened this issue Oct 15, 2021 · 3 comments

Comments

@SenseException
Member

There is currently no automated build step that indexes the docs in Algolia for the website search bar. There are two reasons for this:

  • Algolia has a monthly limit that is easy to reach, and the search stops working once that limit is exceeded.
  • There is no need to rebuild every project's documentation all the time, since some projects rarely change.

To update the docs regularly and keep the search results up to date, a workflow should be created that builds the indexes shortly before the monthly Algolia limit resets. This way, users of the search get priority and the search stays available. Because projects like ORM and DBAL are visited more often than e.g. Annotations, we can also schedule separate runs for each Doctrine project to save Algolia requests.
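
A rough sketch of how such a schedule could be decided, in Python for illustration only. Everything here is an assumption: the cadence table, the function names, and in particular the idea that the Algolia quota resets on the first day of a calendar month. A daily cron job could call `projects_due` and reindex only what it returns.

```python
import calendar
from datetime import date

# Hypothetical cadence: reindex a project every N months.
# The values are illustrative, not measured from real traffic.
REINDEX_EVERY_N_MONTHS = {"orm": 1, "dbal": 1, "annotations": 3}


def is_run_day(today: date, days_before_reset: int = 2) -> bool:
    """Return True when `today` lies `days_before_reset` days before month end.

    Assumes the quota resets on the first day of each calendar month.
    """
    last_day = calendar.monthrange(today.year, today.month)[1]
    return today.day == last_day - days_before_reset


def projects_due(today: date) -> list[str]:
    """Return the projects to reindex today, sorted by name."""
    if not is_run_day(today):
        return []
    return sorted(
        project
        for project, every_n in REINDEX_EVERY_N_MONTHS.items()
        if today.month % every_n == 0
    )
```

With this table, frequently visited projects are refreshed every month, while rarely changing ones only consume requests once per quarter.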

@greg0ire
Member

After reading the code, it seems to me that we only make one call to addObjects per project… how low is that limit? One call to that method only translates into several requests if there are more than 1000 objects (assuming we are using the default batch size: https://github.com/algolia/algoliasearch-client-php/blob/1c9440d8151cc4c9363128145b898946baffcd42/src/Config/SearchConfig.php#L31).
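
To make the arithmetic concrete, here is a small Python model of that batching. It is a simplification that only counts batches and assumes no request is sent for an empty list; `requests_needed` is a made-up helper, not part of the Algolia client.

```python
import math


def requests_needed(object_count: int, batch_size: int = 1000) -> int:
    """Estimate how many batch requests one indexing call would produce.

    Simplified model: objects are split into chunks of `batch_size`
    and each chunk costs one request.
    """
    if object_count < 0 or batch_size <= 0:
        raise ValueError("object_count must be >= 0 and batch_size > 0")
    return math.ceil(object_count / batch_size)
```

Under this model a project with up to 1000 objects costs a single request, and 2500 objects cost three.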

@morozov
Member

morozov commented Dec 14, 2021

Given that all the website contents are versioned in Git, instead of building the search index via a cron job, would it make sense to build it based on the diff between the previous and the new website version?
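
A minimal Python sketch of the diff-based idea, assuming a hypothetical layout of `docs/<project>/<version>/...` (the real repository layout may differ). The list of changed paths could come from something like `git diff --name-only` between the two website versions; the function only maps those paths to the project and version pairs that would need reindexing.

```python
from pathlib import PurePosixPath

# Hypothetical docs root; the actual website repository may be organised differently.
DOCS_ROOT = "docs"


def changed_targets(changed_paths):
    """Map changed file paths to the (project, version) pairs they belong to.

    Paths outside DOCS_ROOT, or not nested below a version directory,
    are ignored because they cannot affect a project's search index.
    """
    targets = set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        if len(parts) >= 4 and parts[0] == DOCS_ROOT:
            targets.add((parts[1], parts[2]))
    return targets
```

An empty result would mean the index build can be skipped entirely for that change.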

@SenseException
Member Author

I haven't looked into the search index itself, but not every change in the docs would affect it. One of my first thoughts was to build the index whenever a diff shows a change, but there are usually not that many changes, which is why I thought of cronjobs as a first step.

The website code is currently flawed when it comes to indexing a specific project and version: it always deletes the whole index. This needs to be fixed before projects can be reindexed separately.
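
As an illustration of the behaviour that would be needed, here is an in-memory Python model where the index is just a list of dicts. The `project` and `version` fields and the `reindex_scoped` helper are assumptions made for this sketch; it does not use the Algolia client or reflect the current website code.

```python
def reindex_scoped(index, project, version, new_records):
    """Replace only the records of one project and version.

    Records belonging to other projects or versions are kept untouched,
    instead of clearing the whole index before re-adding everything.
    """
    kept = [
        record
        for record in index
        if not (record["project"] == project and record["version"] == version)
    ]
    return kept + list(new_records)
```

The point of the sketch is only the scoping: deletion is limited to the project and version being rebuilt, so a run for ORM never wipes the DBAL records.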
