Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BatchDocumentInterface seems extraneous #15

Open
chillu opened this issue May 8, 2020 · 4 comments
Open

BatchDocumentInterface seems extraneous #15

chillu opened this issue May 8, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@chillu
Copy link
Member

chillu commented May 8, 2020

I'm assuming that all search implementations we're currently looking at can add multiple docs at the same time, I wouldn't know how a production-ready search would work without that feature. Separating this feature out into a separate interface seems too noisy.

@chillu chillu added the bug Something isn't working label May 8, 2020
@chillu
Copy link
Member Author

chillu commented May 8, 2020

It's also a bit awkwardly named since it's not a specialisation of DocumentInterface in the same namespace, but rather deals with a different concern of SearchService

@unclecheese
Copy link

It's more of just an opportunistic capture of a common pair of methods. SearchServiceInterface has the addDocuments() method and so does BatchProcessor.

There is no real abstraction layer for BatchProcessor, so we could conceivably get rid of the interface, but seems a bit trivial to me.

@chillu
Copy link
Member Author

chillu commented May 26, 2020

I'd prefer if we get rid of it, if only for the simple reason that it's confusing to have BatchDocumentInterface with a very different purpose to DocumentInterface. If we really find a search engine that doesn't allow batch adding documents via addDocuments(), we'll do addDocument() in a foreach loop for that implementation.

@chillu
Copy link
Member Author

chillu commented May 26, 2020

OK trying to get my head around this a bit more.

  • class BatchProcessor implements BatchDocumentInterface has addDocuments(), which indexes documents via queuedjobs if that's configured, or falls back to SearchServiceInterface (which also implements BatchDocumentInterface)
  • interface SearchServiceInterface extends BatchDocumentInterface has addDocuments(), which doesn't set any guidelines on how it would perform the index, but implies that it's synchronous via API calls (?)
  • interface BatchDocumentInterface only returns itself, so it's unclear how the operation is supposed to be implemented. Given the above, this can happen either through queues or through synchronous API calls.

At first I though you could manage to set up an accidental "infinite queue loop" through YAML and DI shenanigans here, where the queue implementation creates jobs, these jobs assume a synchronous API implementation upon execution, but in fact are also set to use the queue implementation, creating more jobs. But IndexJob is using SearchServiceInterface, rather than BatchDocumentInterface, which safeguards against that.

Having an interface that encapsulates "process these docs through a queue or synchronously" is a good thing. After going through this, my main gripe is naming. How about BatchServiceInterface? That has more of the desired symmetry for me: interface SearchServiceInterface extends BatchServiceInterface and class BatchProcessor implements BatchServiceInterface. It's just a more specialised service with less feature surface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants