Pipeline and events for tracking back links. #404

Open
gedw99 opened this issue Mar 27, 2023 · 1 comment

Comments

@gedw99

gedw99 commented Mar 27, 2023

I need to know what is indexed, and I saw there is a pipeline.

I don’t write Rust, so I can’t easily change the code. I was hoping the system could throw events about which documents it is indexing.

It could just emit an event with a payload that describes:

  • path
  • file name
  • file ext
  • event type

The event type could be:

  • New. File that it has not seen before.
  • Modified. File that has changes.
  • Deleted. File that was deleted.
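A minimal sketch of what such a payload could look like on the consumer side, in Python. All names here (`EventType`, `IndexEvent`, `for_file`, `to_json`) are hypothetical and only illustrate the fields listed above; nothing like this exists in the project today.

```python
import json
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePosixPath


class EventType(Enum):
    """The three event types proposed above (hypothetical names)."""
    NEW = "new"            # file the indexer has not seen before
    MODIFIED = "modified"  # file whose contents changed
    DELETED = "deleted"    # file that was removed


@dataclass(frozen=True)
class IndexEvent:
    """Hypothetical payload: one event per indexed document."""
    path: str        # directory that contains the file
    file_name: str   # file name including the extension
    file_ext: str    # extension without the leading dot
    event_type: EventType

    @classmethod
    def for_file(cls, full_path: str, event_type: EventType) -> "IndexEvent":
        # Split a POSIX-style path into the fields of the payload.
        p = PurePosixPath(full_path)
        return cls(str(p.parent), p.name, p.suffix.lstrip("."), event_type)

    def to_json(self) -> str:
        # Serialize to JSON so any other process can consume the event.
        return json.dumps({
            "path": self.path,
            "file_name": self.file_name,
            "file_ext": self.file_ext,
            "event_type": self.event_type.value,
        })
```

For example, `IndexEvent.for_file("/site/img/logo.png", EventType.MODIFIED).to_json()` yields a JSON object with those four keys, which another process could read from a queue or a log file.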

I don’t know if it’s able to detect these event types?
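Whether the indexer already tracks this is an open question. As a sketch of one common way to derive the three event types, assuming a snapshot of `{path: content_hash}` is kept between indexing runs (the function name `diff_snapshots` and the snapshot format are assumptions, not the project's API):

```python
from typing import Dict, List, Tuple


def diff_snapshots(
    previous: Dict[str, str], current: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Compare two {path: content_hash} snapshots.

    Returns sorted (path, event_type) pairs, where event_type is
    "new", "modified", or "deleted". Unchanged files produce no event.
    """
    events = []
    for path, digest in current.items():
        if path not in previous:
            events.append((path, "new"))        # not seen in the last run
        elif previous[path] != digest:
            events.append((path, "modified"))   # same path, different content
    for path in previous:
        if path not in current:
            events.append((path, "deleted"))    # seen before, gone now
    return sorted(events)
```

A renamed file shows up here as one "deleted" plus one "new" event; detecting renames would need extra logic such as matching on content hash.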

The reason for this is that then I can integrate it with other processes.

A classic example is links and back links. We can use HTML as an example, but the logic can apply to many document types.

A web page x.html has an image element. We don’t know the state of that actual image, though.

The opposite is also true. When indexing an image, we want to know all HTML pages that use that image.

Throwing an event for every document allows custom logic to be written to keep track of what references what. For example, a web page is using image A, but what else is using image A?

An event system can then parse all events and build a view of what is linked to what!
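A rough sketch of that custom logic, using only the Python standard library. The class `BacklinkIndex` and its methods are hypothetical, and it assumes the consumer reads the HTML itself after receiving an event, since the proposed payload carries no content. Image `src` values are compared as plain strings, with no URL resolution.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, Set


class _ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> element."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.add(value)


def image_sources(html: str) -> Set[str]:
    collector = _ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


class BacklinkIndex:
    """Keeps document -> images and image -> documents in sync with events."""

    def __init__(self) -> None:
        self._links: Dict[str, Set[str]] = {}
        self._backlinks: Dict[str, Set[str]] = defaultdict(set)

    def _forget(self, doc: str) -> None:
        # Drop every link previously recorded for this document.
        for image in self._links.pop(doc, set()):
            self._backlinks[image].discard(doc)

    def handle(self, event_type: str, doc: str, html: str = "") -> None:
        # "deleted" only forgets; "new" and "modified" forget, then re-scan.
        self._forget(doc)
        if event_type in ("new", "modified"):
            self._links[doc] = image_sources(html)
            for image in self._links[doc]:
                self._backlinks[image].add(doc)

    def used_by(self, image: str) -> Set[str]:
        # Answers "what else is using image A?"
        return set(self._backlinks.get(image, set()))
```

After feeding it a "new" event for x.html containing `<img src="a.png">`, `used_by("a.png")` returns `{"x.html"}`; a later "deleted" event for x.html removes that entry again.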

This is really useful for use cases where you are editing an image but are not sure how many other documents are using it.
You can then bring up all documents using that image via search and decide whether you should copy the image or not.

Also, when you copy the image, you might want, say, 20 documents that use that image to start using the new one.

It’s really similar to refactoring code, where you want to find all references to a function.

@travolin
Collaborator

Conceptually the pipeline is meant to provide the ability to customize processing of content including collection, parsing and tagging. The pipeline code is still in its early stages and does not yet provide the ability to customize nor does it provide the information you are looking for.

For your use case I can see the need for notification on change, but would the contents also be needed? We provide the ability to custom-tag documents. In your example, would the documents associated with the same image be tagged with that information for easy search?
