Constructor Design for the library #4

Open
OlehOnyshchak opened this issue Apr 25, 2020 · 0 comments


Think about design choices that would make the library easily extensible. For example, the query function could accept, as an argument, a list of functions for processing text and images. A text handler could take the page's HTML and its URL as input and return a key-value pair to be added to the dataset.

With that approach, a user who wants to parse an additional field would only need to define a function with the appropriate parsing logic and pass it as a parameter to the query function, where all the common heavy lifting is done. The user could also select what to download by modifying the list of pre-built handlers, such as those for wikitext or caption parsing. We could also design a uniform way to pass cache-related parameters to such functions.
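
A rough sketch of what this could look like, only to make the idea concrete. Everything here is hypothetical and not the library's current API: the names `query`, `TextHandler`, `url_handler`, and `title_handler` are made up, and `query` takes already-fetched `(url, html)` pairs so the example runs without network access. Image handlers and cache parameters are left out.

```python
import re
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical handler contract: (html, url) -> (key, value) for the dataset.
TextHandler = Callable[[str, str], Tuple[str, Any]]


def url_handler(html: str, url: str) -> Tuple[str, Any]:
    """Example pre-built handler: record the page URL."""
    return "url", url


def title_handler(html: str, url: str) -> Tuple[str, Any]:
    """Example user-defined handler: extract the <title> text, if present."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return "title", match.group(1).strip() if match else None


def query(
    pages: Iterable[Tuple[str, str]],
    text_handlers: List[TextHandler],
) -> List[Dict[str, Any]]:
    """Run every handler on every page and collect one record per page.

    Assumption: `pages` yields (url, html) pairs that were fetched elsewhere.
    """
    dataset = []
    for url, html in pages:
        record: Dict[str, Any] = {}
        for handler in text_handlers:
            key, value = handler(html, url)
            record[key] = value
        dataset.append(record)
    return dataset
```

Under these assumptions, adding a field means writing one more handler and appending it to the list, e.g. `query(pages, [url_handler, title_handler])`, and dropping a handler from the list skips that field.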

This might be a very good idea, but it requires a lot of work. It will probably be put on hold until there is reasonable interest in the script.
