Generic crawl controller #434

Open
wants to merge 2 commits into
base: master

@dgoiko dgoiko commented Jan 25, 2020

GitHub doesn't seem to detect the moved files, even though they were detected as moves locally. I tried splitting the change into separate commits, with the same bad result. It looks like everything changed, but only the type parameters were added.

I just renamed CrawlController to GenericCrawlController and WebCrawler to GenericWebCrawler. I made them parametrizable and created new CrawlController and WebCrawler classes that extend them, to maintain backward compatibility with everything that treats them as non-parametrized.

With these changes, getCrawlersLocalData can actually return a list of ResultType with no risk (provided the usual Java generics considerations are taken into account).
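
To make the shape of the change concrete, here is a minimal sketch that does not depend on crawler4j. The class names GenericCrawlController, GenericWebCrawler, CrawlController, WebCrawler and the method getCrawlersLocalData come from the description above. Everything else is an assumption made for illustration: the order of the type parameters, the getMyLocalData accessor, the simplified start signature, and the hypothetical PageCountCrawler. The actual signatures in the commits may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GenericCrawlSketch {

    // Simplified stand-in for the parametrized crawler. ResultType is the
    // type of the data each crawler hands back when it finishes.
    static class GenericWebCrawler<ResultType> {
        // Assumed accessor: subclasses return whatever they collected.
        public ResultType getMyLocalData() {
            return null;
        }
    }

    // Simplified stand-in for the parametrized controller. CrawlerType is
    // bounded, so it names the minimum crawler class the controller accepts.
    static class GenericCrawlController<ResultType,
            CrawlerType extends GenericWebCrawler<ResultType>> {

        private final List<ResultType> crawlersLocalData = new ArrayList<>();

        // Stand-in for start(): this sketch only gathers each crawler's data.
        public void start(List<CrawlerType> crawlers) {
            for (CrawlerType crawler : crawlers) {
                crawlersLocalData.add(crawler.getMyLocalData());
            }
        }

        // Typed result list: callers no longer need to cast from Object.
        public List<ResultType> getCrawlersLocalData() {
            return crawlersLocalData;
        }
    }

    // Non-parametrized names kept for existing code: they extend the
    // generic versions with Object as the result type.
    static class WebCrawler extends GenericWebCrawler<Object> { }

    static class CrawlController extends GenericCrawlController<Object, WebCrawler> { }

    // Hypothetical crawler that reports how many pages it visited.
    static class PageCountCrawler extends GenericWebCrawler<Integer> {
        private final int pages;

        PageCountCrawler(int pages) {
            this.pages = pages;
        }

        @Override
        public Integer getMyLocalData() {
            return pages;
        }
    }

    // Sums the per-crawler results without any casts.
    static int totalPages() {
        GenericCrawlController<Integer, PageCountCrawler> controller =
                new GenericCrawlController<>();
        controller.start(Arrays.asList(new PageCountCrawler(3), new PageCountCrawler(4)));
        int total = 0;
        for (Integer count : controller.getCrawlersLocalData()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalPages());
    }
}
```

The "Java generics considerations" still apply: code that goes through the raw or Object-typed CrawlController gets no extra type safety, only code that names a concrete ResultType does.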

Moved the CrawlController logic into a parametrizable version, GenericCrawlController, which lets you specify the minimum WebCrawler class allowed.

Declared protected Object collectCrawlerData(CrawlerType crawler) and called it inside start to extract data from each crawler, so subclasses can override it for custom data extraction based on the WebCrawler type.
GenericWebCrawler is parametrizable, so it allows GenericCrawlController to parametrize its results as well.
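
As a rough illustration of the collectCrawlerData hook, the sketch below shows a controller subclass overriding it to extract something other than the crawler's default local data. Only the hook's signature, protected Object collectCrawlerData(CrawlerType crawler), and the fact that start calls it are taken from the commit message. The Crawler interface, the default behaviour of the hook, the simplified start signature, and the TitleCrawler and TitleCountController classes are hypothetical stand-ins, not crawler4j code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectHookSketch {

    // Hypothetical minimal crawler contract for this sketch.
    interface Crawler {
        Object getMyLocalData();
    }

    static class GenericCrawlController<CrawlerType extends Crawler> {
        private final List<Object> crawlersLocalData = new ArrayList<>();

        // Hook named in the commit message. The default shown here
        // (returning the crawler's local data) is an assumption.
        protected Object collectCrawlerData(CrawlerType crawler) {
            return crawler.getMyLocalData();
        }

        // Stand-in for start(): calls the hook once per crawler.
        public void start(List<CrawlerType> crawlers) {
            for (CrawlerType crawler : crawlers) {
                crawlersLocalData.add(collectCrawlerData(crawler));
            }
        }

        public List<Object> getCrawlersLocalData() {
            return crawlersLocalData;
        }
    }

    // Hypothetical crawler that remembers the page titles it saw.
    static class TitleCrawler implements Crawler {
        private final List<String> titles;

        TitleCrawler(String... titles) {
            this.titles = Arrays.asList(titles);
        }

        @Override
        public Object getMyLocalData() {
            return titles;
        }

        int titleCount() {
            return titles.size();
        }
    }

    // Overrides the hook to collect only a count, using a method that
    // exists on TitleCrawler but not on the base Crawler contract.
    static class TitleCountController extends GenericCrawlController<TitleCrawler> {
        @Override
        protected Object collectCrawlerData(TitleCrawler crawler) {
            return crawler.titleCount();
        }
    }

    static List<Object> collectCounts() {
        TitleCountController controller = new TitleCountController();
        controller.start(Arrays.asList(
                new TitleCrawler("Home", "About"),
                new TitleCrawler("Contact")));
        return controller.getCrawlersLocalData();
    }

    public static void main(String[] args) {
        System.out.println(collectCounts());
    }
}
```

Because CrawlerType is bound to TitleCrawler in the subclass, the override can call crawler-specific methods without a cast, which is the point of parametrizing the controller by crawler type.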
@dgoiko dgoiko closed this Jan 25, 2020
@dgoiko dgoiko reopened this Jan 25, 2020
@viva-la-v

@yasserg Could you merge this PR? Since there has been no new release in the 2 years since 2018, I'm building and using the newest snapshot myself, and I really like this one with generic types. Really appreciate it!

@dgoiko dgoiko (Author) commented May 9, 2020

Please let me know if those commits are hard to read (everything looks deleted and created again, so it may be messy to review). If so, I'll try a different approach with more separated commits, so that git can detect the changes better and the PR is more reviewable. I would create a new branch from scratch and apply the exact same changes; it will not take much time ^^
