Support Big Data Technologies #4

Open
mommi84 opened this issue Nov 5, 2015 · 6 comments

Comments

@mommi84
Contributor

mommi84 commented Nov 5, 2015

Can the current workflow deal with big datasets (i.e., when it's impossible to store them in-memory)?

@ngonga

ngonga commented Nov 6, 2015

Yes, see the memory management package. The mapping class needs to be updated, though: we need a file-backed mapping that supports writing mappings to the hard drive.

@ngonga ngonga closed this as completed Nov 6, 2015
@mommi84
Contributor Author

mommi84 commented Nov 6, 2015

Okay. I would keep this issue open until the new Mapping class is updated.

@mommi84 mommi84 reopened this Nov 6, 2015
@mommi84 mommi84 added task and removed question labels Nov 6, 2015
@mommi84 mommi84 added this to the Release 1.1 milestone Oct 12, 2016
@kvndrsslr-zz kvndrsslr-zz modified the milestones: Release 1.1, Release 1.2 Feb 1, 2017
@Kleanthi
Contributor

Kleanthi commented Mar 2, 2018

How is this thing going?

@dobraczka
Contributor

Kevin and I are currently working on porting HR3 to either Flink or Spark. Although this task is smaller in scope than the original question, it might be reasonable to target such frameworks rather than writing a new Mapping class, i.e., to have a LIMES-Flink or LIMES-Spark implementation that can be run on a cluster.

@Kleanthi
Contributor

Kleanthi commented Mar 2, 2018

I like the LIMES-Spark idea.

@kvndrsslr kvndrsslr modified the milestones: Release 1.2, Release 2.0 Jul 2, 2020
@kvndrsslr
Contributor

I did some research on this lately, and it seems like Apache Beam is what we'd want for a complete LIMES port to big data technology.
This will be considered as part of the upcoming rewrite.

@kvndrsslr kvndrsslr changed the title from "Big datasets" to "Support Big Data Technologies" Jul 2, 2020
@kvndrsslr kvndrsslr self-assigned this Jul 2, 2020