Null pointer when trying to load data using latest release #19

Open
yeikel opened this issue Dec 11, 2018 · 6 comments

@yeikel

yeikel commented Dec 11, 2018

I am using the following release

And I am trying jedaiDesktopApp-1.1.jar with the following datasets (from the samples):

abtBuyIdDuplicates (for D1)
abtBuyProfiles (for truth file)

image

But I get the following error:

image

I tried with CSV files and I get the same error.

@gpapadis
Collaborator

Hi! The serialized datasets are incompatible with older versions, due to a change in some Java classes. Try using the latest version and let me know if the problem is fixed.

@yeikel
Author

yeikel commented Dec 20, 2018

Hi! I am using the latest release.

I also tried CSV files and received the same errors.

@leots

leots commented Dec 24, 2018

Hello yeikel,

Can you please tell me exactly which CSV files you tried, so I can test them?

Anyhow, the latest release we have on GitHub right now (the one you linked above) is not the latest version of the code, which is why @gpapadis refers to it as an older version.

The current version of the code is still a work in progress, which is why we haven't put up a full "release" on GitHub yet. However, you can find a build of it here: https://drive.google.com/open?id=1W-ffcQZWnw0MIWluaBzyApsa7nqq5wWB, or build it yourself from the repositories.

This version should read the serialized files directly, and it also allows you to configure some options for how to read the CSV, such as the delimiter, which could fix the problem you are having with the CSV files too.
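
To illustrate why the delimiter option matters, here is a minimal sketch of splitting one CSV line on a configurable delimiter. It is not JedAI's reader: the class `DelimitedLineSplitter` and its `split` method are hypothetical names made up for this example, and the sample line is invented. The point is only that a file parsed with the wrong delimiter collapses into a single column, which can then fail further down the pipeline.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of JedAI. */
final class DelimitedLineSplitter {

    /** Splits a single line on the given delimiter, honouring double-quoted fields. */
    static List<String> split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString()); // end of field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field
        return fields;
    }

    public static void main(String[] args) {
        // Invented sample line that uses ';' as its delimiter.
        String line = "1;\"Sony TV; 40 inch\";299.99";
        System.out.println(split(line, ';').size()); // 3
        System.out.println(split(line, ',').size()); // 1 (wrong delimiter: one big column)
    }
}
```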

@yeikel
Author

yeikel commented Dec 30, 2018

@leots Where can I find sample CSV files? And their format?

I tried the serialized samples included in the documentation, but they fail (unless I am using the wrong files or configuration):

image
image

@gpapadis
Collaborator

The problem with the serialized datasets is that you are using the ground-truth file (abtBuyIdDuplicates) in place of "Entity Profiles D1" and the profiles file (abtBuyProfiles) as the "Ground-truth file". It should be the other way round.
CSV datasets are available here: https://dbs.uni-leipzig.de/en/research/projects/object_matching/fever/benchmark_datasets_for_entity_resolution

@leots

leots commented Dec 31, 2018

One more thing I can see is that you have selected clean-clean entity resolution but haven't selected a second entity profiles dataset, so make sure you either select dirty ER or add a second dataset.
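
The rule above can be sketched as a small pre-flight check. This is only an illustration under assumptions: `ErInputCheck`, `ErMode` and `problems` are hypothetical names that do not come from the JedAI code base, and the paths are plain strings standing in for whatever the file pickers return.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check for illustration only; not part of JedAI. */
final class ErInputCheck {

    enum ErMode { DIRTY, CLEAN_CLEAN }

    /** Returns human-readable problems; an empty list means the selection looks consistent. */
    static List<String> problems(ErMode mode, String d1Path, String d2Path) {
        List<String> problems = new ArrayList<>();
        if (isBlank(d1Path)) {
            problems.add("Entity Profiles D1 is missing.");
        }
        // Clean-clean ER links two datasets, so a second one is required.
        if (mode == ErMode.CLEAN_CLEAN && isBlank(d2Path)) {
            problems.add("Clean-clean ER needs a second entity profiles dataset; add one or switch to dirty ER.");
        }
        return problems;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Clean-clean ER with only one dataset selected: one problem is reported.
        System.out.println(problems(ErMode.CLEAN_CLEAN, "abtBuyProfiles", null));
        // Dirty ER with a single dataset: no problems.
        System.out.println(problems(ErMode.DIRTY, "abtBuyProfiles", null));
    }
}
```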
