help in understanding output file format #72

Open
asafalina opened this issue Oct 16, 2022 · 1 comment

asafalina commented Oct 16, 2022

Hi

I was running cleora using the command below:

cleora-v1.2.3-x86_64-apple-darwin --columns transient::cluster_id StarNode --dimension 1024 -n 5 --input fb_cleora_input_star.txt -o output

I got something similar to the following output:
(I added some spacing just for better readability)

39361 1024
1        1    0.029419877 ..... -0.0073362226
16260    7    0.033474464 ..... -0.00906976
.
.
.
22459    1    0.010709517 ..... 0.026430061

I can't figure out what the 1st (1, 16260, ..., 22459) and 2nd (1, 7, ..., 1) columns represent.

Thanks

piobab (Contributor) commented Oct 21, 2022

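Hi @asafalina!

First column: the entity. In your case it should be cluster_id.
Second column: the occurrence count, i.e. how many times the entity occurs in the input data.
The remaining values on each line are the embedding vector, and the first line of the file gives the number of entities and the embedding dimension (39361 and 1024 in your output).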

https://github.com/Synerise/cleora/blob/master/src/persistence.rs#L44

Hope it helps.
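
For anyone who wants to load such a file programmatically, here is a minimal Python sketch of the layout described above. It is not part of Cleora: the function name `parse_cleora_output` is hypothetical, the sample rows are shortened to 3 dimensions with made-up middle values, and it assumes whitespace-separated fields and entity names that contain no spaces.

```python
import io


def parse_cleora_output(lines):
    """Parse a text embedding file: a 'count dimension' header line,
    then one row per entity: entity, occurrence count, vector values."""
    it = iter(lines)
    header = next(it).split()
    n_entities, dimension = int(header[0]), int(header[1])

    embeddings = {}
    for line in it:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        entity = parts[0]            # first column: the entity
        occurrence = int(parts[1])   # second column: occurrence count
        vector = [float(x) for x in parts[2:]]
        if len(vector) != dimension:
            raise ValueError(
                f"entity {entity!r}: expected {dimension} values, got {len(vector)}"
            )
        embeddings[entity] = (occurrence, vector)
    return n_entities, dimension, embeddings


# Hypothetical 3-dimensional sample in the same layout as the output above.
sample = io.StringIO(
    "2 3\n"
    "1 1 0.029419877 0.0 -0.0073362226\n"
    "16260 7 0.033474464 0.0 -0.00906976\n"
)
count, dim, embeddings = parse_cleora_output(sample)
print(embeddings["16260"][0])  # occurrence count for entity 16260
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample. Entities are kept as strings because they are not guaranteed to be numeric.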
