Failed to retrieve docs #8

Open
isoboroff opened this issue May 6, 2019 · 9 comments
Labels
bug Something isn't working

Comments

@isoboroff

I get the above message if I select a topic ("Women in Parliaments") or try to create a new one, then click the lightbulb icon. I'm trying to run the Docker setup as laid out on the hical.github.io page on athome4.

How do I debug this?

@ammsa
Member

ammsa commented May 9, 2019

Hi @isoboroff

I wasn't able to recreate the problem on my machine. Can you run docker-compose -f HiCAL.yml logs -t -f --tail 100 django and try again? This will show whether any errors are occurring.
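For anyone who would rather capture the log output than watch it scroll, here is a minimal Python sketch. It runs the same docker-compose command without -f (so it exits instead of following) and keeps only the lines that look like errors. The helper names and the pattern list are hypothetical, not part of HiCAL; adjust the pattern to whatever your logs actually contain.

```python
import re
import subprocess

# Hypothetical pattern: substrings that commonly mark a failure in Python logs.
ERROR_PATTERN = re.compile(r"Traceback|Error|Exception|CRITICAL")


def build_logs_command(compose_file="HiCAL.yml", service="django", tail=100):
    """Build the log command from the comment above, minus -f so it terminates."""
    return ["docker-compose", "-f", compose_file,
            "logs", "-t", "--tail", str(tail), service]


def error_lines(log_text):
    """Return only the log lines that match ERROR_PATTERN."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]


def fetch_error_lines(**kwargs):
    """Run docker-compose and filter its output (requires a running HiCAL setup)."""
    result = subprocess.run(build_logs_command(**kwargs),
                            capture_output=True, text=True, check=True)
    return error_lines(result.stdout)
```

Calling fetch_error_lines() right after reproducing the failure in the browser should surface the relevant message, if the django service logged one.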

@Isminoula

Hello,

I have the same problem when trying to use my own document collection.
I have tried using the same filenames and the same structure as the athome4 sample_dataset (see the example1 file). On another try, I created my own structure and a corresponding functions.py (see functions_test.py and example2).
Both result in an error when I create a new topic and click on the lightbulb icon:
Failed to retrieve docs.. docno:
{"message": "Error occurred. Please inform study coordinators"}

The word I use as a seed query appears in the documents, and the command suggested above does not show any errors :/
Any idea on what might be the problem?

Thank you!

example.zip

@ammsa
Member

ammsa commented Sep 21, 2019

Hi @Isminoula
Can you run docker-compose -f HiCAL.yml logs -t -f --tail 100 django? This will display log messages from django and help clarify the problem.

@Isminoula

Hello @ammsa, thank you for the quick response! Here is the output of the command:
hical_log.log

@nims11
Contributor

nims11 commented Oct 8, 2019

Hey @Isminoula, thanks for the logs. There seems to be a restriction in the code that assumes the paragraphs in the tgz are ordered by their parent document ids. I am working on removing this restriction. Meanwhile, can you reach out to me at nghelani@uwaterloo.ca so I can help you get things working?

@dianalam

@nims11 @ammsa Hi! Has there been progress on removing the paragraph ordering restriction? I am getting the same error in the logs (Paragraphs must be in increasing order of their parent document ids) when I try to use my own documents. I even tried creating a smaller .tgz archive where I sorted the paragraph ids (I removed any ids > 9 for simplicity), but am getting the same error.

Here's the sample sorted archive:

>>> tar -tvf test_processed_para.tgz
test_processed_para/2649b0a0.0
test_processed_para/2649b0a0.1
test_processed_para/2649b0a0.2
test_processed_para/2649b0a0.3
test_processed_para/2649b0a0.4
test_processed_para/2649b0a0.5
test_processed_para/2649b0a0.6
test_processed_para/2649b0a0.7
test_processed_para/9c4d2967.0
test_processed_para/9c4d2967.1
test_processed_para/9c4d2967.2
test_processed_para/9c4d2967.3
test_processed_para/9c4d2967.4
test_processed_para/9c4d2967.5
test_processed_para/9c4d2967.6
test_processed_para/9c4d2967.7
test_processed_para/9c4d2967.8
test_processed_para/9c4d2967.9
test_processed_para/a5f47849.0
test_processed_para/a5f47849.1
test_processed_para/a5f47849.2
test_processed_para/a5f47849.3
test_processed_para/a5f47849.4
test_processed_para/a5f47849.5
test_processed_para/a5f47849.6
test_processed_para/bb22ef8e.0
test_processed_para/bb22ef8e.1
test_processed_para/bb22ef8e.2
test_processed_para/bb22ef8e.3
test_processed_para/bb22ef8e.4
test_processed_para/bb22ef8e.5
test_processed_para/bb22ef8e.6

If you could let me know how to fix or work around this issue that would be greatly appreciated! Thanks for your help!
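For readers preparing their own archive, a minimal Python sketch of an ordering check follows. It assumes paragraph files are named <doc_id>.<paragraph_number>, as in the listing above, and that parent document ids are compared as plain strings. Both are assumptions; this is not HiCAL's actual check, and the function names are hypothetical.

```python
import tarfile


def parent_ids(member_names):
    """Extract the parent document id from each paragraph file name."""
    ids = []
    for name in member_names:
        base = name.rsplit("/", 1)[-1]             # drop the directory prefix
        doc_id, sep, _para = base.rpartition(".")  # "2649b0a0.3" -> "2649b0a0"
        if sep:                                    # skip names with no ".<n>" suffix
            ids.append(doc_id)
    return ids


def first_out_of_order(ids):
    """Return the index of the first id smaller than its predecessor, or None."""
    for i in range(1, len(ids)):
        if ids[i] < ids[i - 1]:  # assumption: plain string comparison
            return i
    return None


def check_archive(path):
    """Check regular files of a .tgz in the order they are stored in the archive."""
    with tarfile.open(path, "r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    return first_out_of_order(parent_ids(names))
```

The listing above passes this check under string comparison yet was still rejected, so a clean result here does not guarantee that HiCAL will accept the archive.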

@Isminoula

Isminoula commented Oct 30, 2019

Hi everyone,

Sorry for the late reply, it is close to PhD defense time so I completely forgot this issue due to an overwhelming schedule.
Although I am not 100% sure that this is the correct way to go about this, I did bypass the problem by deleting the if statement in these lines...

@nims11
Contributor

nims11 commented Oct 30, 2019

@Isminoula While that will remove the error, it will sometimes cause issues when rescoring items. There is efficiency logic that uses binary search and depends on that order.

@Isminoula @dianalam I have pushed a fix in a branch (https://github.com/hical/HiCAL/tree/fix-para-ordering). I will run some further tests before merging to master, but it would be helpful if one of you could also try that branch out.
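The point about binary search can be illustrated with a small Python sketch. It is not HiCAL's code; it only shows, using the standard bisect module and the document ids from the listing earlier in the thread, why a binary-search lookup can miss an item once the ordering assumption is broken.

```python
from bisect import bisect_left


def find_sorted(ids, target):
    """Binary-search lookup; only correct when ids is sorted ascending."""
    i = bisect_left(ids, target)
    return i if i < len(ids) and ids[i] == target else None


sorted_ids = ["2649b0a0", "9c4d2967", "a5f47849", "bb22ef8e"]
shuffled_ids = ["9c4d2967", "bb22ef8e", "2649b0a0", "a5f47849"]

# Sorted input: the id is found at its position.
print(find_sorted(sorted_ids, "a5f47849"))    # 2

# Unsorted input: the id is present (index 2) but the search misses it.
print(find_sorted(shuffled_ids, "2649b0a0"))  # None
```

Deleting the ordering check silences the error but leaves lookups like this exposed to silent misses, which is consistent with the rescoring issues described above.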

nims11 added the bug label Oct 30, 2019
@dianalam

dianalam commented Nov 7, 2019

@nims11 Thanks for the quick response and the fix! I tested it on my dataset and it worked.
