Hey @ahlesen, thanks for catching this! I looked into it, and it seems like a somewhat unusual case:
37387939 has the wrong ACL ID (off by one). The correct ACL ID is D17-1280. Something must've happened during our data crawl.
28752386 has the correct ACL ID.
10822819 is a tricky case & I need to think about how to handle it. It looks like our crawler found a different version of the 28752386 paper from a department website, so the clustering decided to treat them as separate papers.
Anyway, how serious is this issue for you? Given the scope of the corpus, there will always be errors like this, so I'd like to understand how much it is impacting your use case.
Good day,
Some different papers have the same acl_id.
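For reference, here is a minimal sketch of how such collisions could be found. It assumes the metadata is available as a list of dicts; the `paper_id` key and the sample rows are made up for illustration and are not taken from the corpus.

```python
from collections import defaultdict


def find_shared_acl_ids(records):
    """Return {acl_id: [paper ids]} for every acl_id attached to more than one paper.

    `records` is assumed to be an iterable of dicts with the hypothetical
    keys "paper_id" and "acl_id"; adjust the names to the real schema.
    """
    papers_by_acl_id = defaultdict(set)
    for record in records:
        papers_by_acl_id[record["acl_id"]].add(record["paper_id"])
    # Keep only the ACL IDs that map to two or more distinct papers.
    return {
        acl_id: sorted(paper_ids)
        for acl_id, paper_ids in papers_by_acl_id.items()
        if len(paper_ids) > 1
    }


# Made-up sample rows, only to show the expected shape of the input.
sample = [
    {"paper_id": 1, "acl_id": "X00-0001"},
    {"paper_id": 2, "acl_id": "X00-0001"},
    {"paper_id": 3, "acl_id": "X00-0002"},
]
print(find_shared_acl_ids(sample))  # {'X00-0001': [1, 2]}
```

Grouping into a set per `acl_id` means a paper listed twice with the same ID is not reported as a collision; only distinct papers sharing an ID are.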