Id 'xxx' is defined more than once in group 'global id space' #2

Open
StevenJack1 opened this issue Apr 1, 2019 · 10 comments
Comments

@StevenJack1

StevenJack1 commented Apr 1, 2019

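Hello, when importing the generated files into neo4j I ran into the following problem:
image
Searching online, I found that adding --ignore-duplicate-nodes is supposed to resolve the duplicate IDs, but after adding it a different problem appeared. What is going on?
One more question: when running build_executive, duplicate personId values are produced. Does that have any effect?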

@StevenJack1
Author

This is the problem that appeared after adding --ignore-duplicate-nodes:
image

@lemonhu
Owner

lemonhu commented Apr 1, 2019

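Thanks for the feedback. Duplicate names are already handled: see build_csv.py#L34, which takes the MD5 of the director's name, gender, and age together and uses it as the unique ID, to resolve name collisions.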
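The ID scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code at build_csv.py#L34: the helper name `person_id`, the plain concatenation without a separator, and the sample field values are all assumptions.

```python
import hashlib


def person_id(name: str, gender: str, age: str) -> str:
    """Hypothetical helper: hex MD5 digest of name + gender + age."""
    # Assumption: the three fields are concatenated with no separator.
    key = f"{name}{gender}{age}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Made-up sample values, for illustration only.
first = person_id("Zhang San", "M", "50")
second = person_id("Zhang San", "M", "50")

# Identical fields always yield the same ID, so a person who appears
# in several rows produces several rows with the same ID.
print(first == second)
```

Because the digest depends only on the three fields, the same person listed for more than one company maps to one ID, which is why repeated Person rows can appear unless they are deduplicated before import.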

@StevenJack1
Author

When the data is identical, this approach still produces duplicate IDs, for example:
image
image
These rows are from executive_prep.csv.

@lemonhu
Owner

lemonhu commented Apr 1, 2019

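Under the MD5-based rule for entity uniqueness, the two 姚波 entries here should be the same person, so the ID should not be duplicated (and in practice a duplicate has no effect anyway).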

@StevenJack1
Author

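The project imported successfully, but a few things need to be fixed.
1. executive.csv contains duplicate Person rows. This can either be restricted where the file is generated, or handled by adding --ignore-duplicate-nodes when importing into neo4j.
2. In the executive_stock file, :END_ID is generated as stockpage\xxxxxx, so the stock ID cannot be found when the relationships are created. Screenshots:
image
image
So I split the end_id in the build_executive_stock method.
That's all!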
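The two workarounds mentioned above might look something like this. It is only a sketch under assumptions: `normalize_stock_id` and `dedupe_rows` are hypothetical helpers, not functions from the project, and the row layout (ID in the first column) is assumed.

```python
def normalize_stock_id(end_id: str) -> str:
    """Hypothetical helper: keep only the part after the last path separator."""
    # Treat both "\" and "/" as separators so the result does not
    # depend on the operating system that produced the path.
    return end_id.replace("\\", "/").rsplit("/", 1)[-1]


def dedupe_rows(rows, key_index=0):
    """Hypothetical helper: keep the first row seen for each ID."""
    seen = set()
    unique = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# "stockpage\xxxxxx" is the placeholder value quoted in this thread.
print(normalize_stock_id("stockpage\\xxxxxx"))

# Assumed layout: ID first, remaining columns after it.
rows = [["id1", "A"], ["id2", "B"], ["id1", "A"]]
print(dedupe_rows(rows))
```

Stripping the directory prefix by splitting on the separator, rather than on a fixed string, keeps the helper usable if the directory name changes; values that contain no separator pass through unchanged.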

@lemonhu
Owner

lemonhu commented Apr 1, 2019

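Regarding your second issue: according to the program logic it should not occur. The problem can be traced back to the code column of executive_prep.csv; see extract.py#L23 and #L24. This bug also did not show up when I tested locally.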

@wangxiaojianim

wangxiaojianim commented Jul 17, 2019

@lemonhu The second issue that @StevenJack1 described does occur; the build_executive_stock method has a few small problems.

@Lswx2017

@StevenJack1 Could you share your import script? When importing in a docker environment I cannot get past the following error, even after adding the --skip-duplicate-nodes parameter:
Duplicate input ids that would otherwise clash can be put into separate id space, read more about how to use id spaces in the manual: https://neo4j.com/docs/operations-manual/3.5/tools/import/file-header-format/#import-tool-id-spaces
Caused by:Too many bad entries 1001, where last one was: Id '82c52f128cb62b8c5d95a59df5c61c20' is defined more than once in group 'global id space'
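Before retrying the import, it may help to check how many IDs are actually repeated in a node file. The following is a rough diagnostic sketch, assuming a CSV with one header row and the ID in the first column; `duplicate_ids` is a hypothetical helper and is not part of the project or of neo4j.

```python
import csv
from collections import Counter


def duplicate_ids(csv_path, id_column=0):
    """Hypothetical helper: map each repeated ID to its number of occurrences."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        counts = Counter(row[id_column] for row in reader if row)
    # Keep only IDs that occur more than once.
    return {node_id: n for node_id, n in counts.items() if n > 1}
```

Calling `duplicate_ids("executive.csv")` would then return a dictionary of the repeated IDs, which shows whether the file needs deduplicating before it is handed to the import tool.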

@sui0yi

sui0yi commented Jan 27, 2021

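@StevenJack1 Hello, regarding your point "2. In the executive_stock file, :END_ID is generated as stockpage\xxxxxx, so the stock ID cannot be found when the relationships are created": how exactly did you solve it? The images are not visible.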

@sui0yi

sui0yi commented Jan 31, 2021

@wangxiaojianim Hello, could you tell us which small problems the build_executive_stock method has?
