ThreadLocalRandom() random number distribution issue - sample_neighbor_layerwise OP #321

Open
LuconYang opened this issue Feb 22, 2021 · 0 comments

LuconYang commented Feb 22, 2021

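While testing the sample_neighbor_layerwise OP, I found that the final results differ from what I expected. I traced the discrepancy to the API_SAMPLE_L,3 step of the DAG, where the distribution of output nodes produced by neighbor sampling does not match expectations. The detailed results and reproduction steps are below: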

  • Test graph:

base.initialize_graph({ 'mode': 'local', 'data_path': '/tmp/euler', 'sampler_type': 'all', 'data_type': 'all' })

  • Function call:

sample_neighbor_layerwise([[1, 2, 3]], ['0', '1'], 10, -1, '') (executed in a loop 10,000 times)

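  • Log instrumentation (euler/common/compact_weighted_collection.h):
    Only three nodes are involved during execution (nodeId 1, nodeId 2, nodeId 3), and for each node, neighbor sampling first samples an edge type and then samples a node. Because each of these steps uses a different weight list, there are 6 cases in total, and I recorded the random numbers separately for each case.
    [Image: 15b6fa89-1c98-4de0-8980-0a027a6ca064]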
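
To make the two-stage procedure described above concrete, here is a minimal Python sketch of "sample an edge type, then sample a node" with the raw random draw logged per case. It is not the code in compact_weighted_collection.h: the class `WeightedCollection`, the function `sample_neighbor`, and the graph weights are all hypothetical, and Python's `random.Random` merely stands in for ThreadLocalRandom(). It uses one node (so 2 cases) rather than the 3 nodes and 6 cases in this report.

```python
import bisect
import itertools
import random
from collections import defaultdict


class WeightedCollection:
    """Hypothetical stand-in for a weighted collection (not Euler's class)."""

    def __init__(self, items, weights):
        self.items = list(items)
        # Prefix sums let us map a uniform draw onto a weighted slot.
        self.cumulative = list(itertools.accumulate(weights))

    def sample(self, rng, draws):
        u = rng.random()  # uniform in [0, 1)
        draws.append(u)  # record the raw draw for later analysis
        target = u * self.cumulative[-1]
        # First index whose cumulative weight exceeds the target.
        idx = bisect.bisect_right(self.cumulative, target)
        return self.items[min(idx, len(self.items) - 1)]


def sample_neighbor(node_id, edge_types, neighbors, rng, log):
    """Sample an edge type first, then a neighbor reachable via that type."""
    edge_type = edge_types[node_id].sample(rng, log[(node_id, "edge_type")])
    return neighbors[(node_id, edge_type)].sample(rng, log[(node_id, "neighbor")])


# Made-up graph: one node with two edge types; weights are illustrative only.
edge_types = {1: WeightedCollection(["0", "1"], [1.0, 3.0])}
neighbors = {
    (1, "0"): WeightedCollection([2, 3], [1.0, 1.0]),
    (1, "1"): WeightedCollection([4, 5, 6], [2.0, 1.0, 1.0]),
}

rng = random.Random(0)
log = defaultdict(list)  # case -> list of raw uniform draws
samples = [sample_neighbor(1, edge_types, neighbors, rng, log) for _ in range(10000)]
```

In this sketch each case draws its own fresh uniform number, so the values stored in `log` for any single case should themselves look uniform on [0, 1), whatever the weights are.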

  • Results:
    For each of the 6 cases, I tallied how often the random numbers fell into each interval.
    [Image: Screenshot 2021-02-22 11:35:56 AM]

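  • Question:
    Looking at the results, in all 6 cases the random numbers are not evenly distributed across the intervals, and some intervals even have a count of 0. This seems to contradict the purpose of ThreadLocalRandom(), which is to produce uniformly distributed random numbers. Is this result reasonable?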
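
One way to put a number on "not evenly distributed" is a Pearson chi-square test on the per-interval counts. The sketch below is only an illustration under assumptions: 10 equal-width intervals on [0, 1) and 10,000 draws are chosen for the example (the interval layout used in the screenshot is not known here), the helper names `bin_counts` and `chi_square_uniform` are hypothetical, and Python's `random.Random` stands in for ThreadLocalRandom().

```python
import random


def bin_counts(draws, num_bins):
    """Count how many draws in [0, 1) fall into each equal-width interval."""
    counts = [0] * num_bins
    for u in draws:
        counts[min(int(u * num_bins), num_bins - 1)] += 1
    return counts


def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Critical value for 9 degrees of freedom (10 bins) at the 5% level.
CHI2_CRIT_DF9 = 16.919

rng = random.Random(42)
uniform_draws = [rng.random() for _ in range(10000)]
# Deliberately skewed draws for contrast: squaring pushes mass toward 0.
skewed_draws = [u * u for u in uniform_draws]

uniform_stat = chi_square_uniform(bin_counts(uniform_draws, 10))
skewed_stat = chi_square_uniform(bin_counts(skewed_draws, 10))
print(f"uniform: {uniform_stat:.2f}, skewed: {skewed_stat:.2f}")
```

A statistic above `CHI2_CRIT_DF9` suggests the counts are unlikely to come from a uniform source. Under the assumptions of this sketch, a single empty interval alone contributes (0 - 1000)^2 / 1000 = 1000 to the statistic, far beyond the threshold.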
