Improvement to the test_spot_target_excludes test. #6652

Draft
wants to merge 22 commits into
base: master
from

Conversation

drivanov
Contributor

@drivanov drivanov commented Dec 1, 2023

Description

The main purpose of these changes is to eliminate the following 15 warnings:
  /usr/local/lib/python3.10/dist-packages/dgl/dataloading/dataloader.py:1149: DGLWarning: Dataloader CPU affinity opt is not enabled, consider switching it on (see enable_cpu_affinity() or CPU best practices for DGL [https://docs.dgl.ai/tutorials/cpu/cpu_best_practises.html])
    dgl_warning(

generated by the test_spot_target_excludes test. In addition, the test is now also parametrized over num_workers, with the values 1 and 4.
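
As a rough sanity check on the count of 15: the parameter lists below are copied from the test's `@pytest.mark.parametrize` decorators shown later in this conversation. The sketch assumes (this is not confirmed in the PR) that before this change the test was parametrized only over `degree_threshold` and `batch_size`, and that each case emitted the warning once. The helper `count_cases` is hypothetical and exists only for illustration.

```python
import itertools

# Parameter values as they appear in the test's parametrize decorators.
degree_thresholds = [1, 2, 3, 4, 5]
batch_sizes = [1, 10, 50]
num_workers_values = [1, 4]  # added by this PR


def count_cases(*param_lists):
    """Count the combinations that stacked parametrize decorators generate."""
    return sum(1 for _ in itertools.product(*param_lists))


# Assumption: the original test varied only the first two parameters.
cases_before = count_cases(degree_thresholds, batch_sizes)
cases_after = count_cases(degree_thresholds, batch_sizes, num_workers_values)

print(cases_before, cases_after)  # 15 30
```

Under that assumption, 5 x 3 = 15 cases matches the 15 warnings, and adding `num_workers` doubles the number of generated cases to 30.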

Checklist

Please feel free to remove inapplicable items for your PR.

  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Commit ID: 799256cf9001183ac607dd2215a0fc2cf157321d

Build ID: 1

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Commit ID: a279d3315cde882a7c0ea0a5d98f3fc4276435fb

Build ID: 2

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 1, 2023

Commit ID: 8a236bc

Build ID: 3

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 9, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 9, 2023

Commit ID: 6894c64

Build ID: 4

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 11, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 11, 2023

Commit ID: 363fdf4

Build ID: 5

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 11, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 11, 2023

Commit ID: d340cc1

Build ID: 6

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 13, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 13, 2023

Commit ID: ce67fe9

Build ID: 7

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 13, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 13, 2023

Commit ID: faae457

Build ID: 8

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 20, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 20, 2023

Commit ID: f0dc7a6

Build ID: 9

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Dec 30, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Dec 30, 2023

Commit ID: c874e95

Build ID: 10

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 10, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Jan 10, 2024

Commit ID: 6612adf

Build ID: 16

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 17, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Jan 17, 2024

Commit ID: 49ee714

Build ID: 17

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 19, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Jan 19, 2024

Commit ID: 0f053e6

Build ID: 18

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 22, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Jan 22, 2024

Commit ID: fb46bb4

Build ID: 19

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 28, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Jan 28, 2024

Commit ID: 99e10df

Build ID: 20

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Feb 1, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Feb 1, 2024

Commit ID: 83fa16a

Build ID: 21

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@frozenbugs
Collaborator

@dgl-bot

@frozenbugs frozenbugs self-requested a review February 9, 2024 02:43
@dgl-bot
Collaborator

dgl-bot commented Feb 9, 2024

Commit ID: 1cae14f54e5253d1d1321c814b430e38036b1fcc

Build ID: 22

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Feb 9, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Feb 9, 2024

Commit ID: b3a66eb

Build ID: 23

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@frozenbugs
Collaborator

@dgl-bot

@frozenbugs
Collaborator

frozenbugs commented Feb 19, 2024

@drivanov

================================== FAILURES ===================================
______________________ test_spot_target_excludes[1-1-1] _______________________

degree_threshold = 1, batch_size = 1, num_workers = 1

    @pytest.mark.parametrize("degree_threshold", [1, 2, 3, 4, 5])
    @pytest.mark.parametrize("batch_size", [1, 10, 50])
    @pytest.mark.parametrize("num_workers", [1, 4])
    def test_spot_target_excludes(degree_threshold, batch_size, num_workers):
        g, reverse_eids, seed_edges = _create_homogeneous()
        sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
        low_degree_excluder = dgl.dataloading.SpotTarget(
            g,
            exclude="reverse_id",
            degree_threshold=degree_threshold,
            reverse_eids=reverse_eids,
        )
        sampler = dgl.dataloading.as_edge_prediction_sampler(
            sampler,
            exclude=low_degree_excluder,
            negative_sampler=dgl.dataloading.negative_sampler.Uniform(1),
        )
        dataloader = dgl.dataloading.DataLoader(
            g,
            seed_edges,
            sampler,
            batch_size=batch_size,
            num_workers=num_workers,
        )
    
        with dataloader.enable_cpu_affinity():
            for i, (input_nodes, pair_graph, neg_pair_graph, blocks) in enumerate(
>               dataloader
            ):

tests\python\pytorch\dataloading\test_spot_target.py:58: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
python\dgl\dataloading\dataloader.py:1160: in __iter__
    self, super().__iter__(), num_threads=num_threads
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py:435: in __iter__
    return self._get_iterator()
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py:381: in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py:1034: in __init__
    w.start()
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py:112: in start
    self._popen = self._Popen(self)
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py:223: in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py:322: in _Popen
    return Popen(process_obj)
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py:89: in __init__
    reduction.dump(process_obj, to_child)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

obj = <Process(Process-117, initial daemon)>
file = <_io.BufferedWriter name=11>, protocol = None

    def dump(obj, file, protocol=None):
        '''Replacement for pickle.dump() using ForkingPickler.'''
>       ForkingPickler(file, protocol).dump(obj)
E       AttributeError: Can't pickle local object 'DataLoader.enable_cpu_affinity.<locals>.init_fn'

C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py:60: AttributeError
---------------------------- Captured stdout call -----------------------------
1 DL workers are assigned to cpus [0], main process will use cpus [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
-------------------------- Captured stderr teardown ---------------------------
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
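
The log above fails inside `popen_spawn_win32.py` while pickling the worker process object, and the error names a function defined inside `enable_cpu_affinity`. With the `spawn` start method that Windows uses, everything handed to the child process has to be picklable, and a function defined inside another function is not. The minimal sketch below reproduces just that pickling behaviour with the standard library. It does not use DGL or PyTorch, and `make_local_init_fn` and `can_pickle` are hypothetical names chosen for illustration.

```python
import pickle


def make_local_init_fn():
    """Return a function defined in a local scope, like the init_fn in the log."""

    def init_fn(worker_id):
        return worker_id

    return init_fn


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError, TypeError):
        return False


# A locally defined function cannot be pickled by reference.
local_fn_picklable = can_pickle(make_local_init_fn())

# A function reachable by name from an importable module can be.
builtin_picklable = can_pickle(len)

print(local_fn_picklable, builtin_picklable)  # False True
```

The `fork` start method, the default on Linux for the Python versions in these logs, does not pickle the process object, which would be consistent with the failure showing up only in the Win64 CI stage.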

@dgl-bot
Collaborator

dgl-bot commented Feb 19, 2024

Commit ID: 345d45e1573b024d1d6d027dd751e58508c2e0cf

Build ID: 24

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

@drivanov
Contributor Author

@frozenbugs : Unfortunately, I am unable to reproduce the problem you mentioned in my Linux environment. This appears to be some kind of multi-threading bug that only appears on Windows. Similar problems were reported for PR#6187 and PR#6194 where I suggested similar fixes. I'll convert this to "Draft".

@drivanov drivanov marked this pull request as draft February 20, 2024 19:47
@dgl-bot
Collaborator

dgl-bot commented Feb 20, 2024

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Feb 20, 2024

Commit ID: 2767e32460e234000332f427cc3ae6f3553eed9c

Build ID: 25

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

3 participants