This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

Invalid syntax in run_distributed_engines.py #549

Open
Mak-Ta-Reque opened this issue May 23, 2022 · 1 comment

@Mak-Ta-Reque

Instructions To Reproduce the 🐛 Bug:

  1. what changes you made (git diff) or what code you wrote
diff --git a/dev/launch_slurm.sh b/dev/launch_slurm.sh
index 193b09e..60b3e9d 100755
--- a/dev/launch_slurm.sh
+++ b/dev/launch_slurm.sh
@@ -27,7 +27,7 @@ CFG=( "$@" )
 
 # create a temporary experiment folder to run the SLURM job in isolation
 RUN_ID=$(date +'%Y-%m-%d-%H-%M-%S')
-EXP_ROOT_DIR="/checkpoint/$USER/vissl/$RUN_ID"
+EXP_ROOT_DIR="/netscratch/kadir/slurm-training/checkpoint/$USER/vissl/$RUN_ID"
 CHECKPOINT_DIR=${CHECKPOINT_DIR:-"$EXP_ROOT_DIR/checkpoints/"}
 
 echo "EXP_ROOT_DIR: $EXP_ROOT_DIR"

  2. what exact command you run:
    cd $HOME/vissl && NODES=8 NUM_GPU=8 GPU_TYPE=V100 MEM=200g CPU=8 EXPT_NAME=swav_100ep_rn50_in1k OUTPUT_DIR=/tmp/swav/ PARTITION=learnfair BRANCH=v0.1.6 NUM_DATA_WORKERS=4 MULTI_PROCESSING_METHOD=forkserver ./dev/launch_slurm.sh config=pretrain/swav/swav_8node_resnet config.OPTIMIZER.num_epochs=100 config.SLURM.USE_SLURM=true
  3. what you observed (including full logs):
EXP_ROOT_DIR: /netscratch/kadir/slurm-training/checkpoint/kadir/vissl/2022-05-23-11-25-25
CHECKPOINT_DIR: /netscratch/kadir/slurm-training/checkpoint/kadir/vissl/2022-05-23-11-25-25/checkpoints/
  File "/netscratch/kadir/slurm-training/checkpoint/kadir/vissl/2022-05-23-11-25-25/tools/run_distributed_engines.py", line 23
    def hydra_main(overrides: List[Any]):
                            ^
SyntaxError: invalid syntax
  4. please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there is no obvious error in "what you observed" provided above,
please tell us the expected behavior.

Environment:

Provide your environment information using the following command:

wget -nc -q https://github.com/facebookresearch/vissl/raw/main/vissl/utils/collect_env.py && python collect_env.py
-------------------  ------------------------------------------------------------------------
sys.platform         linux
Python               3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0]
numpy                1.19.5
Pillow               8.4.0
vissl                0.1.6 @/home/kadir/vissl/vissl
GPU available        False
torchvision          0.11.2+cu102 @/home/kadir/.local/lib/python3.6/site-packages/torchvision
hydra                1.0.7 @/home/kadir/.local/lib/python3.6/site-packages/hydra
apex                 unknown
PyTorch              1.10.1+cu102 @/home/kadir/.local/lib/python3.6/site-packages/torch
PyTorch debug build  False
-------------------  ------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

CPU info:
-------------------  ------------------------------
Architecture         x86_64
CPU op-mode(s)       32-bit, 64-bit
Byte Order           Little Endian
CPU(s)               24
On-line CPU(s) list  0-23
Thread(s) per core   2
Core(s) per socket   6
Socket(s)            2
NUMA node(s)         4
Vendor ID            AuthenticAMD
CPU family           21
Model                2
Model name           AMD Opteron(tm) Processor 6348
Stepping             0
CPU MHz              2954.387
CPU max MHz          2800.0000
CPU min MHz          1400.0000
BogoMIPS             5599.95
Virtualization       AMD-V
L1d cache            16K
L1i cache            64K
L2 cache             2048K
L3 cache             6144K
NUMA node0 CPU(s)    0-5
NUMA node1 CPU(s)    6-11
NUMA node2 CPU(s)    12-17
NUMA node3 CPU(s)    18-23
-------------------  ------------------------------

When to expect Triage

VISSL devs and contributors aim to triage issues as soon as possible; however, as a general guideline, we ask users to expect triaging in 1-2 weeks.

@Mak-Ta-Reque changed the title from "Please read & provide the following" to "Invalid syntax in run_distributed_engines.py" on May 23, 2022
@QuentinDuval
Contributor

Hi @Mak-Ta-Reque,

Thank you for using VISSL and reaching out to us :)

At first sight, this error is low-level and odd: it looks like the Python interpreter is failing to parse the file. Could you check whether the file runs when you invoke it directly with python (without going through launch_slurm.sh)? If it does not, what do you observe?

Thank you,
Quentin
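
One way to carry out the check suggested above is a small script that reports which interpreter is running and whether it can compile a given file. This is only a sketch: the function name `check_parses` and the script name `check_parse.py` are made up for illustration and are not part of VISSL. It is written to run on both Python 2 and Python 3, because a SyntaxError at the colon of a parameter annotation such as `overrides: List[Any]` is what a Python 2 interpreter would report; whether that is what happens in this job is an assumption, not something the logs confirm.

```python
import sys


def check_parses(path):
    """Return (ok, message) after compiling `path` with the running interpreter.

    Hypothetical helper for illustration; not part of VISSL.
    """
    # Read raw bytes so the same code works on Python 2 and Python 3.
    with open(path, "rb") as handle:
        source = handle.read()
    try:
        # compile() only parses and byte-compiles; it does not execute the file.
        compile(source, path, "exec")
    except SyntaxError as err:
        return False, "line {}: {}".format(err.lineno, err.msg)
    return True, "parsed OK"


if __name__ == "__main__":
    # Show which interpreter is actually in use before checking any file.
    print("interpreter: {} ({})".format(sys.executable, sys.version.split()[0]))
    for path in sys.argv[1:]:
        ok, message = check_parses(path)
        print("{}: {}".format(path, message))
```

Saved as, say, `check_parse.py`, it could be run as `python check_parse.py tools/run_distributed_engines.py`, once from the login shell and once inside the SLURM job, using whatever `python` command each environment resolves. If the two runs report different interpreter versions, that difference would be worth including in the reply.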
