CONCOCT - OpenBLAS Warning #551

Open
Peter-Kille opened this issue Dec 13, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@Peter-Kille

Description of the bug

CONCOCT is exceeding the 24-hour runtime. For the last 20 hours the only activity has been in the .command.log and .command.err files, which are being updated with the following repeated error:
p and running. Check /mnt/scratch/c1711572/mag_nf/work/df/9ed083848ec2dbe65e17338428a179/MEGAHIT-CONCOCT-group-4_log.txt for progress
/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py:1858: FutureWarning: Feature names only support names that are all strings. Got feature names with dtypes: ['int', 'str']. An error will be raised in 1.2.
warnings.warn(
/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py:1858: FutureWarning: Feature names only support names that are all strings. Got feature names with dtypes: ['int', 'str']. An error will be raised in 1.2.
warnings.warn(
Setting 24 OMP threads
Generate input data
OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.
OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.
......
The OpenBLAS warning then repeats 25,200 times.
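For anyone trying to reproduce this, a repeat count like the one above can be obtained with `grep -c`. This is a minimal sketch, assuming the warning text matches the lines quoted in the report; the function name `count_openblas_warnings` is hypothetical, and the demo builds its own small sample log rather than reading a real `.command.err`.

```shell
#!/bin/bash
# Count how many times the OpenBLAS warning appears in a log file.
# ASSUMPTION: the function name is illustrative; for a real run, pass the
# path of the .command.err file in the task's Nextflow work directory.
count_openblas_warnings() {
    local logfile="$1"
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so '|| true' keeps the function from failing.
    grep -c 'OpenBLAS Warning : Detect OpenMP Loop' "$logfile" || true
}

# Self-contained demo on a sample log built from the lines in the report.
sample_log="$(mktemp)"
printf '%s\n' \
    'Setting 24 OMP threads' \
    'Generate input data' \
    'OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.' \
    'OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.' \
    > "$sample_log"

count_openblas_warnings "$sample_log"   # prints 2
```

Running the function twice a few minutes apart on the live log is a cheap way to see whether the task is still emitting warnings.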

Command used and terminal output

Script used:

#!/bin/bash
#SBATCH --partition=jumbo      # the requested queue
#SBATCH --nodes=1              # number of nodes to use
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=6GB             # in megabytes, unless unit explicitly stated
#SBATCH --error=%J.err         # redirect stderr to this file
#SBATCH --output=%J.out        # redirect stdout to this file
##SBATCH --mail-user=email@cardiff.ac.uk  # email address used for event notification
##SBATCH --mail-type=all 

  
echo "Some Usable Environment Variables:"
echo "================================="
echo "hostname=$(hostname)"
echo \$SLURM_JOB_ID=${SLURM_JOB_ID}


cat $0

module purge

module load nextflow/23.04.1
module load singularity/3.8.7

export NXF_OPTS="-Xms500M -Xmx4G"

workdir="/mnt/scratch/$USER/mag_nf"
reportdir="wtw_Hirwaun_reports"
outputdir="wtw_Hirwaun_mag_output"

mkdir $reportdir

nextflow run mag_2_5_1/ \
         -c cardiff_profile_epyc_slurm_091223 \
         -with-report "${reportdir}/${SLURM_JOB_ID}_report.html" \
         -with-dag "${reportdir}/${SLURM_JOB_ID}_flowchart.png" \
         -with-trace "${reportdir}/${SLURM_JOB_ID}_tracereport.txt" \
         -with-timeline "${reportdir}/${SLURM_JOB_ID}_timeline.html" \
         --gtdb_db '/mnt/scratch/nodelete/nextflow/mag/2.3.0/gtdbtk/gtdbtk_r202_data.tar.gz' \
         --cat_db '/mnt/scratch/nodelete/nextflow/mag/2.3.0/cat_prepare/CAT_prepare_20210107.tar.gz' \
         --checkm_db '/mnt/scratch/nodelete/nextflow/mag/2.3.0/checkm/checkm_data_2015_01_16.tar.gz' \
         --busco_db '/mnt/scratch/nodelete/nextflow/mag/2.3.0/busco' \
         --outdir ${outputdir} \
         --input ${workdir}/Hirwaun_mag.csv \
         --skip_spades \
         --coassemble_group \
         --binning_map_mode all \
         -resume

Relevant files

config_nextflow-log.zip

System information

Nextflow: 23.04.1
Hardware: Slurm HPC
Container: Singularity
OS: Linux
nf-core/mag: 2.5.1

@Peter-Kille Peter-Kille added the bug Something isn't working label Dec 13, 2023
@jfy133
Member

jfy133 commented Dec 13, 2023

Hi @Peter-Kille thanks for the report.

Unfortunately, we are well aware of CONCOCT's very slow running time (one of the authors, @alneberg, has acknowledged this and made a few suggestions, but I can't find them at the moment).

However, I've personally not seen that particular warning in other reports. Generally, it would imply that something is wrong with the biocontainer.

I'm on parental leave until January, so I can't look further into updating the container (if that is the source of the issue).

In the meantime, the general advice we've given to others is:

  1. Increase the number of CPUs given to the CONCOCT process
  2. Increase the wall time of CONCOCT (and presumably, in your case, of the main Nextflow job as well) and be patient
    • in previous cases the tool was still running, just extremely slowly. I don't know if that applies here
  3. Skip CONCOCT and rely on MaxBin2/MetaBAT2
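Points 1 and 2 can be sketched as a small extra Nextflow config written from the shell. This is only an illustration under stated assumptions: the process name `CONCOCT_CONCOCT`, the file name `concoct_resources.config`, and the values 32 CPUs and 72 hours are not taken from this thread and should be checked against the pipeline and the cluster's limits.

```shell
#!/bin/bash
# Write a Nextflow config fragment that raises the CPUs and wall time for
# one process.
# ASSUMPTIONS: the selector 'CONCOCT_CONCOCT', the output file name, and
# the resource values below are illustrative, not confirmed by the thread.
write_concoct_config() {
    local outfile="$1" cpus="$2" walltime="$3"
    # Unquoted heredoc delimiter so ${cpus} and ${walltime} are expanded.
    cat > "$outfile" <<EOF
process {
    withName: 'CONCOCT_CONCOCT' {
        cpus = ${cpus}
        time = '${walltime}'
    }
}
EOF
}

write_concoct_config concoct_resources.config 32 72h
cat concoct_resources.config
```

The resulting block could be merged into the custom config already passed with `-c`. If the main Nextflow job is itself submitted through SLURM, as in the script above, its own time limit would need to be long enough too. For point 3, the pipeline documentation should list the parameter that skips CONCOCT (I believe it is `--skip_concoct`, but that is worth verifying for 2.5.1).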

Finally, @alexhbnr found general problems with OpenBLAS on our (old, SGE) cluster. I don't think this is the same problem as yours, but you could still try the following:

  1. Set the number of OpenBLAS threads to 1 using an environment variable. I'll update this comment when I find the config example (I'm currently on my phone).

Edit: the relevant settings - https://github.com/nf-core/configs/blob/master/conf%2Fpipeline%2Fmag%2Feva.config#L7-L10
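As a rough idea of what such a setting might look like, here is a sketch that writes a config fragment exporting `OPENBLAS_NUM_THREADS=1` through Nextflow's `env` scope. It is not a copy of the linked file. The file name `single_thread_blas.config` is hypothetical, and whether the variable reaches the process inside the Singularity container depends on the local setup.

```shell
#!/bin/bash
# Write a Nextflow config fragment that limits OpenBLAS to a single thread.
# ASSUMPTIONS: the output file name is illustrative, and this is a sketch of
# the idea rather than the contents of the linked eva.config.
write_blas_env_config() {
    local outfile="$1"
    # Quoted heredoc delimiter: the body is written out literally.
    cat > "$outfile" <<'EOF'
env {
    OPENBLAS_NUM_THREADS = 1
}
EOF
}

write_blas_env_config single_thread_blas.config
cat single_thread_blas.config
```

The `env` block could equally be pasted into the custom config already passed to `nextflow run` with `-c`.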

@alneberg
Member

This is actually a frequently reported issue for CONCOCT. Please forgive my ignorance, but I don't recall the exact cause. I believe it has to do with how OpenBLAS is compiled inside the CONCOCT conda package. If you're really keen on using CONCOCT, you would have to try to create a container that does not have this issue. I believe the issue is easy enough to trigger with any small test run.

@Peter-Kille
Author

Dear both, thank you so much for taking the time to respond. As suggested, I will probably skip the CONCOCT step for now, since the current dataset is rather large, then test with a smaller dataset and report back.

I have previously used the nf-core/mag pipeline without the CONCOCT step and it has worked really well. Thanks so much for all your efforts in developing the pipeline; they are very much appreciated :)

@jfy133 jfy133 mentioned this issue Feb 16, 2024