Problem with internal loops #4

Open
RiescoR opened this issue Nov 22, 2022 · 2 comments

RiescoR commented Nov 22, 2022

I have an issue when running the pipeline on a personal comparison of my own genomes.
I used the suggested command line (python runner_personal.py -t 4 /absolute/path/to/folder/with/genes/), and in the "folder with genes" it creates a "_conspecific" folder that contains a "database" folder and a "script" folder.
The problem is that the pipeline systematically copies the master conspecifix folder into the "scripts" folder and then tries to create the same folder structure again, producing an endless nesting: master > folder with genes > script > master > folder with genes > script > ...
It copies the same folder again and again into the scripts directory, filling up the hard drive until either the maximum path length is reached or the drive is full, at which point it stops with an error.
Is this how the pipeline is supposed to work?
Thanks!

RiescoR commented Nov 22, 2022

OK, I have found the problem. The error is in the script runner_personal.py: shutil.copytree ends up in an endless loop because it is copying a directory into a subdirectory of itself (please see https://gitlab.com/apparmor/apparmor/-/issues/62).
When the master directory has many files this causes a crash, because the files are copied recursively almost endlessly until the available disk space is exhausted.
The solution is to omit the _conspecific folder when the script folder is created. I've made the following changes to runner_personal.py, which resolved the problem (see lines 5 and 53 of the attached code). If you have the same problem, you can replace the content of the runner_personal.py file with the attached code.
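
For readers who cannot open the attachment, here is a minimal sketch of that kind of fix. It is not the attached code: the function name copy_without_output_dir and the default folder name are assumptions made for illustration, and the only library feature it relies on is the ignore argument of shutil.copytree together with shutil.ignore_patterns.

```python
import shutil
from pathlib import Path


def copy_without_output_dir(src, dst, output_dir_name="_conspecific"):
    """Copy the tree at src to dst, skipping every entry named output_dir_name.

    Hypothetical helper: the name and the default value are assumptions.
    Skipping the output folder keeps copytree from descending into the
    directory that contains dst, which is what makes the copy recurse.
    """
    src = Path(src)
    dst = Path(dst)
    # ignore_patterns returns a callable that copytree consults for each
    # directory it visits; matching names are left out of the copy.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(output_dir_name))
    return dst
```

With this, copying the master folder into a destination that lives below its own "_conspecific" folder terminates, because that folder is never entered during the copy.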

runner_database.zip

RiescoR commented Dec 7, 2022

Actually, if you do not want to change anything, the best way to proceed is simply to keep your genomic data outside the conspecifix folder; that prevents the copy loop.
USEARCH also causes problems if you place the binary in the conspecifix folder. Always place it outside, ideally in a directory on your $PATH.
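
Both conditions can be checked before starting a run. The following is only a sketch under assumptions: is_inside and preflight are hypothetical helpers that are not part of the pipeline, and the binary name "usearch" is assumed to match how it is installed on your system.

```python
import shutil
from pathlib import Path


def is_inside(path, parent):
    """Return True if path is parent itself or lies somewhere below it."""
    path = Path(path).resolve()
    parent = Path(parent).resolve()
    return path == parent or parent in path.parents


def preflight(genes_dir, master_dir, binary="usearch"):
    """Return a list of warnings; an empty list means nothing was found.

    Hypothetical check: genes_dir is the folder with your genomes and
    master_dir is the conspecifix folder.
    """
    problems = []
    if is_inside(genes_dir, master_dir):
        problems.append(f"{genes_dir} is inside {master_dir}; move the data elsewhere")
    # shutil.which searches the directories listed in PATH for the binary.
    found = shutil.which(binary)
    if found is None:
        problems.append(f"{binary} was not found on PATH")
    elif is_inside(found, master_dir):
        problems.append(f"{binary} is inside {master_dir}; move the binary elsewhere")
    return problems
```

Calling preflight("/absolute/path/to/folder/with/genes/", "/path/to/conspecifix") before runner_personal.py would list any of these layout problems without touching the disk.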

RiescoR changed the title from "copy of the conspecific master folder in personal run" to "Problem with internal loops" on Dec 7, 2022