IndexError in pairtools split #145

Closed
MyBiscuit opened this issue Sep 27, 2022 · 8 comments

@MyBiscuit

I am getting a fatal error when using pairtools split (in split.py, I believe):
IndexError: pop index out of range

In the log file, I also found another issue:

ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

I am not an expert in Python, so I don't know how to proceed. Is the IndexError related to this pip message?

@agalitsyna
Member

agalitsyna commented Sep 27, 2022

Hi @BeetleMilk, before we look deeper into this, please report your pairtools version and share an example of the file you are using.

Are you sure this file is a valid pairs file with a header and columns? See the specification here: https://pairtools.readthedocs.io/en/latest/formats.html
I would also check the integrity of the file, e.g. that it is not corrupted at the end (for instance, some columns accidentally missing in the last line).
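
The integrity check suggested above can be sketched roughly as follows. This is a minimal illustration, not part of pairtools: the function name `find_short_lines` is hypothetical, and it assumes header lines start with `#`, that an optional `#columns:` header lists the column names separated by whitespace, and that data fields are tab-separated.

```python
from typing import Iterable, List, Tuple


def find_short_lines(lines: Iterable[str], sep: str = "\t") -> List[Tuple[int, int]]:
    """Return (line_number, n_fields) for data lines with an unexpected field count.

    The expected count comes from a "#columns:" header line if one is seen,
    otherwise from the first data line (an assumption of this sketch).
    """
    expected = None
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("#"):
            if line.startswith("#columns:"):
                # Column names in the header are whitespace-separated.
                expected = len(line[len("#columns:"):].split())
            continue
        if not line:
            continue  # ignore blank lines
        n_fields = len(line.split(sep))
        if expected is None:
            expected = n_fields
        elif n_fields != expected:
            problems.append((lineno, n_fields))
    return problems
```

For a very large file it is probably enough to feed in only the last few thousand lines (for example the output of `tail`), since a truncated write would show up at the end. In that case no header is present and the first line passed in sets the expected field count.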

@MyBiscuit
Author

I tried to check the version (pairtools --version) and this is what I got:

ImportError: this version of pandas is incompatible with numpy < 1.20.3
your numpy version is 1.19.2.
Please upgrade numpy to >= 1.20.3 to use this pandas version

So the problem must be the numpy version?

My input file is a .pairsam file. I think it is correct. File size: 315Gb.

$ head dedup.pairsam
## pairs format v1.0.0
#sorted: chr1-chr2-pos1-pos2
#shape: upper triangle
#genome_assembly: unknown
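
Since `pairtools --version` fails on the pandas import before it can print anything, the installed versions can instead be read from package metadata without importing the packages. A minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is available); the helper name `installed_versions` is hypothetical. On older interpreters, `pip list` gives the same information.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent.

    Only the installed metadata is read; the packages themselves are not
    imported, so a broken numpy/pandas pairing does not get in the way.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["numpy", "pandas", "scipy", "pairtools"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```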

@agalitsyna
Member

It looks like multiple packages in your environment are outdated.
If you use conda, you can upgrade everything by running:
conda update --all
(Note that this upgrades every package in your environment and may break packages other than pandas, numpy and pairtools.)

If you prefer pip, the solution is more focused and safer:

git clone https://github.com/open2c/pairtools
cd pairtools
pip install -r requirements.txt

The latter will install the most recent versions of pairtools dependencies.

If you don't want to harm your working environment accidentally, I would create a new conda environment for pairtools, install pairtools and its dependencies there, and try running split again.

@MyBiscuit
Author

I tried this, but got an error:

ERROR: Could not find a version that satisfies the requirement scipy>=1.7.0 (from versions: 0.8.0, 0.9.0, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.12.1, 0.13.0, 0.13.1, 0.13.2, 0.13.3, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.16.1, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0, 0.19.1, 1.0.0, 1.0.1, 1.1.0, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.3.0rc1, 1.3.0rc2, 1.3.0, 1.3.1, 1.3.2, 1.3.3, 1.4.0rc1, 1.4.0rc2, 1.4.0, 1.4.1, 1.5.0rc1, 1.5.0rc2, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4)
ERROR: No matching distribution found for scipy>=1.7.0

I tried installing an updated scipy, but that failed too:

$ pip install --user scipy==1.9.1
ERROR: Could not find a version that satisfies the requirement scipy==1.9.1 (from versions: 0.8.0, 0.9.0, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.12.1, 0.13.0, 0.13.1, 0.13.2, 0.13.3, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.16.1, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0, 0.19.1, 1.0.0, 1.0.1, 1.1.0, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.3.0rc1, 1.3.0rc2, 1.3.0, 1.3.1, 1.3.2, 1.3.3, 1.4.0rc1, 1.4.0rc2, 1.4.0, 1.4.1, 1.5.0rc1, 1.5.0rc2, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4)
ERROR: No matching distribution found for scipy==1.9.1

Do you have advice on how to update scipy?

@agalitsyna
Member

Hi @BeetleMilk, some of the packages in your environment are outdated. It might be that the Python version is too old, or that some other dependency is. Installing conda and creating a fresh environment should resolve this. Even if you have not used conda before, you can follow its online manual; it's pretty straightforward.
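
One way to test the "Python is too old" hypothesis: pip only offers releases whose declared Python requirement matches the running interpreter, so a version list that stops at scipy 1.5.4 is consistent with an old interpreter. The sketch below compares the interpreter against a minimum version. The minimums in `ASSUMED_MIN_PYTHON` are assumptions that should be checked against the scipy release notes, and the helper name is hypothetical.

```python
import sys
from typing import Optional, Sequence

# Assumed minimum Python versions; verify against the scipy release notes.
ASSUMED_MIN_PYTHON = {
    "scipy>=1.7.0": (3, 7),
    "scipy==1.9.1": (3, 8),
}


def interpreter_is_new_enough(
    min_version: Sequence[int],
    current: Optional[Sequence[int]] = None,
) -> bool:
    """Return True if `current` (default: the running Python) is >= min_version."""
    if current is None:
        current = sys.version_info[:3]
    # Compare only as many components as the minimum specifies.
    return tuple(current[: len(min_version)]) >= tuple(min_version)


if __name__ == "__main__":
    for requirement, minimum in ASSUMED_MIN_PYTHON.items():
        ok = interpreter_is_new_enough(minimum)
        print(f"{requirement}: needs Python >= {minimum}, ok = {ok}")
```

If the check fails, upgrading scipy with pip in the same environment cannot work; a fresh environment with a newer Python is needed first.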

@agalitsyna
Member

Hi @BeetleMilk, we've spotted one of the reasons for the popping bug during splitting. Here is the fix: 16ffd69
Feel free to try out the most recent GitHub version in your environment! (The warnings about outdated packages turned out to be misleading; they should not be a problem.)
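
For readers unfamiliar with the message: `IndexError: pop index out of range` is what Python raises when `list.pop(i)` is called with an index past the end of the list. The sketch below is a generic illustration of one common way this happens when removing several columns from a row. It is not the pairtools code and makes no claim about what commit 16ffd69 changes; the row contents, indices and the helper `pop_columns` are hypothetical.

```python
def pop_columns(fields, indices):
    """Remove several columns from `fields` in place and return their values
    in the order given by `indices`.

    Popping from the highest index down keeps the remaining indices valid;
    popping in ascending order shifts later columns to the left.
    """
    removed = {}
    for idx in sorted(indices, reverse=True):
        removed[idx] = fields.pop(idx)
    return [removed[idx] for idx in indices]


# Hypothetical row: the last two fields stand in for columns to split off.
row = ["read1", "chr1", "100", "chr2", "200", "+", "-", "UU", "sam1", "sam2"]
sams = pop_columns(row, [8, 9])

# A naive ascending pop fails on the second call because the list got shorter.
naive = ["a", "b", "c"]
naive.pop(1)
try:
    naive.pop(2)
except IndexError as err:
    message = str(err)
```

The same error also appears when a line simply has fewer fields than expected, which is why checking the file for truncated lines is a reasonable first step.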

@MyBiscuit
Author

Creating a new environment fixed the issue, thank you.
Are you going to release a new pairtools version with the bugfix?

@agalitsyna
Member

Yes, this fix will be included in the next release, but I'm not sure this change is big enough to warrant a release on its own. Feel free to propose new pairtools features here: #139
