Bug: error raised in pycytominer/__init__.py; reverting to the previous commit fixes it #414

Open
PaulaLlanos opened this issue May 10, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@PaulaLlanos

Example code with output

I get an error when I try to use pycytominer/__init__.py in the cluster.

code:

cd ~/ebs_tmp/${PROJECT_NAME}/workspace/software/pycytominer/
mkdir -p ../../log/${BATCH_ID}/
parallel \
  --max-procs ${MAXPROCS} \
  --ungroup \
  --eta \
  --joblog ../../log/${BATCH_ID}/collate.log \
  --results ../../log/${BATCH_ID}/collate \
  --files \
  --keep-order \
  python3 pycytominer/cyto_utils/collate_cmd.py ${BATCH_ID} pycytominer/cyto_utils/database_config/ingest_config.ini {1} \
  --tmp-dir ~/ebs_tmp \
  --aws-remote=s3://${BUCKET}/projects/${PROJECT_NAME}/workspace :::: ${PLATES}

error:

Traceback (most recent call last):
  File "pycytominer/cyto_utils/collate_cmd.py", line 2, in <module>
    from pycytominer.cyto_utils.collate import collate
  File "/home/ubuntu/ebs_tmp/2019_08_03_Adipocyte_CellPainting_Claussnitzer/workspace/software/pycytominer/pycytominer/__init__.py", line 1, in <module>
    from pycytominer import __about__, __config__
  File "/home/ubuntu/ebs_tmp/2019_08_03_Adipocyte_CellPainting_Claussnitzer/workspace/software/pycytominer/pycytominer/__config__.py", line 9, in <module>
    pd.options.mode.copy_on_write = True
  File "/home/ubuntu/.pyenv/versions/3.8.13/lib/python3.8/site-packages/pandas/_config/config.py", line 221, in __setattr__
    raise OptionError("You can only set the value of existing options")
pandas._config.config.OptionError: 'You can only set the value of existing options'

Issue description

The error is pasted above.

Expected behavior

The import works after deleting only "__config__" from the first line of the file __init__.py.

Screenshot 2024-04-30 at 9 26 42 AM

Additional information

This error was introduced by the most recently merged commit:

62eec0b

PaulaLlanos added the bug label May 10, 2024
@kenibrewer
Member

Hi Paula,

Could you let us know what version of pandas you have installed in your virtual environment?

This is a future-facing change we made to prepare for pandas 3.0.0, but we might need to raise the minimum pandas version specified for our project to ensure this option is available.

CC @d33bs

@d33bs
Member

d33bs commented May 12, 2024

Thanks Paula and Ken! I opened a draft PR with a fix that checks for the configuration option before attempting to apply it. As Ken mentioned, we should still verify the Pandas version before making these changes (to make sure that the conditional is also appropriate).
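To illustrate the idea of checking for an option before applying it, here is a minimal sketch. It is not the contents of the PR. The helper names `set_option_if_present` and `configure_pandas` are hypothetical, and the `SimpleNamespace` objects are stand-ins for an older and a newer options namespace. The guard assumes that a missing pandas option surfaces as an `AttributeError`-compatible error, so that `hasattr` reports it as absent.

```python
from types import SimpleNamespace


def set_option_if_present(namespace, name, value):
    """Set namespace.<name> = value only when the attribute already exists.

    Returns True if the value was applied, False if the option is missing.
    """
    if not hasattr(namespace, name):
        return False
    setattr(namespace, name, value)
    return True


def configure_pandas():
    """Hypothetical helper: enable copy-on-write where pandas offers it."""
    import pandas as pd  # imported lazily so the guard itself needs no pandas

    return set_option_if_present(pd.options.mode, "copy_on_write", True)


# Stand-ins (illustrative only): one namespace without the option, one with it.
old_mode = SimpleNamespace(chained_assignment="warn")
new_mode = SimpleNamespace(chained_assignment="warn", copy_on_write=False)

applied_old = set_option_if_present(old_mode, "copy_on_write", True)
applied_new = set_option_if_present(new_mode, "copy_on_write", True)
```

With a guard like this, importing the package in an environment whose pandas lacks the option would skip the setting instead of raising `OptionError`.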

As an aside, just in case it's helpful, please note that Python 3.8 is reaching end-of-life later this year around October (see here for more).

@PaulaLlanos
Author

Thank you, everyone!
The pandas version I have installed is 2.2.0.
Here are the versions of the rest of my dependencies:

agate==1.9.1
agate-dbf==0.2.2
agate-excel==0.4.1
agate-sql==0.7.2
Babel==2.14.0
backports.tempfile==1.0
backports.weakref==1.0.post1
boltons @ file:///work/ci_py311/boltons_1677685195580/work
brotlipy==0.7.0
certifi @ file:///croot/certifi_1683875369620/work/certifi
cffi @ file:///work/ci_py311/cffi_1676822533496/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.7
conda @ file:///croot/conda_1689269889729/work
conda-content-trust @ file:///work/ci_py311/conda-content-trust_1676851327497/work
conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1685032319139/work/src
conda-package-handling @ file:///croot/conda-package-handling_1685024767917/work
conda_package_streaming @ file:///croot/conda-package-streaming_1685019673878/work
configparser==6.0.0
cryptography @ file:///croot/cryptography_1686613057838/work
csvkit==1.3.0
cytominer-database==0.3.4
dbfread==2.0.7
et-xmlfile==1.1.0
greenlet==3.0.3
idna @ file:///work/ci_py311/idna_1676822698822/work
isodate==0.6.1
joblib==1.3.2
jsonpatch @ file:///tmp/build/80754af9/jsonpatch_1615747632069/work
jsonpointer==2.1
leather==0.3.4
libmambapy @ file:///croot/mamba-split_1685993156657/work/libmambapy
numpy==1.26.3
olefile==0.47
openpyxl==3.1.2
packaging @ file:///croot/packaging_1678965309396/work
pandas==2.2.0
parsedatetime==2.6
pluggy @ file:///work/ci_py311/pluggy_1676822818071/work
pyarrow==15.0.0
pycosat @ file:///work/ci_py311/pycosat_1676838522308/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
-e git+https://github.com/cytomining/pycytominer.git@00d8f0b49021db7d9a6b7da570bc7a98cc098764#egg=pycytominer
pyOpenSSL @ file:///croot/pyopenssl_1678965284384/work
PySocks @ file:///work/ci_py311/pysocks_1676822712504/work
python-dateutil==2.8.2
python-slugify==8.0.1
pytimeparse==1.1.8
pytz==2023.3.post1
requests @ file:///croot/requests_1682607517574/work
ruamel.yaml @ file:///work/ci_py311/ruamel.yaml_1676838772170/work
scikit-learn==1.4.0
scipy==1.12.0
six @ file:///tmp/build/80754af9/six_1644875935023/work
SQLAlchemy==1.4.51
text-unidecode==1.3
threadpoolctl==3.2.0
toolz @ file:///work/ci_py311/toolz_1676827522705/work
tqdm @ file:///croot/tqdm_1679561862951/work
typing_extensions==4.9.0
tzdata==2023.4
urllib3 @ file:///croot/urllib3_1686163155763/work
xlrd==2.0.1
zstandard @ file:///work/ci_py311_2/zstandard_1679339489613/work

@d33bs
Member

d33bs commented May 13, 2024

Thanks @PaulaLlanos! Based on what you mentioned, I looked into Pandas 2.2.0 to find out more about the availability of the config option. I believe Pandas 2.2.0 makes the copy_on_write configuration option available (see here).

I noticed the logs you shared on opening this issue included a reference to a Python 3.8 environment managed by pyenv: /home/ubuntu/.pyenv/versions/3.8.13/lib/python3.8/.... It could be that your conda or pyenv environments have somehow become decoupled or misconfigured.

When you have the chance, could you confirm whether you use conda and pyenv? Also, does the conda environment relate to the same environment used within your parallel ... python3 command? A quick way to double check might be to use something like parallel ... python3 -m pip freeze > deps.txt.
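As a complement to `pip freeze`, a short script along these lines could be run through the same `parallel ... python3` invocation to report which interpreter and package versions each job actually sees. This is only a sketch: the function name `describe_environment` and the default package list are illustrative choices, not part of pycytominer.

```python
import json
import sys
from importlib import metadata


def describe_environment(packages=("pandas", "numpy")):
    """Report the running interpreter and the installed versions of packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed for this interpreter
    return {
        "executable": sys.executable,        # path of the Python actually running
        "python": sys.version.split()[0],    # e.g. the "3.8.13" seen in the traceback
        "packages": versions,
    }


if __name__ == "__main__":
    print(json.dumps(describe_environment(), indent=2))
```

If the reported executable or pandas version differs from what the activated environment shows, the jobs are not running in the environment that `pip freeze` was taken from.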

@PaulaLlanos
Author

That is correct, I am using pyenv to create my environment. I ran the command, and it looks like parallel is using the same dependencies as my environment; see below.
I haven't yet tried running it again, but it seems it should be fine now. Thanks a lot again, and I will let you know if I run into another issue.

(The output is identical, line for line, to the dependency list in my earlier comment above, including pandas==2.2.0.)

@PaulaLlanos
Author

Hello again, I just wanted to let you know that a colleague encountered the same issue on Monday. Thank you for your attention to this matter. I will let you know if anyone else here runs into the same issue.

@d33bs
Member

d33bs commented May 14, 2024

Thank you @PaulaLlanos for the updates and sorry to hear this is impacting another person. Based on what you've mentioned I've moved #415 out of draft status and will seek review to help avoid this in the future.

Something that concerns me about the environment you mention is that Pandas 2.2.0 is incompatible with Python 3.8 (see here); I gathered the Python 3.8 version from the error message in your original post. It makes me wonder whether the parallel command you're using is spawning processes that don't have access to the activated conda environment and, as a result, fall back to the system-available python3 (and whatever packages are installed there). One way to rule this out would be to use the absolute path to python3 (i.e. the one found through which python3), so that you specify exactly which Python environment the parallel processes use.
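A rough sketch of that check in Python, in case it helps: it resolves a command name the way `which` would and asks the resulting interpreter to identify itself. The function name `interpreter_report` is hypothetical, and the result only reflects the shell it is run from, so it would need to be run in the same context as the parallel jobs to be meaningful.

```python
import shutil
import subprocess
import sys


def interpreter_report(command="python3"):
    """Resolve `command` on PATH and ask that interpreter what it is.

    Returns None when the command cannot be found.
    """
    resolved = shutil.which(command)
    if resolved is None:
        return None
    probe = "import sys; print(sys.executable); print(sys.version.split()[0])"
    result = subprocess.run(
        [resolved, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    executable, version = result.stdout.splitlines()[:2]
    return {"resolved": resolved, "executable": executable, "version": version}


if __name__ == "__main__":
    # Compare the bare command with the interpreter running this script.
    print("python3      ->", interpreter_report("python3"))
    print("this process ->", interpreter_report(sys.executable))
```

If the two lines disagree, passing the absolute path from the second line to parallel would pin the jobs to the intended environment.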

@kenibrewer
Member

I approved @d33bs's fix and we'll get it merged as soon as possible. I agree with his assessment: something about your setup suggests that the dependency versions you reported are likely not the ones actually being run and causing the error.

@d33bs
Member

d33bs commented May 14, 2024

Thanks @kenibrewer !

@PaulaLlanos, a fix has been applied as of f3ce792, which I believe should mitigate the exception you mentioned. Just the same, I recommend double-checking your environment to ensure you're using the Python package versions you need for your work (differences between Pandas 1.x.x and 2.x.x could impact data results, for example).

@PaulaLlanos
Author

Oh, thank you all very much. And I'm sorry for the misunderstanding; I thought the fix had already been merged on Monday, which is why I mentioned someone else having the same issue.

Regarding the dependencies, we typically run those commands as part of our cell painting assay, and we haven't encountered this issue before. However, since you mentioned that some changes are coming in the Python/Pandas version, we will double-check the environment and update accordingly. Thank you again!
