Scrapy and Great Expectations: Error - __provides__ #6307

Closed
culpgrant opened this issue Apr 5, 2024 · 13 comments

@culpgrant

Description

I am trying to use Scrapy and Great Expectations in the same virtual environment but there is an issue depending on the order I import the packages in.

I created an issue for Great Expectations with additional details.

They mentioned it might be related to abc being monkey-patched.

Steps to Reproduce

This does work:

import great_expectations
import scrapy

This does not work:

import scrapy
import great_expectations

Error:

Traceback (most recent call last):
  File "/Users/grant/vs_code_projects/grants_projects/test_environment.py", line 2, in <module>
    import great_expectations
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/__init__.py", line 32, in <module>
    register_core_expectations()
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/registry.py", line 187, in register_core_expectations
    from great_expectations.expectations import core  # noqa: F401
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/core/__init__.py", line 1, in <module>
    from .expect_column_distinct_values_to_be_in_set import (
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/core/expect_column_distinct_values_to_be_in_set.py", line 12, in <module>
    from great_expectations.expectations.expectation import (
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/expectation.py", line 2350, in <module>
    class BatchExpectation(Expectation, ABC):
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/expectation.py", line 287, in __new__
    newclass._register_renderer_functions()
  File "/Users/grant/Envs/test_env/lib/python3.8/site-packages/great_expectations/expectations/expectation.py", line 369, in _register_renderer_functions
    attr_obj: Callable = getattr(cls, candidate_renderer_fn_name)
AttributeError: __provides__

Expected behavior: Be able to use the packages together in the same virtual environment

Actual behavior: Cannot import the packages together

Reproduces how often: 100%

Versions

Scrapy 2.11.1
great-expectations 0.18.12

Additional context

Looking for a possible solution on what could be done. Thank you!

@Gallaecio
Member

@wRAR
Member

wRAR commented Apr 5, 2024

Well, at least importing zope.interface (and twisted) instead of scrapy doesn't reproduce the error (I really hoped that would be the problem).

@wRAR wRAR added the bug label Apr 5, 2024
@VMRuiz
Contributor

VMRuiz commented Apr 8, 2024

I was able to reproduce this issue by importing twisted.internet.ssl, the module that exposes Certificate:

(great_expectations) ➜  scrapy git:(master) ✗ python
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import twisted.internet.ssl
>>> import great_expectations
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/__init__.py", line 32, in <module>
    register_core_expectations()
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/registry.py", line 187, in register_core_expectations
    from great_expectations.expectations import core  # noqa: F401
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/__init__.py", line 1, in <module>
    from .expect_column_distinct_values_to_be_in_set import (
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/expect_column_distinct_values_to_be_in_set.py", line 12, in <module>
    from great_expectations.expectations.expectation import (
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 2350, in <module>
    class BatchExpectation(Expectation, ABC):
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 287, in __new__
    newclass._register_renderer_functions()
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 369, in _register_renderer_functions
    attr_obj: Callable = getattr(cls, candidate_renderer_fn_name)
AttributeError: __provides__. Did you mean: '__providedBy__'?

@VMRuiz
Contributor

VMRuiz commented Apr 8, 2024

Importing Certificate directly from its internal package seems to work:

(great_expectations) ➜  scrapy git:(master) ✗ python
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from twisted.internet._sslverify import Certificate
>>> import great_expectations

So it must be related to the code that runs when twisted.internet.ssl is imported.

@VMRuiz
Contributor

VMRuiz commented Apr 8, 2024

I kept digging in the Twisted code and the culprit seems to be the BaseConnector(ABC) class at https://github.com/twisted/twisted/blob/1c80aad4c8fd2d0142433476bd5f6df5c511b4ba/src/twisted/internet/base.py#L1224

For some reason, the implementer decorator adds __provides__ to both BaseConnector and ABC classes:

>>> from zope.interface import classImplements, implementer
>>> from twisted.internet.interfaces import IConnector
>>> from abc import ABC
>>> @implementer(IConnector)
... class Test2(ABC):
...    pass
>>> import great_expectations
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/__init__.py", line 32, in <module>
    register_core_expectations()
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/registry.py", line 187, in register_core_expectations
    from great_expectations.expectations import core  # noqa: F401
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/__init__.py", line 1, in <module>
    from .expect_column_distinct_values_to_be_in_set import (
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/expect_column_distinct_values_to_be_in_set.py", line 12, in <module>
    from great_expectations.expectations.expectation import (
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 2350, in <module>
    class BatchExpectation(Expectation, ABC):
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 287, in __new__
    newclass._register_renderer_functions()
  File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 369, in _register_renderer_functions
    attr_obj: Callable = getattr(cls, candidate_renderer_fn_name)
AttributeError: __provides__. Did you mean: '__providedBy__'?
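
To illustrate how an attribute attached to a shared base class could break an unrelated subclass, here is a minimal stand-alone sketch. It assumes the attribute is a descriptor that only answers for the class it was created for. OwnerOnly, SharedBase and Unrelated are hypothetical names, and none of this is the actual zope.interface or Twisted code.

```python
from abc import ABC


class OwnerOnly:
    """Hypothetical descriptor: readable only through the class it was created for."""

    def __init__(self, owner):
        self._owner = owner

    def __get__(self, instance, cls):
        if instance is None and cls is self._owner:
            return self
        # Accessed through a subclass or an instance: behave as if missing.
        raise AttributeError("__provides__")


class SharedBase(ABC):  # stand-in for abc.ABC, so the real ABC is left untouched
    pass


# Simulates a library attaching the descriptor to a shared base class at import time.
SharedBase.__provides__ = OwnerOnly(SharedBase)


class Unrelated(SharedBase):  # stand-in for a class defined by another library
    pass


# dir() walks the MRO, so the inherited name is listed on the subclass...
listed = "__provides__" in dir(Unrelated)

# ...but the descriptor refuses the lookup when it goes through the subclass.
try:
    getattr(Unrelated, "__provides__")
    lookup_failed = False
except AttributeError:
    lookup_failed = True
```

Under that assumption, any code that loops over dir(cls) and calls getattr(cls, name) on each entry would hit the same AttributeError, but only if the base class was modified before the subclass is scanned, which would match the import-order dependence reported above.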

@culpgrant
Author

@VMRuiz Thank you for looking into this! Do you think I should create an issue with zope?

@VMRuiz
Contributor

VMRuiz commented Apr 9, 2024

To be honest, I don't know if this is a problem with Zope or a bad implementation in the Twisted lib. @wRAR What do you think?

As a workaround for Scrapy, maybe we could import Certificate from twisted.internet._sslverify in the meantime to avoid these side effects? There is some risk of this breaking in the future, but I wouldn't expect major changes from Twisted at this point.

@wRAR
Member

wRAR commented Apr 9, 2024

My first thought was also "I don't know if this is a problem with Zope or a bad implementation in the Twisted lib", as I'm not familiar with the zope.interface internals.

@GeorgeA92
Contributor

GeorgeA92 commented Apr 10, 2024

@culpgrant

Steps to Reproduce
This does work:

import great_expectations
import scrapy

If this works, what prevents you from just using this import order in your task?

@GeorgeA92
Contributor

Considering great-expectations/great_expectations#9698 (comment), I think that this issue is not related to Scrapy and its root cause is 100% in the Great Expectations codebase (it can be solved by adding a simple try/except block around the line that gave the AttributeError).
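
The try/except idea could look roughly like the sketch below. It is only an illustration: collect_callables, Refusing and Demo are made-up names, not Great Expectations code, and the real fix may be shaped differently.

```python
def collect_callables(cls):
    """Return {name: attribute} for callable class attributes, skipping unreadable ones."""
    found = {}
    for name in dir(cls):
        try:
            attr = getattr(cls, name)
        except AttributeError:
            # dir() listed the name, but a descriptor refused the lookup; skip it.
            continue
        if callable(attr):
            found[name] = attr
    return found


class Refusing:
    """Hypothetical descriptor that always refuses attribute access."""

    def __get__(self, instance, owner):
        raise AttributeError("refused")


class Demo:
    hidden = Refusing()

    @staticmethod
    def render():
        return "ok"


result = collect_callables(Demo)
```

Here the scan keeps going past the attribute that cannot be read instead of failing the whole import.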

@culpgrant
Author

This was determined to be a Great Expectations issue.

@Rishika70

Rishika70 commented Apr 29, 2024

The real question is why the import modifies a dependency instead of making a duplicate and modifying the copy. If that's what's going on, I think it's bad practice.
