Improve typing for Spider.parse(). #6274

Merged
merged 4 commits into from May 13, 2024
2 changes: 1 addition & 1 deletion scrapy/commands/bench.py
@@ -60,6 +60,6 @@ def start_requests(self) -> Iterable[Request]:
         url = f"{self.baseurl}?{urlencode(qargs, doseq=True)}"
         return [scrapy.Request(url, dont_filter=True)]

-    def parse(self, response: Response) -> Any:  # type: ignore[override]
+    def parse(self, response: Response) -> Any:
         for link in self.link_extractor.extract_links(response):
             yield scrapy.Request(link.url, callback=self.parse)
19 changes: 14 additions & 5 deletions scrapy/spiders/__init__.py
@@ -17,12 +17,17 @@
 from scrapy.utils.url import url_is_from_spider

 if TYPE_CHECKING:
+    from collections.abc import Callable
+
+    # typing.Concatenate requires Python 3.10
     # typing.Self requires Python 3.11
-    from typing_extensions import Self
+    from typing_extensions import Concatenate, Self

     from scrapy.crawler import Crawler
     from scrapy.settings import BaseSettings
+
+    CallbackT = Callable[Concatenate[Response, ...], Any]

Member Author

This could be useful in other places as it's a general type for a spider callback. I didn't think about that further though.
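
To illustrate what "a general type for a spider callback" means here, the following is a minimal sketch, not Scrapy code. `Response` is a hypothetical stand-in for `scrapy.http.Response`, and `dispatch`, `parse_plain`, and `parse_with_page` are names invented for the example. It assumes Python 3.10+ for `typing.Concatenate`; the alias sits under `TYPE_CHECKING`, as in the diff, so it is never evaluated at runtime.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any


class Response:
    """Hypothetical stand-in for scrapy.http.Response."""

    def __init__(self, url: str) -> None:
        self.url = url


if TYPE_CHECKING:
    # Only type checkers evaluate this block, mirroring the diff.
    from collections.abc import Callable
    from typing import Concatenate  # Python 3.10+; typing_extensions otherwise

    # First positional argument is a Response; anything may follow it.
    CallbackT = Callable[Concatenate[Response, ...], Any]


def parse_plain(response: Response) -> Any:
    return {"url": response.url}


def parse_with_page(response: Response, page: int = 1) -> Any:
    return {"url": response.url, "page": page}


def dispatch(callback: CallbackT, response: Response, **cb_kwargs: Any) -> Any:
    """Hypothetical helper: call any callback whose first argument is a Response."""
    return callback(response, **cb_kwargs)
```

Both `parse_plain` and `parse_with_page` should satisfy `CallbackT`, because the `...` leaves every parameter after the response unconstrained.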

Member Author

Actually we should use it for Request.callback

Member Author

Looks like we can't, probably because of wrong covariance when assigning a specific callback to a request. I need to investigate this separately.
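
One plausible reading of the variance problem, which the thread itself does not confirm, is that callable parameters are contravariant: a callback that only accepts a `Response` subclass is not assignable to a type that promises to accept any `Response`. The sketch below uses hypothetical stand-ins (`Response`, `TextResponse`, `Request`, `parse_text`) rather than the real Scrapy classes, and assumes Python 3.10+ for `typing.Concatenate`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any


class Response:
    """Hypothetical stand-in for scrapy.http.Response."""

    def __init__(self, url: str) -> None:
        self.url = url


class TextResponse(Response):
    """Hypothetical stand-in for a more specific response class."""

    def __init__(self, url: str, text: str) -> None:
        super().__init__(url)
        self.text = text


if TYPE_CHECKING:
    from collections.abc import Callable
    from typing import Concatenate  # Python 3.10+; typing_extensions otherwise

    CallbackT = Callable[Concatenate[Response, ...], Any]


class Request:
    """Hypothetical stand-in for scrapy.Request with a typed callback."""

    def __init__(self, url: str, callback: CallbackT | None = None) -> None:
        self.url = url
        self.callback = callback


def parse_text(response: TextResponse) -> Any:
    # Needs the subclass: a plain Response has no .text attribute.
    return len(response.text)


# This runs, but a type checker is expected to reject the argument:
# parse_text accepts only TextResponse, while CallbackT promises any Response.
request = Request("https://example.com", callback=parse_text)
```

If this reading is right, typing `Request.callback` as `CallbackT` would flag every callback annotated with a narrower response class.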



 class Spider(object_ref):
     """Base class for scrapy spiders. All spiders must inherit from this
@@ -79,10 +84,14 @@
     def _parse(self, response: Response, **kwargs: Any) -> Any:
         return self.parse(response, **kwargs)

-    def parse(self, response: Response, **kwargs: Any) -> Any:
-        raise NotImplementedError(
-            f"{self.__class__.__name__}.parse callback is not defined"
-        )
+    if TYPE_CHECKING:
+        parse: CallbackT
+    else:
+
+        def parse(self, response: Response, **kwargs: Any) -> Any:
+            raise NotImplementedError(
+                f"{self.__class__.__name__}.parse callback is not defined"
+            )

     @classmethod
     def update_settings(cls, settings: BaseSettings) -> None:
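
The `TYPE_CHECKING` / `else` split in the last hunk can be sketched in isolation as follows. This is an illustration under stated assumptions, not the Scrapy source: `Response`, `BaseSpider`, and `LinkSpider` are hypothetical stand-ins, and Python 3.10+ is assumed for `typing.Concatenate`. Type checkers see only the attribute annotation, so a subclass may define `parse` with a narrower signature without `# type: ignore[override]`, while at runtime the base method still raises.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any


class Response:
    """Hypothetical stand-in for scrapy.http.Response."""

    def __init__(self, url: str) -> None:
        self.url = url


if TYPE_CHECKING:
    from collections.abc import Callable
    from typing import Concatenate  # Python 3.10+; typing_extensions otherwise

    CallbackT = Callable[Concatenate[Response, ...], Any]


class BaseSpider:
    """Hypothetical stand-in for scrapy.Spider."""

    if TYPE_CHECKING:
        # Type checkers see only this declaration.
        parse: CallbackT
    else:
        # The interpreter sees only this default implementation.
        def parse(self, response, **kwargs):
            raise NotImplementedError(
                f"{self.__class__.__name__}.parse callback is not defined"
            )


class LinkSpider(BaseSpider):
    # No **kwargs here, yet no override error is expected from the checker.
    def parse(self, response: Response) -> Any:
        yield response.url
```

Calling `parse` on a spider that does not define it still fails with `NotImplementedError`, so runtime behavior matches the pre-change method.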