Teach making requests in a consistent way #950

Open
honzajavorek opened this issue Apr 26, 2024 · 0 comments
Assignees
Labels
academy Issues related to Web Scraping and Apify academies. enhancement New feature or request.

Comments

@honzajavorek
Collaborator

The courses currently recommend a variety of tools for making HTTP requests. Sometimes that's more confusing, sometimes less.

  • The got library seems to be superseded by ky; at least Sindre mentions it in the README.
  • Apify develops got-scraping, which seems to be married to the got library. Wouldn't it make sense to turn got-scraping into something more agnostic to a client library? Could it just prepare the request details, so that they can be attached to the request by any library?
  • Some guides use axios
  • Some guides mention request and request-promise, which are now both deprecated
  • Meanwhile, Node.js has adopted fetch into its standard library
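
To make the "client-agnostic" idea from the second point more concrete, here is a minimal sketch. It assumes a hypothetical helper named prepareRequestDetails, which is not part of got-scraping or any existing library, and the header values are illustrative placeholders rather than a realistic generated fingerprint. The point is only that a plain object of request details could be attached to a request by whichever client the reader prefers.

```javascript
// Hypothetical helper (NOT an existing got-scraping API): it only builds
// request details and leaves the sending to whichever HTTP client is used.
function prepareRequestDetails({ locale = 'en-US' } = {}) {
    return {
        headers: {
            // Illustrative static values; a real tool would generate a
            // consistent, browser-like set of headers instead.
            'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'accept-language': `${locale},en;q=0.9`,
        },
    };
}

const details = prepareRequestDetails({ locale: 'en-US' });

// The plain object can be attached to a request for the built-in fetch
// (the Request class is global in Node.js 18+; nothing is sent here).
const request = new Request('https://example.com/', { headers: details.headers });
console.log(request.headers.get('accept-language'));

// Other clients accept the same headers object, for example:
// axios.get(url, { headers: details.headers })
// got(url, { headers: details.headers })
```

With this shape, the choice of HTTP client becomes a teaching decision rather than a dependency of the header-generation logic.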

Using got-scraping in the basic tutorial is probably unnecessary; any HTTP client can be used in the initial lessons. The value of got-scraping should emerge with more complicated use cases.
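
As an illustration of how little an initial lesson might need, here is a minimal sketch using only the fetch built into Node.js 18+. The function name downloadHtml and its fetchImpl parameter are made up for this example, and the URL in the usage comment is a placeholder.

```javascript
// Downloads a page with the fetch that ships with Node.js 18+.
// `fetchImpl` is an illustrative parameter so that a different client
// wrapper or a test stub can be swapped in; it defaults to the global fetch.
async function downloadHtml(url, fetchImpl = fetch) {
    const response = await fetchImpl(url);
    // fetch does not reject on HTTP error statuses, so check explicitly.
    if (!response.ok) {
        throw new Error(`Request to ${url} failed with HTTP ${response.status}`);
    }
    return response.text();
}

// Usage (performs a real network request; the URL is a placeholder):
// downloadHtml('https://example.com/').then((html) => console.log(html.length));
```

Nothing here depends on a third-party package, which keeps the first lessons focused on HTTP itself rather than on a particular client.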

But wouldn't it make more sense to skip got-scraping and promote Crawlee right away at that point? Is got-scraping something Apify wants to spend marketing energy on, or is it an implementation detail?

As of now, got-scraping has no good Python alternative that I know of. There are independent libraries one can use, such as fake-user-agent, which have integrations with scraping frameworks.

Regarding request libraries, the scene is similarly fragmented in Python, featuring requests, aiohttp, and httpx, each with its own fans and use cases.

I'd like to kick this off as a discussion of what the preferred way for the Academy to teach making requests in 2024 should be, using Node.js and Python.

@honzajavorek honzajavorek added the academy Issues related to Web Scraping and Apify academies. label Apr 26, 2024
@honzajavorek honzajavorek self-assigned this Apr 26, 2024
@honzajavorek honzajavorek added the enhancement New feature or request. label Apr 26, 2024