[Bug] hCaptcha detect fails due to new hCaptcha URL format #881

Open
vetinary opened this issue Mar 17, 2024 · 0 comments
Labels: issue: bug report, needs triage

The plugin works great with reCAPTCHA; however, it throws an error on pages with hCaptcha.

After some investigation, I traced it to the following problem:

There is a block of code inside the plugin where the hCaptcha parameters are extracted:

_extractInfoFromIframes(iframes) {
    return iframes
        .map(el => el.src.replace('.html#', '.html?'))
        .map(url => {
            const { searchParams } = new URL(url);
            const result = {
                _vendor: 'hcaptcha',
                url: document.location.href,
                id: searchParams.get('id'),
                sitekey: searchParams.get('sitekey'),
                display: {
                    size: searchParams.get('size') || 'normal'
                }
            };
            return result;
        });
}

The hCaptcha iframe URL now has the following format:

https://newassets.hcaptcha.com/captcha/v1/c44fc00/static/hcaptcha.html?_v=h8ew9h1l07#frame=challenge&id=0t7tnh8gx2un&host=mysite.com&sentry=undefined&reportapi=https%3A%2F%2Faccounts.hcaptcha.com&recaptchacompat=true&custom=false&tplinks=on&pstissuer=https%3A%2F%2Fpst-issuer.hcaptcha.com&sitekey=cf0b9a27-82e3-42fb-bfec-562f8045e495&size=invisible&theme=light&origin=https%3A%2F%2Fmysite.com

Since the URL contains no .html# substring (.html is followed by ?_v=…), the replacement leaves the URL unmodified. The id, sitekey and size parameters stay in the fragment after #, so they can't be extracted from the query string.
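
To illustrate, here is a minimal sketch that applies the same replacement to a shortened copy of the URL above and inspects what the URL parser sees. The variable names are only for illustration and are not taken from the plugin.

```javascript
// The iframe URL from this report, shortened to the relevant parameters.
const src =
  'https://newassets.hcaptcha.com/captcha/v1/c44fc00/static/hcaptcha.html' +
  '?_v=h8ew9h1l07#frame=challenge&id=0t7tnh8gx2un' +
  '&sitekey=cf0b9a27-82e3-42fb-bfec-562f8045e495&size=invisible';

// Same replacement the plugin applies; it changes nothing here because
// the substring '.html#' never occurs in the new format.
const rewritten = src.replace('.html#', '.html?');
const parsed = new URL(rewritten);

// True: the URL is left as it was.
const unchanged = rewritten === src;
// Only '_v' is part of the query string.
const queryKeys = [...parsed.searchParams.keys()];
// null: sitekey lives in the fragment, which searchParams does not parse.
const sitekeyFromQuery = parsed.searchParams.get('sitekey');
// The fragment still holds frame, id, sitekey and size.
const fragment = parsed.hash;
```

With id and sitekey both coming back as null, the "Missing data in captcha" error is the expected consequence.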

As a result, I get the following message in the logs:

PuppeteerExtraPluginRecaptcha: An error occured during "getRecaptchaSolutions": {
  _vendor: 'hcaptcha',
  provider: '2captcha',
  error: 'Error: Missing data in captcha'
}

I think a quick workaround could be something like this:
if the iframe URL contains .html?, replace '#' with '&', which turns the fragment parameters into ordinary GET parameters alongside _v; otherwise, replace .html# with .html? as before.
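
A rough sketch of that workaround, assuming it is applied to the iframe src before parsing. Both normalizeHcaptchaUrl and extractInfoFromSrc are hypothetical helper names, not functions that exist in the plugin, and this has only been checked against the URL format quoted in this report.

```javascript
// Hypothetical helper (not part of the plugin): move the fragment parameters
// into the query string so that URL.searchParams can read them.
function normalizeHcaptchaUrl(src) {
  if (src.includes('.html?')) {
    // New format: '.html?_v=...#frame=...' becomes '.html?_v=...&frame=...'
    // (a string pattern replaces only the first '#').
    return src.replace('#', '&');
  }
  // Old format: '.html#frame=...' becomes '.html?frame=...'
  return src.replace('.html#', '.html?');
}

// Hypothetical stand-in for the per-iframe part of _extractInfoFromIframes,
// reduced to the fields that depend on the URL.
function extractInfoFromSrc(src) {
  const { searchParams } = new URL(normalizeHcaptchaUrl(src));
  return {
    id: searchParams.get('id'),
    sitekey: searchParams.get('sitekey'),
    size: searchParams.get('size') || 'normal'
  };
}
```

An alternative that avoids string replacement would be to parse the fragment directly, e.g. new URLSearchParams(new URL(src).hash.slice(1)), since all the needed parameters are in the fragment in both formats.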

vetinary added the "issue: bug report" and "needs triage" labels on Mar 17, 2024