Google translate does not work with gatsby sites #6300

Closed
rdcw opened this issue Jul 4, 2018 · 33 comments
Labels: help wanted, type: bug

Comments

@rdcw

rdcw commented Jul 4, 2018

Description

When opening a gatsby site in google translate, it shows a flash of translated content and then shows a 404 page.

E.g.
https://translate.google.com/translate?sl=auto&tl=fr&js=y&hl=en&u=https%3A%2F%2Freactjs.org

or

https://translate.google.com/translate?sl=auto&tl=fr&js=y&hl=en&u=https%3A%2F%2Fabout.sourcegraph.com

Steps to reproduce

Open any gatsby page in google translate.

Expected result

To show the page with the translated text.

Actual result

A 404 page is shown.

Environment

Production

@m-allanson
Contributor

Huh. Google translate re-hosts the site (with translated content) in an iframe with a URL like https://translate.google.com/translate?hl=en&sl=en&tl=fr&u=about.sourcegraph.com.

Gatsby doesn't know it's being hosted on a different domain and tries to load the content for the /translate page, which doesn't exist, so it then shows the 404 page instead.

I've marked this as a bug but I'm not sure what the fix would be. Maybe Gatsby shouldn't load the 404 page if the initial SSR render isn't the 404 page?
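
To make the mismatch concrete, here is a minimal sketch (not Gatsby's actual code) of what a path-based page lookup sees when the site is viewed through a proxy URL. `knownPages` and `lookupPage` are hypothetical names used only for illustration; the URL is the one quoted above.

```javascript
// Hypothetical stand-in for the set of page paths a Gatsby build knows about.
const knownPages = new Set(["/", "/about/", "/404.html"])

// Hypothetical lookup: fall back to the 404 page when the pathname is unknown.
function lookupPage(pathname) {
  return knownPages.has(pathname) ? pathname : "/404.html"
}

// The browser location belongs to Google Translate, not to the translated site.
const proxied = new URL(
  "https://translate.google.com/translate?hl=en&sl=en&tl=fr&u=about.sourcegraph.com"
)

console.log(proxied.pathname) // "/translate"
console.log(proxied.searchParams.get("u")) // "about.sourcegraph.com"
console.log(lookupPage(proxied.pathname)) // "/404.html"
```

The real site path only survives inside the `u` query parameter, which a lookup keyed on the pathname never inspects.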

m-allanson added the type: bug and help wanted labels Jul 6, 2018
@KyleAMathews
Contributor

That sounds like it could be a reasonable fix. Check if a SSRed page is loaded and trust it over the client URL.
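
A rough sketch of that idea, assuming the server-rendered HTML exposes the path it was built for. `resolveInitialPath`, `ssrPagePath`, and `isKnownPage` are hypothetical names, not Gatsby APIs.

```javascript
// Decide which path the client should render.
// ssrPagePath: path the server-rendered HTML was built for (hypothetical input).
// browserPathname: the pathname the client sees in the address bar.
// isKnownPage: predicate telling whether the site has a page for a path.
function resolveInitialPath({ ssrPagePath, browserPathname, isKnownPage }) {
  if (isKnownPage(browserPathname)) return browserPathname
  // The URL does not belong to this site (for example, a proxy such as Google
  // Translate). Trust the server-rendered page unless it is itself the 404 page.
  if (ssrPagePath && ssrPagePath !== "/404.html") return ssrPagePath
  return "/404.html"
}

const isKnownPage = (p) => ["/", "/docs/"].includes(p)

console.log(
  resolveInitialPath({ ssrPagePath: "/docs/", browserPathname: "/translate", isKnownPage })
) // "/docs/"
```

A genuine 404 still renders as a 404 under this rule, because in that case the server-rendered page is the 404 page.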

@jdfm

jdfm commented Jul 30, 2018

This also seems to be an issue with google's webcache, as the url ends up looking like: http://webcache.googleusercontent.com/search?q=cache:[GOOGLE_CACHE_KEY]:[YOUR_CONTENTS_ENDPOINT][GOOGLE_APPENDED_QUERY_VARIABLES]

Which then causes it to redirect to a 404 page and output to the console that A page wasn't found for "/search".

I would imagine this is going to happen for any service that acts as a sort of proxy to the content created through Gatsby.

Currently using Gatsby v1.9.273.

@jdfm

jdfm commented Jul 30, 2018

@m-allanson

Have you tried accessing your endpoint on google webcache? I tried it a few times with the example you posted (about.sourcegraph.com) and it seems to be working, but trying to access the same endpoint via google translate redirects to the 404 page. Whereas in my case, it redirects to 404 consistently on both.

EDIT: Now that I think about it, what I was seeing was probably the /search endpoint for that site. So please disregard the comments above and assume the behavior is consistently broken for sites without those endpoints.

@KyleAMathews
Contributor

Dan Abramov came up with a work around for this — reactjs/react.dev#1148

@gaearon

gaearon commented Aug 31, 2018

To be clear my workaround is for the crash when using the Translate extension. I haven’t looked into the URL issue but we’d need to solve it too. Ideas?

@KyleAMathews
Contributor

@gaearon oh hmm yeah — so fixing that would mean Gatsby needs to support alt URL patterns where some other software has taken control of the URL. Seems doable.

@ryota-murakami
Contributor

Hi all, I came here from Dan's tweet: https://twitter.com/dan_abramov/status/1035575858843578369

Oh, Page Not Found 😧

@ryota-murakami
Contributor

ryota-murakami commented Aug 31, 2018

@KyleAMathews @gaearon

The console says: A page wasn't found for "/translate_c"
Perhaps Google Translate's own JavaScript tries to access /translate_c.

video

https://www.dropbox.com/s/bfy2t4kc31smc7d/google-react-gatsby.mp4?dl=0

@ryota-murakami
Contributor

findPage() appears to be the cause.
How should we handle it? 😰

  • loader.js
      const page = findPage(path)

      if (!page) {
        handleResourceLoadError(path, `A page wasn't found for "${path}"`)
        // (excerpt ends here)
      }

@KyleAMathews
Contributor

@ryota-murakami thanks for looking into this! The logic for finding pages is in https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby/cache-dir/find-page.js

@ryota-murakami
Contributor

@KyleAMathews Sorry for the very slow response 🙇‍♂️
I tried to fix the issue, but currently the 404 page doesn't show at all (the browser shows a blank white screen instead) because of the following change in johncmunson/gatsby@224f6a8.
https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby/cache-dir/loader.js#L315-L317

As far as I can tell from the commit message, that change had a different purpose (preloading the 404 page), but it affects this issue.

I suppose a better approach might be this: when a site built by Gatsby is loaded inside an <iframe>, skip the entry logic in packages/gatsby/cache-dir/loader.js, because the actual URL path is not the original Gatsby site's path.

What do you think @KyleAMathews ?
I still don't understand Gatsby's overall mechanism or the role of each package, though 🤔

pieh pushed a commit that referenced this issue Oct 22, 2018
# Overview
While trying to fix #6300, I ran into 2 ESLint errors.
I'm still considering how to approach a fix for the original Google Translate issue 🤔

[Screenshot: screen shot 2018-10-11 at 2 15 14]
gatsbot bot added the stale? label Jan 24, 2019
@gatsbot

gatsbot bot commented Jan 24, 2019

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

@gatsbot

gatsbot bot commented Feb 4, 2019

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!

@gatsbot gatsbot bot closed this as completed Feb 4, 2019
@jlengstorf
Contributor

This isn't fixed, so I'm reopening it.

@jlengstorf jlengstorf reopened this Feb 8, 2019
jlengstorf added the not stale label and removed the stale? label Feb 8, 2019
@alexlouden
Contributor

I'm hitting this issue with gatsby v1 - would it be okay for me to offer a $50 USD bounty for someone to solve it?

@heyflorin
Contributor

Just came across this issue. At first I thought it was a CORS issue, which I resolved with the following settings for gatsby-plugin-netlify:

    {
      resolve: `gatsby-plugin-netlify`,
      options: {
        headers: {
          '/*': [
            'Access-Control-Allow-Origin: https://translate.googleusercontent.com',
            'Access-Control-Allow-Credentials: true',
            'Content-Security-Policy: frame-ancestors https://translate.google.com',
            'X-Frame-Options: ALLOW-FROM https://translate.google.com',
          ],
        },
        mergeSecurityHeaders: false,
      },
    },

Now I'm seeing a 404 (in the console, not on the front end).

GET https://southsoundymca.jayray.com/page-data/translate_c/page-data.json 404

This seems to reflect the issue above. Digging into it further it looks like all Gatsby sites are affected and as such, are not able to be translated by google translate. Anyone know if this is actively being worked on? 🤔
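
For reference, the failing request above can be reproduced from the pathname alone. This is a sketch that follows the URL layout visible in the 404 log; `pageDataUrl` is a hypothetical helper, and mapping the root path to `index` is an assumption about Gatsby's layout rather than something shown in this thread.

```javascript
// Build the page-data URL for a pathname, following the layout seen in the
// 404 request above: /page-data/<path>/page-data.json
function pageDataUrl(origin, pathname) {
  // Strip leading and trailing slashes; the root path is assumed to map to "index".
  const trimmed = pathname.replace(/^\/+|\/+$/g, "")
  const segment = trimmed === "" ? "index" : trimmed
  return `${origin}/page-data/${segment}/page-data.json`
}

console.log(pageDataUrl("https://southsoundymca.jayray.com", "/translate_c"))
// https://southsoundymca.jayray.com/page-data/translate_c/page-data.json
```

Because the pathname comes from the translation proxy rather than from the site, the derived URL points at page data that was never built.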

@broeker

broeker commented Jul 10, 2019

We have yet to attempt an install but definitely hope there is a path forward. Google Translate is a pretty amazing option for clients who don't have the resources to do full translations in multiple languages, and it seems perfectly in line with the philosophy behind Gatsby and the JAMstack. So much so that it didn't even occur to me that it would not work, and sadly, I'm finding this out after it is already in the scope of work for a current Gatsby project :)

Our team is fairly new to Gatsby and React so I don't understand the internals well enough to jump right in, but if anybody IS working on this and has anything to share or that we can contribute to please let us know, and we will do our best to help as we can.

@heyflorin
Contributor

@broeker, I haven't gotten any answer on this. Let's team up and figure it out? It sounds like we're in a similar situation.

@moonmeister
Contributor

moonmeister commented Jul 14, 2019

@florinme FYI: Adding your recommended config to my site fixes the white page issue, but I don't get a 404. The content flashes with the translation before reverting to the original language. Maybe I'm on a newer version of Gatsby (I'm on the latest)?

https://5d2ab3e25c12e1d91c1707db--moonmeister-personal.netlify.com/

UPDATE: NVM, I'm seeing it in the console same as you.

@JayRayDeveloper

@moonmeister @broeker had any luck with a workaround? I'm totally in a jam and don't know what to do...

@moonmeister
Contributor

moonmeister commented Sep 21, 2019

Okay, so I've done some work on this and understand what's happening better. There are two issues causing problems for sites. The first is unrelated to Gatsby and is a security issue, the second is Gatsby re-hydrating our static pages.

UPDATE: The original issue reported was actually an errant 404 page and path issue (A page wasn't found for "/translate_c"). I have not seen this on my site, gatsbyjs.org, or reactjs.org. I'm not sure whether Google changed something or whether this got fixed in Gatsby somewhere, but it no longer seems to be part of the larger issue.

1. Security Headers

Translating a site like reactjs.org (yes, they use gatsby) results in the error:
Load denied by X-Frame-Options: https://reactjs.org/?depth=1&hl=en&rurl=translate.google.com…259,15700262,15700265&usg=ALkJrhgfSw3S-344F8stcoY_Wcxu2-llPA does not permit framing.

This has nothing to do with Gatsby and everything to do with the security headers set on your web server for CORS and CSP. @florinme has identified the correct settings above. This is probably a more common issue with Gatsby sites because newer hosting providers like Netlify, which are commonly used with static asset sites, wisely use CSP to block sites from being embedded in iframes.

I'll give a quick explanation of the 4 settings indicated above:

CORS

The browser's same-origin policy stops scripts on examplea.com from reading content from exampleb.org. CORS response headers, set on the web server, relax that restriction. So, for Google Translate to load our resources in an iframe, our hosting provider needs to send these two headers:

Access-Control-Allow-Origin: https://translate.googleusercontent.com
Access-Control-Allow-Credentials: true

CSP

The CSP Response Header tells the browser what can be done with the page being loaded. Loading a site into an iframe is a potential security risk and thus some hosting providers block this by default. Modifying this header allows only Google Translate to load the site:

Content-Security-Policy: frame-ancestors https://translate.google.com

X-Frame-Options

Before CSP was a standard HTTP response header, the convention for blocking content from loading in iframes was X-Frame-Options. This is ignored if a browser supports CSP, but it is useful for backwards compatibility with older browsers (mostly IE).

X-Frame-Options: ALLOW-FROM https://translate.google.com
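
For hosts other than Netlify, the four headers above can be collected in one place and applied to any response object that has a `setHeader` method, such as Node's `http.ServerResponse`. This is a minimal sketch; `translateHeaders` and `applyHeaders` are hypothetical helper names. Note that current browsers do not honor `ALLOW-FROM` and rely on `frame-ancestors` instead, so the last header only matters for older browsers.

```javascript
// Origins taken from the configuration quoted earlier in this thread.
const TRANSLATE_FRAME_ORIGIN = "https://translate.google.com"
const TRANSLATE_CONTENT_ORIGIN = "https://translate.googleusercontent.com"

// Returns the four response headers discussed above as a plain object.
function translateHeaders() {
  return {
    "Access-Control-Allow-Origin": TRANSLATE_CONTENT_ORIGIN,
    "Access-Control-Allow-Credentials": "true",
    "Content-Security-Policy": `frame-ancestors ${TRANSLATE_FRAME_ORIGIN}`,
    "X-Frame-Options": `ALLOW-FROM ${TRANSLATE_FRAME_ORIGIN}`,
  }
}

// Applies the headers to a response-like object exposing setHeader(name, value).
function applyHeaders(res) {
  for (const [name, value] of Object.entries(translateHeaders())) {
    res.setHeader(name, value)
  }
  return res
}
```

Because `Access-Control-Allow-Credentials: true` is sent, the allowed origin has to be a specific origin; browsers reject the `*` wildcard for credentialed requests.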

2. Gatsby Magic

Part of Gatsby's mystique is how we ship static content that gets re-hydrated into a fully client-side React app. Cool, right? Well, it's causing an issue. Google Translate loads a page's static HTML and translates that. About the time it has finished, React swoops in, turns your translated content into a React app, and overwrites it all with the content (in the original language) from the page's page-data.json. Because Google Translate thinks its job is done, it moves on and doesn't re-translate the page.
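
The ordering problem described here can be shown with a toy model. This is not Gatsby, React, or Google Translate code; `translateOnce` and `hydrate` are hypothetical stand-ins that only mimic the order of events.

```javascript
// Toy model of the sequence described above.
const page = { text: "Hola mundo" } // static HTML as served
const pageData = { text: "Hola mundo" } // original-language data used for hydration

// Stand-in for the translator: runs once on the static HTML.
function translateOnce(node) {
  node.text = "Hello world"
}

// Stand-in for hydration: re-renders from page data, discarding earlier changes.
function hydrate(node, data) {
  node.text = data.text
}

translateOnce(page)
console.log(page.text) // "Hello world" (the flash of translated content)

hydrate(page, pageData)
console.log(page.text) // "Hola mundo" (back to the original language)
```

Nothing in this sequence tells the translator that the content changed after it finished, which is the gap described in the conclusion below.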

Conclusion

Fixing problem 1 stops Google Translate from showing a blank white screen. Problem 2 causes the flash of translated content that then reverts to the original language.

Unfortunately, I have no idea how to fix this. If we somehow disabled Gatsby's re-hydration of the React app, we'd be disabling site functionality (though pausing JS execution via the debugger does show this would work for translation). What really needs to happen is that we let Google Translate know the page isn't actually finished loading, or that it has been "reloaded".

I don't know if this means dispatching the correct event or if it's not even possible.

@urielhdz
Contributor

urielhdz commented Oct 4, 2019

This can be solved by disabling the client side router.

You can do that by using this plugin: https://www.gatsbyjs.org/packages/@wardpeet/gatsby-plugin-static-site/ (currently broken).

There is a fork that works with the current latest version of Gatsby, which you can find here: https://www.npmjs.com/package/@xavivars/gatsby-plugin-static-site
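
A minimal configuration sketch, assuming the fork is registered like any other Gatsby plugin by listing its package name and that it needs no options (neither is confirmed in this thread; check the package README):

```javascript
// gatsby-config.js
module.exports = {
  plugins: [
    // Fork mentioned above; assumed to work without any options.
    `@xavivars/gatsby-plugin-static-site`,
  ],
}
```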

@urielhdz
Contributor

urielhdz commented Oct 4, 2019

I uploaded a working example to demonstrate the solution. The original page is in Spanish, and this link displays it translated to English using Google Translate.

@urielhdz
Contributor

urielhdz commented Oct 4, 2019

I think this issue could lead to some frustration, so I wonder whether it would be a good idea to document it somewhere, @marcysutton. I'm thinking of something like a guide on how to disable the client-side router and in which cases you might do that 🤔

Aside from this issue, in #4337 there is a discussion with other potential use cases to disable the router.

@moonmeister
Contributor

@urielhdz Thanks for the info. Yes, this will fix the issue I identified. It'd be nice if there were a way to keep CSR and still make this work.

@heyflorin
Contributor

Thanks to both @moonmeister and @urielhdz for the work on this! What are the implications of removing the router? If we disable it for translating, navigation will stop, correct? Can we disable routing temporarily while the user is viewing a translated page?

@urielhdz
Contributor

I can confirm that navigation still works.

Regarding the downsides, I think you lose the speed and performance benefits of a client-side router. For example, every page transition requires a round trip to the server.

@dlbnco
Contributor

dlbnco commented Oct 15, 2019

Good question @florinme I was asking myself the same.

In regards of the downsides, I think that you lose the speed and performance benefits of using a client side router, so for example, every page transition requires a roundtrip to the server.

@urielhdz Thanks for clarifying.

I would set a separate instance of the website with this setting in a separate domain, something like translate.domain.com.
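
One way to sketch that setup is to build the same codebase twice and enable the static-site plugin only for the translation build. `TRANSLATE_BUILD` is a hypothetical environment variable and `buildPlugins` a hypothetical helper; neither is a Gatsby setting.

```javascript
// Returns the plugin list for a build, given an environment object.
// TRANSLATE_BUILD is a hypothetical variable set only for the translation domain.
function buildPlugins(env) {
  const plugins = [] // the site's regular plugins would go here
  if (env.TRANSLATE_BUILD === "true") {
    // Disable the client-side router only for the translation-friendly build.
    plugins.push(`@xavivars/gatsby-plugin-static-site`)
  }
  return plugins
}

console.log(buildPlugins({ TRANSLATE_BUILD: "true" }))
// [ '@xavivars/gatsby-plugin-static-site' ]
console.log(buildPlugins({}))
// []
```

In gatsby-config.js the result would be exported as `plugins: buildPlugins(process.env)`, and the build deployed to the separate domain would be run with the variable set.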

@muescha
Contributor

muescha commented Apr 22, 2020

gatsbyjs.org is still not translatable :(

https://translate.google.de/translate?sl=auto&tl=de&u=gatsbyjs.org%2Fdocs

@moonmeister
Contributor

For what it's worth, Chrome's integrated translator works great.

@edykim
Contributor

edykim commented Apr 22, 2020

I had some free time this morning, and I looked into it. The issue is from production-app.js and the router.

Problems

navigate in production-app.js

When the page is opened through the translation service, production-app.js redirects to the actual page path.

if (
  pagePath &&
  __BASE_PATH__ + pagePath !== browserLoc.pathname &&
  !(
    loader.findMatchPath(stripPrefix(browserLoc.pathname, __BASE_PATH__)) ||
    pagePath === `/404.html` ||
    pagePath.match(/^\/404\/?$/) ||
    pagePath.match(/^\/offline-plugin-app-shell-fallback\/?$/)
  )
) {
  navigate(__BASE_PATH__ + pagePath + browserLoc.search + browserLoc.hash, {
    replace: true,
  })
}

The translation service serves the webpage from its own host so that it can manipulate the content without running into CORS problems. That means the URL no longer matches our routing rules, so navigate is called here. This is the reason for the flicker.

It redirects to /translate_c or a similar URL. That URL is not valid on our website, so a 404 error is the result.
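
The condition quoted above can be restated as a pure function to experiment with outside the browser. This is a sketch for illustration only; `shouldRedirect` is a hypothetical name, and `hasMatchPath` stands in for the result of `loader.findMatchPath(...)`.

```javascript
// Pure-function restatement of the redirect condition in production-app.js.
function shouldRedirect({ pagePath, basePath, browserPathname, hasMatchPath }) {
  if (!pagePath) return false
  // No redirect when the browser is already at the page's own path.
  if (basePath + pagePath === browserPathname) return false
  const exempt =
    hasMatchPath ||
    pagePath === "/404.html" ||
    /^\/404\/?$/.test(pagePath) ||
    /^\/offline-plugin-app-shell-fallback\/?$/.test(pagePath)
  return !exempt
}

// Under the translation proxy the pathname is not the page path, so this is true.
console.log(
  shouldRedirect({
    pagePath: "/docs/",
    basePath: "",
    browserPathname: "/translate_c",
    hasMatchPath: false,
  })
) // true
```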

Hydration

The other problem happens when the components hydrate. If you add the translation service to your headers, the page will hydrate again; however, it loads the wrong data, based on the wrong URL, for the reason above.

If the website doesn't have CORS headers, hydration fails anyway because the loader cannot load page-data.json, or any other resource, from the website.

If the website has CORS headers for the translation service, it will hydrate again. However, the router uses window.location, so it fails to load the proper page, as we saw above.

Solution (but dirty)

This is my solution for this but it is not clean and not ideal. I don't want to specify some translation services in my code so I just added some logic before hitting the navigate in production-app.js.

// gatsby-node.js

const fs = require("fs")
const path = require("path")

// Patch Gatsby's cached production entry point before the build starts.
exports.onPreBootstrap = ({ store }) => {
  const { program } = store.getState()
  const filePath = path.join(program.directory, ".cache", "production-app.js")

  const code = fs.readFileSync(filePath, {
    encoding: `utf-8`,
  })

  // Swap the original destructuring for a version that stubs browserLoc
  // whenever the page runs inside a frame.
  const newCode = code.replace(
    `const { pagePath, location: browserLoc } = window`,
    `const { pagePath } = window
    let { location: browserLoc } = window

    if (window.parent.location !== browserLoc) {
      browserLoc = {
        pathname: pagePath
      }
    }
  `
  )

  fs.writeFileSync(filePath, newCode, `utf-8`)
}

If there is a parent frame, this stubs browserLoc so that navigate is not called. If you don't have CORS headers for these services, hydration isn't a problem either, because the code cannot fetch page-data.json from your website.
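
The stub logic can also be written as a small function over a window-like object, which makes it testable without a browser. This is a variation, not the code above: `effectiveLocation` is a hypothetical name, it checks `win.parent !== win` instead of comparing location objects, and it prepends a `basePath` argument on the assumption that sites with a path prefix would otherwise still meet the redirect condition.

```javascript
// Returns the location object the redirect check should use.
// win: a window-like object with pagePath, location, and parent properties.
// basePath: the site's path prefix ("" when there is none).
function effectiveLocation(win, basePath = "") {
  const framed = win.parent !== win
  // Inside a frame, pretend the browser is already at the built page path so
  // the redirect condition in production-app.js is not met.
  return framed ? { pathname: basePath + win.pagePath } : win.location
}

// Top-level window: parent refers to itself, so the real location is used.
const topWindow = { pagePath: "/docs/", location: { pathname: "/docs/" } }
topWindow.parent = topWindow

// Framed window, as under a translation proxy.
const framedWindow = {
  pagePath: "/docs/",
  location: { pathname: "/translate_c" },
  parent: {},
}

console.log(effectiveLocation(topWindow).pathname) // "/docs/"
console.log(effectiveLocation(framedWindow).pathname) // "/docs/"
```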

Obviously, there are downsides to this approach because hydration is skipped. It will also be a problem if you embed the website in an iframe yourself.
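
One more practical caveat, offered as an assumption rather than something reported in this thread: `String.prototype.replace` returns its input unchanged when the target string is absent, so a patch like this could silently stop applying after a Gatsby upgrade. A small guard makes that failure visible; `patchSource` is a hypothetical helper.

```javascript
// Replace the first occurrence of target in code, failing loudly if it is missing.
function patchSource(code, target, replacement) {
  if (!code.includes(target)) {
    throw new Error(`Patch target not found: ${target}`)
  }
  // A function replacement avoids special handling of "$" sequences.
  return code.replace(target, () => replacement)
}

console.log(patchSource("const a = 1", "a = 1", "a = 2")) // "const a = 2"
```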

The demo page is here:
https://translate.google.com/translate?sl=ko&tl=en&u=https%3A%2F%2Fxenodochial-swartz-f568e8.netlify.app%2F


I'd love to see some nice solution for this issue.

@LekoArts
Contributor

I've tried it with https://translate.google.com/translate?sl=auto&tl=de&u=https://www.gatsbyjs.com and it works.

Closing as stale. If you still see this behavior in the latest Gatsby v3 please open a new issue.
