Webpack should handle loading worker instead of setting workerSrc #10838

Closed
MickL opened this issue May 20, 2019 · 25 comments

@MickL

MickL commented May 20, 2019

When following the Webpack example and importing pdfjs-dist with import * as pdfjsLib from 'pdfjs-dist', Webpack will create a pdfjsWorker.js and also automatically load it in the browser. The file may be named differently (hashed names, prefixes, etc.).

Still, pdf.js requires setting an absolute path: pdfjsLib.GlobalWorkerOptions.workerSrc = 'pdfjsWorker.js';

This means Webpack loads the worker, and then pdf.js loads the worker itself as well. Why is that the case? Couldn't we just create a Worker and let Webpack do the loading instead?

Also, I am having a lot of trouble setting the workerSrc: the filename might be pdfjsWorker.js in development, but in production it has hashes and differential-loading prefixes. I could use an external worker.js, but then the worker (1.5 MB) would be loaded twice, and it should not be included in the Webpack bundle.

@Kailaash-Balachandran

@MickL I'm facing the same. Did you find any workarounds?

@MickL
Author

MickL commented Jun 18, 2019

No, I didn't find any. I used a second, precompiled pdfjsWorker. So as far as I can see, the 1.5 MB worker gets loaded twice (bundled and loaded by Webpack, then loaded again by pdf.js from a different source).

If I understand this correctly, when using Webpack, pdf.js should not require a workerSrc and should let Webpack handle the dependencies.

@wpp

wpp commented Aug 19, 2019

FWIW, you can try importing with import pdfjsLib from 'pdfjs-dist/webpack', which handles the URL assignment automatically. It does seem to come with a caveat though: if you're using (a newer version of) create-react-app, hot module replacement doesn't seem to be compatible right now. There is an example project I found today: https://github.com/yurydelendik/pdfjs-react

@timvandermeij
Contributor

Closing since we've changed the way we're working with Webpack to remove those dependencies from pdfjs-dist.

@giampaolo44

giampaolo44 commented Aug 22, 2020

@timvandermeij would you mind clarifying how you are working with Webpack now? I looked around quite a bit but could not figure it out by myself.

I got trapped in the error described in #10997 and realized that my fully working app stopped working today because it was based on unstable sources. I moved from links to an npm install of pdfjs-dist as advised there and tried to follow the instructions here to adapt it to my use of Webpack, to no avail. The example provided is based on React, which I am neither familiar with nor using.

My setup is:

  • ES6 modules bundled with Webpack/babel, and Django as the back end. Django is not involved in the use of pdfjs other than providing a base template and reference to the pdf file; also, I am using webpack without Django's html-webpack-plugin (even if installed) because I seem to have an easier life compiling assets directly where Django expects them;

  • package.json relevant section:

      "devDependencies": {
        "@babel/core": "^7.10.4",
        "@babel/preset-env": "^7.10.4",
        "babel-loader": "^8.1.0",
        "html-webpack-plugin": "^4.3.0",
        "webpack": "^4.43.0",
        "webpack-bundle-tracker": "^1.0.0-alpha.1",
        "webpack-cli": "^3.3.12",
        "webpack-dev-server": "^3.11.0",
        "worker-loader": "^3.0.2"
      },
      "dependencies": {
        "bootstrap-icons": "^1.0.0-alpha5",
        "npm": "^6.14.8",
        "pdfjs-dist": "^2.4.456"
      }
    

As far as I understand, I only need to figure out how to properly import pdfjs and set up the worker, i.e. a couple of lines of code (marked in the code):

import pdfjsLib from 'pdfjs-dist/webpack' //  <--- unsure about this  [line 1]
// import Worker from 'worker-loader!./Worker.js'; // this should not be necessary AFAIU

////////////////////////////////////////////
//// instantiate pdf
export const pdfView = () => {

  pdfjsLib.GlobalWorkerOptions.workerSrc = '../../node_modules/pdfjs-dist/build/pdf.worker.js';
  // ^^ [ line 2 ] this gets interpreted as a web address rather than an absolute path in my src/ folder

  // defined through Django template tag in select.html
  const loadingTask = pdfjsLib.getDocument(pdfData.myPdfDoc)

  pdfData.myPdf = loadingTask.promise.then(pdf => {
    pdfData.pdfTotalPageN = pdf.numPages;
    return pdf;
  })
}

Please let me know if you want me to open a new bug or if you can provide the required two lines of code, references or applicable examples in this thread.
Thanks in advance
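
A small sketch of why the relative workerSrc on [line 2] ends up as a web address: the browser resolves a worker script URL against the page URL, not against the source tree. The snippet below uses the standard URL constructor to show the result. The page URL is a made-up placeholder, since the real Django route is not given in the thread.

```javascript
// Hypothetical page URL; the real Django route is not given in the thread.
const pageUrl = "http://localhost:8000/docs/select/";

// The relative path from the snippet above.
const workerSrc = "../../node_modules/pdfjs-dist/build/pdf.worker.js";

// Browsers resolve a relative worker URL against the document's base URL,
// which is what the standard URL constructor does here.
function resolveAgainstPage(relativePath, baseUrl) {
  return new URL(relativePath, baseUrl).href;
}

const resolved = resolveAgainstPage(workerSrc, pageUrl);
console.log(resolved);
// → http://localhost:8000/node_modules/pdfjs-dist/build/pdf.worker.js
```

Unless the web server happens to serve node_modules at that address, the request for the worker script will fail, regardless of where the file sits on disk.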

@timvandermeij
Contributor

timvandermeij commented Aug 22, 2020

[...] would you mind clarifying how you are working with Webpack now?

We tried to isolate the Webpack logic into this example so that it's self-contained and no other parts of PDF.js require its dependencies, also because we try to focus on the library itself and not on integration with the various JS frameworks. We're not familiar with them and in general can't answer questions about them; the examples are merely provided as a starting point.

There is an additional example at https://github.com/yurydelendik/pdfjs-react/blob/4deabd1165395821acd4b6d3bc05dd6fef19b97f/src/App.js#L6 that seems to indicate that you're using it correctly. You should indeed also set the workerSrc option.

[...] to no avail

It's not clear what is actually not working because no running example has been provided. This makes it not possible to know what's going on.

@giampaolo44

Thanks a lot for your prompt answer @timvandermeij

There is an additional example at https://github.com/yurydelendik/pdfjs-react/blob/4deabd1165395821acd4b6d3bc05dd6fef19b97f/src/App.js#L6 that seems to indicate that you're using it correctly. You should indeed also set the workerSrc option.

Also, the linked example is a React setup, and I am not sure if/how this influences the results.
What I noticed is that there does not seem to be a setting of the workerSrc option. I searched the term also in the rest of the repo and did not find any line of code instantiating it. That would be consistent with some instructions I remember reading along the way (alas, I could not find them again), which mentioned that there is no need to instantiate or configure the worker as long as it is installed in the same bundle (pdfjs-dist).

It's not clear what is actually not working because no running example has been provided. This makes it not possible to know what's going on.

Let me try adding a bit more info that might give hints; maybe the interpretation is obvious to you:

  • I changed the import statement as in the example provided, like so:

import pdfjsLib from 'pdfjs-dist/webpack'

////////////////////////////////////////////
//// instantiate pdf
export const pdfView = () => {
  logDebug(module.id.split('/').slice(-1)[0], ['pdfView initialized']);
  // pdfjsLib.GlobalWorkerOptions.workerSrc = '../../node_modules/pdfjs-dist/build/pdf.worker.js';

  // defined through Django template tag in select.html
  const loadingTask = pdfjsLib.getDocument(pdfData.myPdfDoc)

  pdfData.myPdf = loadingTask.promise.then(pdf => {
    pdfData.pdfTotalPageN = pdf.numPages;
    return pdf;
  })
}

  • This is the feedback from Webpack attempting to compile the resources:

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileAsyncWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js

ERROR in (webpack)/lib/node/NodeTargetPlugin.js
Module not found: Error: Can't resolve 'module' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/webpack/lib/node'
 @ (webpack)/lib/node/NodeTargetPlugin.js 11:1-18
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js
Child HtmlWebpackCompiler:
     1 asset
    Entrypoint HtmlWebpackPlugin_0 = __child-HtmlWebpackPlugin_0
    [./node_modules/html-webpack-plugin/lib/loader.js!./src/src-select.html] 4.57 KiB {HtmlWebpackPlugin_0} [built]

I looked at my node_modules directories, and:

Warnings 1. and 2. -- 'FetchCompileAsyncWasmPlugin.js' is not in my node_modules/webpack/lib/web/ directory, although there's a 'FetchCompileWasmTemplatePlugin.js'
Error 3. -- No module named 'module' in node_modules/webpack/lib/web/ either.

One thing I asked myself is: is there a need to perform some actions after npm install (I remember seeing a gust command or similar around, but could not find that instruction again either) that might generate the missing resources?

Thanks again

@timvandermeij
Contributor

What I noticed is that there does not seem to be a setting of the workerSrc option. I searched the term also in the rest of the repo and did not find any line of code instantiating it.

The examples all set them, see https://github.com/mozilla/pdf.js/search?q=workerSrc&unscoped_q=workerSrc, even the Webpack example at https://github.com/mozilla/pdf.js/blob/50bc4a18e8c564753365d927d5ec6a6d2cce3072/examples/webpack/main.js, so I'm not sure why it was not found.

Moreover, if I look at the error log, all errors seem to originate from somewhere inside Webpack and worker-loader, and seem completely unrelated to PDF.js. FetchCompileAsyncWasmPlugin is not something that is in the PDF.js codebase at all. I have the feeling that the root cause of the errors is not PDF.js, but something else in your project, but that's impossible to tell for us unfortunately.

@giampaolo44

giampaolo44 commented Aug 23, 2020

Moreover, if I look at the error log, all errors seem to originate from somewhere inside Webpack and worker-loader, and seem completely unrelated to PDF.js. FetchCompileAsyncWasmPlugin is not something that is in the PDF.js codebase at all. I have the feeling that the root cause of the errors is not PDF.js, but something else in your project, but that's impossible to tell for us unfortunately.

I suspect you are right.

Let me try one more time to bother you, and in case it doesn't work I promise I'll stop.

Looking at the Webpack example you linked, I found they instantiate the worker like this:

var pdfPath = "../learning/helloworld.pdf";

// Setting worker path to worker bundle.
pdfjsLib.GlobalWorkerOptions.workerSrc =
  "../../build/webpack/pdf.worker.bundle.js";

I don't have the build/webpack dirs because I make Webpack compile directly in my Django directories (something that looks like KJ_import/static/docs/bundles/).
In there I see this as the output:

index.js
index.worker.js

and looking inside the compiled index.js resource I get a

"use strict";
eval("__webpack_require__.r(__webpack_exports__);\n/* harmony default export */ __webpack_exports__[\"default\"] = (function() {\n  return new Worker(__webpack_require__.p + \"index.worker.js\");\n});\n\n\n//# sourceURL=webpack:///./node_modules/pdfjs-dist/build/pdf.worker.js?./node_modules/worker-loader/dist/cjs.js");

section that calls the index.worker.js module.

Do you see a way I could amend the ../../build/webpack/pdf.worker.bundle.js path to make it usable in my case? Assuming the example references the resource directly after it has been built, I tried pdfjsLib.GlobalWorkerOptions.workerSrc = 'index.worker.js', but the result does not seem to change much:

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileAsyncWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js

ERROR in (webpack)/lib/node/NodeTargetPlugin.js
Module not found: Error: Can't resolve 'module' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/webpack/lib/node'
 @ (webpack)/lib/node/NodeTargetPlugin.js 11:1-18
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./src/js/views/pdfViews.js
 @ ./src/js/index.js
Child worker-loader node_modules/pdfjs-dist/build/pdf.worker.js:
     1 asset
    Entrypoint pdf.worker = index.worker.js
       2 modules
ℹ 「wdm」: Failed to compile.

Anyway: thanks a million again for your answers and their speed, even on a Sunday.

@timvandermeij
Contributor

Do you see a way I could amend the ../../build/webpack/pdf.worker.bundle.js path to make it usable in my case?

That path is indeed only valid for the example itself when the steps from the README at https://github.com/mozilla/pdf.js/blob/master/examples/webpack/README.md are followed. The gulp dist-install line makes that work.

If you use pdfjs-dist you don't need that since the required Webpack bits are distributed along with it as outlined in https://github.com/mozilla/pdf.js/blob/master/examples/webpack/README.md#worker-loading. Looking at that in more detail, you indeed shouldn't have to set the workerSrc at all because the zero-configuration Webpack file already does that for you; see https://github.com/mozilla/pdfjs-dist/blob/master/webpack.js#L27-L31 (this is distributed in pdfjs-dist).
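
A rough sketch of the idea behind that zero-configuration file: the worker is created through the bundler and handed to PDF.js via GlobalWorkerOptions.workerPort, so no workerSrc string is needed. The helper name attachBundledWorker and the stub objects are made up for illustration so that the snippet runs without a browser or pdfjs-dist installed. It is not the actual content of webpack.js.

```javascript
// Hypothetical helper: hands an already-bundled worker to PDF.js.
// `pdfjs` is expected to look like the pdfjs-dist module, `WorkerCtor` like the
// constructor that worker-loader generates, `globalObj` like the browser `window`.
function attachBundledWorker(pdfjs, WorkerCtor, globalObj) {
  // Only create a worker where the environment supports Web Workers.
  if (globalObj != null && "Worker" in globalObj) {
    pdfjs.GlobalWorkerOptions.workerPort = new WorkerCtor();
    return true;
  }
  return false;
}

// Stand-ins so the sketch runs under plain Node.
class FakeWorker {}
const fakePdfjs = { GlobalWorkerOptions: { workerPort: null } };

// With a window-like object that has Worker, the port gets assigned.
const attached = attachBundledWorker(fakePdfjs, FakeWorker, { Worker: FakeWorker });

// Without a window (e.g. server-side rendering), nothing is assigned.
const skipped = attachBundledWorker(
  { GlobalWorkerOptions: { workerPort: null } },
  FakeWorker,
  undefined
);
console.log(attached, skipped);
```

Because PDF.js receives a ready-made worker instance, the hashed or prefixed filename that the bundler picks never has to be known in application code.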

@giampaolo44

Fantastic, thank you. You found the same resource I had been reading in my searches.
I still have to figure out what's not working, but you helped me rule out quite a few things.
Kindest regards,
Giampaolo

@edcheung1

Hey Giampaolo,

I'm also running into the same issue. I haven't been able to completely resolve the issue, but I was able to see that the worker-loader is trying to require the FetchCompileWasmPlugin here:

https://github.com/webpack-contrib/worker-loader/blob/master/src/index.js#L26

Seems like there may be some inconsistencies between Webpack 4 and 5?

@giampaolo44

Oh, great catch @edcheung1. Do you think we should raise this with the Webpack team and maybe open an issue?
From their website I understood that one should ask on SO first, which I did, without feedback so far, so it might be a good idea.

@rettgerst

I'm running into the same problem here. Admittedly this is a super old project with a lot of out-of-date dependencies, so maybe I'm missing something, but I can't get a newer version of pdfjs-dist to work where previously I had it working by importing 'pdfjs-dist/webpack'. Now, after updating pdfjs-dist, worker-loader and webpack, I am getting the following output:

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileAsyncWasmPlugin' in '/home/rett/projects/LSCPortalFE/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./app/scripts/modules/PDFJSTools.ts
 @ ./app/scripts/UploadModalCtrl.ts
 @ ./app/scripts/angular-scripts.js
 @ multi (webpack)-dev-server/client?http://localhost:9000 @babel/polyfill ./app/scripts/deps.js ./app/scripts/angular-scripts.js ./app/scripts/stylesheet-bundle.js

WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileWasmPlugin' in '/home/rett/projects/LSCPortalFE/node_modules/worker-loader/dist'
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./app/scripts/modules/PDFJSTools.ts
 @ ./app/scripts/UploadModalCtrl.ts
 @ ./app/scripts/angular-scripts.js
 @ multi (webpack)-dev-server/client?http://localhost:9000 @babel/polyfill ./app/scripts/deps.js ./app/scripts/angular-scripts.js ./app/scripts/stylesheet-bundle.js

ERROR in (webpack)/lib/node/NodeTargetPlugin.js
Module not found: Error: Can't resolve 'module' in '/home/rett/projects/LSCPortalFE/node_modules/webpack/lib/node'
 @ (webpack)/lib/node/NodeTargetPlugin.js 11:1-18
 @ ./node_modules/worker-loader/dist/index.js
 @ ./node_modules/worker-loader/dist/cjs.js
 @ ./node_modules/pdfjs-dist/webpack.js
 @ ./app/scripts/modules/PDFJSTools.ts
 @ ./app/scripts/UploadModalCtrl.ts
 @ ./app/scripts/angular-scripts.js
 @ multi (webpack)-dev-server/client?http://localhost:9000 @babel/polyfill ./app/scripts/deps.js ./app/scripts/angular-scripts.js ./app/scripts/stylesheet-bundle.js

@achembarpu

+1 FWIW facing the exact same error msgs mentioned above with a clean setup of create-react-app and using the App component from https://github.com/yurydelendik/pdfjs-react.

@mforman1

mforman1 commented Sep 8, 2020

We are also facing the exact same issue after upgrading the pdfjs library.
@timvandermeij, can the issue be re-opened?

@Mageenz

Mageenz commented Sep 28, 2020

vue-cli 4, same error

@tschtt

tschtt commented Sep 28, 2020

I was able to solve the problem with the responses on
https://stackoverflow.com/questions/63553008/looking-for-help-to-make-npm-pdfjs-dist-work-with-webpack-and-django

Nevertheless, as I was still facing other issues using pdfjs with Vue 3, I ended up using version 2.0.943 of the pdfjs-dist package to make it work. Not the best solution, but the only way I found to make it work after a week of trial and error...

@jixbo

jixbo commented Oct 5, 2020

+1 FWIW facing the exact same error msgs mentioned above with a clean setup of create-react-app and using the App component from https://github.com/yurydelendik/pdfjs-react.

Currently facing the exact same issue. Did you find a fix?

@aert

aert commented Oct 10, 2020

Same issue here. I followed all the steps described in the sparse docs/examples and in countless online blog posts for old versions, and there is still no clear way to integrate pdfjs into a project.

This post shows all the hoops one went through to make it work: https://stackoverflow.com/questions/63553008/looking-for-help-to-make-npm-pdfjs-dist-work-with-webpack-and-django

It would really be nice if the dev experience were nicer; in the meantime I will try to downgrade to 2.0.943 as the post above suggests...

@drdrwhite

Here's the fix copied from SO, posted by Siddhesh on 20th October:

This issue seems to arise due to the esModule option introduced in worker-loader@3.0.0.
The fix for this was merged in (pre-release) pdfjs-dist@2.6.347.
You can fix this by either upgrading pdfjs-dist to v2.6.347 OR downgrading worker-loader to v2.0.0.

It's easiest to downgrade worker-loader, as the pdfjs-dist containing the fix has not yet been released to npm.

You can then import pdfjs-dist with:

let pdfjs = require("pdfjs-dist/webpack");
let loadingTask = pdfjs.getDocument(url);

This works for me within a Vue.js component, in a project created by vue-cli. I'm using pdfjs-dist 2.5.207 and worker-loader 2.0.0.
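
The version rule quoted above can be written down as a quick check against a dependency list. This is only a sketch under stated assumptions: the function names are made up, the version parsing ignores range prefixes such as ^ (so it looks at the lower bound of a range, not the version actually installed), and the thresholds 3.0.0 and 2.6.347 are taken from the quoted answer.

```javascript
// Parse "2.5.207" or "^3.0.2" into [major, minor, patch]; range prefixes are dropped.
function parseVersion(version) {
  return version
    .replace(/^[^0-9]*/, "")
    .split(".")
    .map((part) => parseInt(part, 10));
}

// True when version a is strictly lower than version b.
function isLower(a, b) {
  const x = parseVersion(a);
  const y = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (x[i] !== y[i]) return x[i] < y[i];
  }
  return false;
}

// Flags the combination described above: worker-loader 3.0.0 or newer
// together with a pdfjs-dist older than 2.6.347.
function hasKnownConflict(deps) {
  const loaderIsV3OrNewer = !isLower(deps["worker-loader"], "3.0.0");
  const pdfjsIsOlder = isLower(deps["pdfjs-dist"], "2.6.347");
  return loaderIsV3OrNewer && pdfjsIsOlder;
}

// Versions reported earlier in this thread.
const failingSetup = { "worker-loader": "^3.0.2", "pdfjs-dist": "^2.4.456" };
const workingSetup = { "worker-loader": "2.0.0", "pdfjs-dist": "2.5.207" };

console.log(hasKnownConflict(failingSetup)); // → true
console.log(hasKnownConflict(workingSetup)); // → false
```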

howardh added a commit to howardh/web-pdf-annotator that referenced this issue Oct 27, 2020
- Doesn't work with worker-loader 3.0.5. Had to downgrade to 2.0.0.
  - See mozilla/pdf.js#10838

@morgan4080

Here is how I imported everything, pdfjs and pdfjsWorker:

import * as pdfjsLib from 'pdfjs-dist';
import { pdfjsWorker } from 'pdfjs-dist/webpack'

@reynard80

This finally worked for me after trying a lot. Thanks a lot.

@rnike

rnike commented Mar 3, 2023

Thanks @morgan4080, the solution works pretty well.

Below is a version for anyone looking for a dynamic import:

import type PDFJS from 'pdfjs-dist';

let loader: Promise<typeof PDFJS>;

export default async function pdfjs() {
  if (!loader) {
    loader = new Promise((resolve, reject) => {
      import('pdfjs-dist/webpack')
        .then(() => import('pdfjs-dist/build/pdf').then(resolve))
        .catch(reject);
    });
  }

  return loader;
}
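
The caching pattern in the snippet above can be sketched in plain JavaScript without the explicit Promise constructor, by chaining the two imports directly. The factory name createLazyLoader and the stub importers are assumptions for illustration; in a real project the two arguments would be () => import('pdfjs-dist/webpack') and () => import('pdfjs-dist/build/pdf').

```javascript
// Hypothetical factory: both arguments are zero-argument functions returning promises.
function createLazyLoader(importSetup, importLibrary) {
  let cached; // promise for the library, created on the first call only

  return function load() {
    if (!cached) {
      // Run the setup import first, then resolve with the library module.
      cached = importSetup().then(() => importLibrary());
    }
    return cached;
  };
}

// Stubs standing in for the two dynamic imports, so the sketch runs under plain Node.
let setupCalls = 0;
const fakeSetup = () => {
  setupCalls += 1;
  return Promise.resolve();
};
const fakeLibrary = () => Promise.resolve({ getDocument() {} });

const loadPdfjs = createLazyLoader(fakeSetup, fakeLibrary);
const first = loadPdfjs();
const second = loadPdfjs();
console.log(first === second, setupCalls); // → true 1
```

One design point to be aware of: a rejected promise stays cached, so a failed import is not retried. Clearing the cached value in a catch handler would allow a retry.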

@MddHstl

MddHstl commented Aug 10, 2023

Just leaving the final solution here.

import * as pdfjsLib from 'pdfjs-dist/webpack';

export const pdfView = () => {
  const loadingTask = pdfjsLib.getDocument(pdfData.myPdfDoc)

  pdfData.myPdf = loadingTask.promise.then(pdf => {
    pdfData.pdfTotalPageN = pdf.numPages;
    return pdf;
  })
}
